* [PATCH 1/8] x86: don't pointlessly reload the system call number
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
@ 2018-04-05 9:53 ` Dominik Brodowski
2018-04-06 17:09 ` [tip:x86/asm] x86/syscalls: Don't " tip-bot for Linus Torvalds
2018-04-05 9:53 ` [PATCH 2/8] syscalls: introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER Dominik Brodowski
` (7 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 9:53 UTC
To: linux-kernel, mingo
Cc: Linus Torvalds, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
Andi Kleen, x86
From: Linus Torvalds <torvalds@linux-foundation.org>
We have it in a register in the low-level asm, just pass it in as an
argument rather than have do_syscall_64() load it back in from the
ptregs pointer.
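As a rough user-space sketch of the idea (not the kernel code itself): the caller already holds the call number, so it hands it over as an argument, and the callee masks it once before the bounds check. Every demo_* name, the two-entry table and the flag bit are made up for illustration, and the speculation hardening done by array_index_nospec() is left out.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for illustration only. */
struct demo_regs {
	unsigned long ax;	/* return value slot */
	unsigned long di, si;	/* first two arguments */
};

typedef long (*demo_call_t)(unsigned long, unsigned long);

static long demo_add(unsigned long a, unsigned long b) { return (long)(a + b); }
static long demo_sub(unsigned long a, unsigned long b) { return (long)(a - b); }

#define DEMO_NR_CALLS	2UL
#define DEMO_FLAG_BIT	0x40000000UL	/* assumed flag bit carried in the number */
#define DEMO_CALL_MASK	(~DEMO_FLAG_BIT)

static const demo_call_t demo_table[DEMO_NR_CALLS] = { demo_add, demo_sub };

/*
 * The caller already holds the number, so it is passed in as an argument
 * instead of being read back from the register frame.
 */
void demo_do_call(unsigned long nr, struct demo_regs *regs)
{
	nr &= DEMO_CALL_MASK;		/* mask once, then reuse the result */
	if (nr < DEMO_NR_CALLS)
		regs->ax = (unsigned long)demo_table[nr](regs->di, regs->si);
	/* out-of-range numbers leave regs->ax as the caller preset it */
}

/* Small self-check covering the three cases. */
void demo_do_call_check(void)
{
	struct demo_regs regs = { .ax = (unsigned long)-38L, .di = 5, .si = 3 };

	demo_do_call(1, &regs);			/* dispatches to demo_sub */
	assert(regs.ax == 2);

	demo_do_call(DEMO_FLAG_BIT | 0, &regs);	/* flag bit is masked off */
	assert(regs.ax == 8);

	regs.ax = (unsigned long)-38L;
	demo_do_call(7, &regs);			/* out of range: untouched */
	assert(regs.ax == (unsigned long)-38L);
}
```

Masking up front means the comparison and the table index use the same value, which is what the common.c hunk in this patch does with "nr &= __SYSCALL_MASK".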
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
arch/x86/entry/common.c | 12 ++++++------
arch/x86/entry/entry_64.S | 3 ++-
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 74f6eee15179..a8b066dbbf48 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -266,14 +266,13 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)
}
#ifdef CONFIG_X86_64
-__visible void do_syscall_64(struct pt_regs *regs)
+__visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
{
- struct thread_info *ti = current_thread_info();
- unsigned long nr = regs->orig_ax;
+ struct thread_info *ti;
enter_from_user_mode();
local_irq_enable();
-
+ ti = current_thread_info();
if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY)
nr = syscall_trace_enter(regs);
@@ -282,8 +281,9 @@ __visible void do_syscall_64(struct pt_regs *regs)
* table. The only functional difference is the x32 bit in
* regs->orig_ax, which changes the behavior of some syscalls.
*/
- if (likely((nr & __SYSCALL_MASK) < NR_syscalls)) {
- nr = array_index_nospec(nr & __SYSCALL_MASK, NR_syscalls);
+ nr &= __SYSCALL_MASK;
+ if (likely(nr < NR_syscalls)) {
+ nr = array_index_nospec(nr, NR_syscalls);
regs->ax = sys_call_table[nr](
regs->di, regs->si, regs->dx,
regs->r10, regs->r8, regs->r9);
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 936e19642eab..6cfe38665f3c 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -233,7 +233,8 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
TRACE_IRQS_OFF
/* IRQs are off. */
- movq %rsp, %rdi
+ movq %rax, %rdi
+ movq %rsp, %rsi
call do_syscall_64 /* returns with IRQs disabled */
TRACE_IRQS_IRETQ /* we're about to change IF */
--
2.16.3
* [tip:x86/asm] x86/syscalls: Don't pointlessly reload the system call number
2018-04-05 9:53 ` [PATCH 1/8] x86: don't pointlessly reload the system call number Dominik Brodowski
@ 2018-04-06 17:09 ` tip-bot for Linus Torvalds
0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Linus Torvalds @ 2018-04-06 17:09 UTC
To: linux-tip-commits
Cc: tglx, peterz, jpoimboe, brgerst, luto, hpa, bp, mingo, dvlasenk,
linux, torvalds, linux-kernel
Commit-ID: dfe64506c01e57159a4c550fe537c13a317ff01b
Gitweb: https://git.kernel.org/tip/dfe64506c01e57159a4c550fe537c13a317ff01b
Author: Linus Torvalds <torvalds@linux-foundation.org>
AuthorDate: Thu, 5 Apr 2018 11:53:00 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 5 Apr 2018 16:59:24 +0200
x86/syscalls: Don't pointlessly reload the system call number
We have it in a register in the low-level asm, just pass it in as an
argument rather than have do_syscall_64() load it back in from the
ptregs pointer.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180405095307.3730-2-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/entry/common.c | 12 ++++++------
arch/x86/entry/entry_64.S | 3 ++-
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 74f6eee15179..a8b066dbbf48 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -266,14 +266,13 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)
}
#ifdef CONFIG_X86_64
-__visible void do_syscall_64(struct pt_regs *regs)
+__visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
{
- struct thread_info *ti = current_thread_info();
- unsigned long nr = regs->orig_ax;
+ struct thread_info *ti;
enter_from_user_mode();
local_irq_enable();
-
+ ti = current_thread_info();
if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY)
nr = syscall_trace_enter(regs);
@@ -282,8 +281,9 @@ __visible void do_syscall_64(struct pt_regs *regs)
* table. The only functional difference is the x32 bit in
* regs->orig_ax, which changes the behavior of some syscalls.
*/
- if (likely((nr & __SYSCALL_MASK) < NR_syscalls)) {
- nr = array_index_nospec(nr & __SYSCALL_MASK, NR_syscalls);
+ nr &= __SYSCALL_MASK;
+ if (likely(nr < NR_syscalls)) {
+ nr = array_index_nospec(nr, NR_syscalls);
regs->ax = sys_call_table[nr](
regs->di, regs->si, regs->dx,
regs->r10, regs->r8, regs->r9);
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 936e19642eab..6cfe38665f3c 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -233,7 +233,8 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
TRACE_IRQS_OFF
/* IRQs are off. */
- movq %rsp, %rdi
+ movq %rax, %rdi
+ movq %rsp, %rsi
call do_syscall_64 /* returns with IRQs disabled */
TRACE_IRQS_IRETQ /* we're about to change IF */
* [PATCH 2/8] syscalls: introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
2018-04-05 9:53 ` [PATCH 1/8] x86: don't pointlessly reload the system call number Dominik Brodowski
@ 2018-04-05 9:53 ` Dominik Brodowski
2018-04-06 17:10 ` [tip:x86/asm] syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y tip-bot for Dominik Brodowski
2018-04-05 9:53 ` [PATCH 3/8] syscalls/x86: use struct pt_regs based syscall calling for 64-bit syscalls Dominik Brodowski
` (6 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 9:53 UTC
To: linux-kernel, mingo
Cc: Thomas Gleixner, H. Peter Anvin, Andi Kleen, Ingo Molnar,
Andrew Morton, Al Viro
It may be useful for an architecture to override the definitions of the
SYSCALL_DEFINE0() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>,
in particular to use a different calling convention for syscalls.
This patch provides a mechanism to do so: It introduces
CONFIG_ARCH_HAS_SYSCALL_WRAPPER. If it is enabled, <asm/syscall_wrapper.h>
is included in <linux/syscalls.h> and may be used to define the macros
mentioned above. Moreover, as the syscall calling convention may be
different if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is set, the syscall function
prototypes in <linux/syscalls.h> are #ifndef'd out in that case.
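The override mechanism can be illustrated with a small stand-alone sketch. It is only an approximation under stated assumptions: the DEMO_* macros and demo_* functions are hypothetical, both "headers" are squeezed into one file, and the array-based convention is chosen purely for the demo.

```c
#include <assert.h>

/* ---- stands in for an architecture header (hypothetical) ---- */
#define DEMO_ARCH_HAS_WRAPPER 1

#ifdef DEMO_ARCH_HAS_WRAPPER
/* Arch override: the entry point receives all arguments in one array. */
#define DEMO_DEFINE2(name, t1, a1, t2, a2)				\
	static long demo_impl_##name(t1 a1, t2 a2);			\
	long demo_sys_##name(const long *args)				\
	{								\
		return demo_impl_##name((t1)args[0], (t2)args[1]);	\
	}								\
	static long demo_impl_##name(t1 a1, t2 a2)
#endif

/* ---- stands in for the generic header ---- */
#ifndef DEMO_DEFINE2
/* Default: the entry point takes the arguments directly. */
#define DEMO_DEFINE2(name, t1, a1, t2, a2)				\
	long demo_sys_##name(t1 a1, t2 a2)
#endif

/* The definition site is identical under either convention. */
DEMO_DEFINE2(add, long, a, long, b)
{
	return a + b;
}

void demo_override_check(void)
{
	const long args[2] = { 2, 3 };

	assert(demo_sys_add(args) == 5);
}
```

Because the entry point's signature depends on which definition wins, a fixed list of prototypes for the default convention would conflict with the override. That is the reason the generic prototypes are hidden when the option is set.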
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
include/linux/syscalls.h | 23 +++++++++++++++++++++++
init/Kconfig | 7 +++++++
2 files changed, 30 insertions(+)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b961184f597a..503ab245d4ce 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -81,6 +81,17 @@ union bpf_attr;
#include <linux/key.h>
#include <trace/syscall.h>
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+/*
+ * It may be useful for an architecture to override the definitions of the
+ * SYSCALL_DEFINE0() and __SYSCALL_DEFINEx() macros, in particular to use a
+ * different calling convention for syscalls. To allow for that, the prototypes
+ * for the sys_*() functions below will *not* be included if
+ * CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
+ */
+#include <asm/syscall_wrapper.h>
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
/*
* __MAP - apply a macro to syscall arguments
* __MAP(n, m, t1, a1, t2, a2, ..., tn, an) will expand to
@@ -189,11 +200,13 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
}
#endif
+#ifndef SYSCALL_DEFINE0
#define SYSCALL_DEFINE0(sname) \
SYSCALL_METADATA(_##sname, 0); \
asmlinkage long sys_##sname(void); \
ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
asmlinkage long sys_##sname(void)
+#endif /* SYSCALL_DEFINE0 */
#define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE2(name, ...) SYSCALL_DEFINEx(2, _##name, __VA_ARGS__)
@@ -209,6 +222,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
#define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
+
+#ifndef __SYSCALL_DEFINEx
#define __SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
__attribute__((alias(__stringify(SyS##name)))); \
@@ -223,6 +238,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
return ret; \
} \
static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+#endif /* __SYSCALL_DEFINEx */
/*
* Called before coming back to user-mode. Returning to user-mode with an
@@ -252,7 +268,12 @@ static inline void addr_limit_user_check(void)
* Please note that these prototypes here are only provided for information
* purposes, for static analysis, and for linking from the syscall table.
* These functions should not be called elsewhere from kernel code.
+ *
+ * As the syscall calling convention may be different from the default
+ * for architectures overriding the syscall calling convention, do not
+ * include the prototypes if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
*/
+#ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
asmlinkage long sys_io_setup(unsigned nr_reqs, aio_context_t __user *ctx);
asmlinkage long sys_io_destroy(aio_context_t ctx);
asmlinkage long sys_io_submit(aio_context_t, long,
@@ -1076,6 +1097,8 @@ asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
*/
asmlinkage long sys_ni_syscall(void);
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
/*
* Kernel code should not call syscalls (i.e., sys_xyzyyz()) directly.
diff --git a/init/Kconfig b/init/Kconfig
index 2852692d7c9c..068eb6c3bbf7 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1923,3 +1923,10 @@ source "kernel/Kconfig.locks"
config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
bool
+
+# It may be useful for an architecture to override the definitions of the
+# SYSCALL_DEFINE() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>,
+# in particular to use a different calling convention for syscalls.
+config ARCH_HAS_SYSCALL_WRAPPER
+ def_bool n
+ depends on !COMPAT
--
2.16.3
* [tip:x86/asm] syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
2018-04-05 9:53 ` [PATCH 2/8] syscalls: introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER Dominik Brodowski
@ 2018-04-06 17:10 ` tip-bot for Dominik Brodowski
0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Dominik Brodowski @ 2018-04-06 17:10 UTC
To: linux-tip-commits
Cc: hpa, torvalds, tglx, bp, linux, brgerst, viro, jpoimboe, mingo,
luto, linux-kernel, akpm, dvlasenk, peterz
Commit-ID: 1bd21c6c21e848996339508d3ffb106d505256a8
Gitweb: https://git.kernel.org/tip/1bd21c6c21e848996339508d3ffb106d505256a8
Author: Dominik Brodowski <linux@dominikbrodowski.net>
AuthorDate: Thu, 5 Apr 2018 11:53:01 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 5 Apr 2018 16:59:25 +0200
syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
It may be useful for an architecture to override the definitions of the
SYSCALL_DEFINE0() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>,
in particular to use a different calling convention for syscalls.
This patch provides a mechanism to do so: It introduces
CONFIG_ARCH_HAS_SYSCALL_WRAPPER. If it is enabled, <asm/syscall_wrapper.h>
is included in <linux/syscalls.h> and may be used to define the macros
mentioned above. Moreover, as the syscall calling convention may be
different if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is set, the syscall function
prototypes in <linux/syscalls.h> are #ifndef'd out in that case.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180405095307.3730-3-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
include/linux/syscalls.h | 23 +++++++++++++++++++++++
init/Kconfig | 7 +++++++
2 files changed, 30 insertions(+)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b961184f597a..503ab245d4ce 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -81,6 +81,17 @@ union bpf_attr;
#include <linux/key.h>
#include <trace/syscall.h>
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+/*
+ * It may be useful for an architecture to override the definitions of the
+ * SYSCALL_DEFINE0() and __SYSCALL_DEFINEx() macros, in particular to use a
+ * different calling convention for syscalls. To allow for that, the prototypes
+ * for the sys_*() functions below will *not* be included if
+ * CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
+ */
+#include <asm/syscall_wrapper.h>
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
/*
* __MAP - apply a macro to syscall arguments
* __MAP(n, m, t1, a1, t2, a2, ..., tn, an) will expand to
@@ -189,11 +200,13 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
}
#endif
+#ifndef SYSCALL_DEFINE0
#define SYSCALL_DEFINE0(sname) \
SYSCALL_METADATA(_##sname, 0); \
asmlinkage long sys_##sname(void); \
ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
asmlinkage long sys_##sname(void)
+#endif /* SYSCALL_DEFINE0 */
#define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE2(name, ...) SYSCALL_DEFINEx(2, _##name, __VA_ARGS__)
@@ -209,6 +222,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
#define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
+
+#ifndef __SYSCALL_DEFINEx
#define __SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
__attribute__((alias(__stringify(SyS##name)))); \
@@ -223,6 +238,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
return ret; \
} \
static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+#endif /* __SYSCALL_DEFINEx */
/*
* Called before coming back to user-mode. Returning to user-mode with an
@@ -252,7 +268,12 @@ static inline void addr_limit_user_check(void)
* Please note that these prototypes here are only provided for information
* purposes, for static analysis, and for linking from the syscall table.
* These functions should not be called elsewhere from kernel code.
+ *
+ * As the syscall calling convention may be different from the default
+ * for architectures overriding the syscall calling convention, do not
+ * include the prototypes if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
*/
+#ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
asmlinkage long sys_io_setup(unsigned nr_reqs, aio_context_t __user *ctx);
asmlinkage long sys_io_destroy(aio_context_t ctx);
asmlinkage long sys_io_submit(aio_context_t, long,
@@ -1076,6 +1097,8 @@ asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
*/
asmlinkage long sys_ni_syscall(void);
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
/*
* Kernel code should not call syscalls (i.e., sys_xyzyyz()) directly.
diff --git a/init/Kconfig b/init/Kconfig
index 2852692d7c9c..068eb6c3bbf7 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1923,3 +1923,10 @@ source "kernel/Kconfig.locks"
config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
bool
+
+# It may be useful for an architecture to override the definitions of the
+# SYSCALL_DEFINE() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>,
+# in particular to use a different calling convention for syscalls.
+config ARCH_HAS_SYSCALL_WRAPPER
+ def_bool n
+ depends on !COMPAT
* [PATCH 3/8] syscalls/x86: use struct pt_regs based syscall calling for 64-bit syscalls
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
2018-04-05 9:53 ` [PATCH 1/8] x86: don't pointlessly reload the system call number Dominik Brodowski
2018-04-05 9:53 ` [PATCH 2/8] syscalls: introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER Dominik Brodowski
@ 2018-04-05 9:53 ` Dominik Brodowski
2018-04-06 17:11 ` [tip:x86/asm] syscalls/x86: Use 'struct pt_regs' based syscall calling convention " tip-bot for Dominik Brodowski
2018-04-05 9:53 ` [PATCH 4/8] syscalls: prepare ARCH_HAS_SYSCALL_WRAPPER for compat syscalls Dominik Brodowski
` (5 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 9:53 UTC
To: linux-kernel, mingo
Cc: Thomas Gleixner, Andi Kleen, Ingo Molnar, Andrew Morton, Al Viro,
Andy Lutomirski, Denys Vlasenko, Brian Gerst, Peter Zijlstra,
Linus Torvalds, H. Peter Anvin, x86
Let's make use of ARCH_HAS_SYSCALL_WRAPPER on pure 64-bit x86-64 systems:
Each syscall defines a stub which takes struct pt_regs as its only
argument. It decodes just those parameters it needs, e.g.:
	asmlinkage long sys_xyzzy(const struct pt_regs *regs)
	{
		return SyS_xyzzy(regs->di, regs->si, regs->dx);
	}
This approach avoids leaking random user-provided register content down
the call chain.
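To make the "no leaking" point concrete, here is a minimal user-space sketch. It assumes a hypothetical six-field frame and a toy three-argument implementation; none of the demo_* names exist in the kernel.

```c
#include <assert.h>

/* Hypothetical register frame; field names follow the example above. */
struct demo_pt_regs {
	unsigned long di, si, dx, r10, r8, r9;
};

/* The "real" three-argument implementation. */
static long demo_SyS_xyzzy(unsigned long a, unsigned long b, unsigned long c)
{
	return (long)(a + b + c);
}

/* Stub: takes only the frame and decodes just the three registers it needs. */
long demo_sys_xyzzy(const struct demo_pt_regs *regs)
{
	return demo_SyS_xyzzy(regs->di, regs->si, regs->dx);
}

void demo_stub_check(void)
{
	struct demo_pt_regs clean = { .di = 1, .si = 2, .dx = 3 };
	struct demo_pt_regs noisy = { .di = 1, .si = 2, .dx = 3,
				      .r10 = 0xdead, .r8 = 0xbeef, .r9 = 0xf00d };

	/* Unused registers never reach the implementation. */
	assert(demo_sys_xyzzy(&clean) == 6);
	assert(demo_sys_xyzzy(&noisy) == 6);
}
```

With the older six-argument dispatch, the values in r10, r8 and r9 would have been handed to every syscall as arguments four to six, whether or not it declared them.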
For example, for sys_recv() which is a 4-parameter syscall, the assembly
now is (in slightly reordered fashion):
<sys_recv>:
callq <__fentry__>
/* decode regs->di, ->si, ->dx and ->r10 */
mov 0x70(%rdi),%rdi
mov 0x68(%rdi),%rsi
mov 0x60(%rdi),%rdx
mov 0x38(%rdi),%rcx
[ SyS_recv() is automatically inlined by the compiler,
as it is not [yet] used anywhere else ]
/* clear %r9 and %r8, the 5th and 6th args */
xor %r9d,%r9d
xor %r8d,%r8d
/* do the actual work */
callq __sys_recvfrom
/* cleanup and return */
cltq
retq
The only valid place in an x86-64 kernel which rightfully calls
a syscall function on its own -- vsyscall -- needs to be modified
to pass struct pt_regs onwards as well.
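For the getcpu case the vsyscall hunk in this patch temporarily zeroes regs->dx around the call, since the old code passed NULL as the third argument. A simplified sketch of that save/override/restore pattern follows; the frame layout and the demo_* functions are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical three-register frame. */
struct demo_regs {
	unsigned long di, si, dx;
};

/*
 * pt_regs-style callee: decodes its own arguments from the frame.
 * For the demo it succeeds only if the third argument is 0 (NULL).
 */
long demo_sys_getcpu(const struct demo_regs *regs)
{
	return regs->dx == 0 ? 0 : -1;
}

/* Caller that must force the third argument to NULL without losing dx. */
long demo_emulate_getcpu(struct demo_regs *regs)
{
	unsigned long orig_dx = regs->dx;	/* remember the user's value */
	long ret;

	regs->dx = 0;				/* third argument becomes NULL */
	ret = demo_sys_getcpu(regs);
	regs->dx = orig_dx;			/* put the user's value back */
	return ret;
}

void demo_emulate_check(void)
{
	struct demo_regs regs = { .di = 0, .si = 0, .dx = 0x1234 };

	assert(demo_sys_getcpu(&regs) == -1);	/* stale dx would be seen */
	assert(demo_emulate_getcpu(&regs) == 0);
	assert(regs.dx == 0x1234);		/* frame is unchanged afterwards */
}
```

Restoring the field keeps the user-visible register state the same as before the emulated call, which matches the "we didn't in the past" remark in the hunk.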
To keep the syscall table generation working independent of
SYSCALL_PTREGS being enabled, the stubs are named the same as the
"original" syscall stubs, i.e. sys_*().
This patch is based on an original proof-of-concept
From: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
and was split up and heavily modified by me, in particular to base it on
ARCH_HAS_SYSCALL_WRAPPER, to limit it to 64-bit-only for the time being,
and to update the vsyscall to the new calling convention.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
arch/x86/Kconfig | 5 +++
arch/x86/entry/common.c | 4 ++
arch/x86/entry/syscall_64.c | 9 ++++-
arch/x86/entry/vsyscall/vsyscall_64.c | 22 +++++++++++
arch/x86/include/asm/syscall.h | 4 ++
arch/x86/include/asm/syscall_wrapper.h | 70 ++++++++++++++++++++++++++++++++++
arch/x86/include/asm/syscalls.h | 7 ++++
include/linux/syscalls.h | 2 +-
8 files changed, 120 insertions(+), 3 deletions(-)
create mode 100644 arch/x86/include/asm/syscall_wrapper.h
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 27fede438959..67348efc2540 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2954,3 +2954,8 @@ source "crypto/Kconfig"
source "arch/x86/kvm/Kconfig"
source "lib/Kconfig"
+
+config SYSCALL_PTREGS
+ def_bool y
+ depends on X86_64 && !COMPAT
+ select ARCH_HAS_SYSCALL_WRAPPER
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index a8b066dbbf48..e1b91bffa988 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -284,9 +284,13 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
nr &= __SYSCALL_MASK;
if (likely(nr < NR_syscalls)) {
nr = array_index_nospec(nr, NR_syscalls);
+#ifdef CONFIG_SYSCALL_PTREGS
+ regs->ax = sys_call_table[nr](regs);
+#else
regs->ax = sys_call_table[nr](
regs->di, regs->si, regs->dx,
regs->r10, regs->r8, regs->r9);
+#endif
}
syscall_return_slowpath(regs);
diff --git a/arch/x86/entry/syscall_64.c b/arch/x86/entry/syscall_64.c
index c176d2fab1da..6197850adf91 100644
--- a/arch/x86/entry/syscall_64.c
+++ b/arch/x86/entry/syscall_64.c
@@ -7,14 +7,19 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
+#ifdef CONFIG_SYSCALL_PTREGS
+/* this is a lie, but it does not hurt as sys_ni_syscall just returns -EINVAL */
+extern asmlinkage long sys_ni_syscall(const struct pt_regs *);
+#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *);
+#else /* CONFIG_SYSCALL_PTREGS */
+extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
+#endif /* CONFIG_SYSCALL_PTREGS */
#include <asm/syscalls_64.h>
#undef __SYSCALL_64
#define __SYSCALL_64(nr, sym, qual) [nr] = sym,
-extern long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-
asmlinkage const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
/*
* Smells like a compiler bug -- it doesn't work
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 317be365bce3..05eebbf9b989 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -127,6 +127,9 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
int vsyscall_nr, syscall_nr, tmp;
int prev_sig_on_uaccess_err;
long ret;
+#ifdef CONFIG_SYSCALL_PTREGS
+ unsigned long orig_dx;
+#endif
/*
* No point in checking CS -- the only way to get here is a user mode
@@ -227,19 +230,38 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
ret = -EFAULT;
switch (vsyscall_nr) {
case 0:
+#ifdef CONFIG_SYSCALL_PTREGS
+ /* this decodes regs->di and regs->si on its own */
+ ret = sys_gettimeofday(regs);
+#else
ret = sys_gettimeofday(
(struct timeval __user *)regs->di,
(struct timezone __user *)regs->si);
+#endif /* CONFIG_SYSCALL_PTREGS */
break;
case 1:
+#ifdef CONFIG_SYSCALL_PTREGS
+ /* this decodes regs->di on its own */
+ ret = sys_time(regs);
+#else
ret = sys_time((time_t __user *)regs->di);
+#endif /* CONFIG_SYSCALL_PTREGS */
break;
case 2:
+#ifdef CONFIG_SYSCALL_PTREGS
+ /* while we could clobber regs->dx, we didn't in the past... */
+ orig_dx = regs->dx;
+ regs->dx = 0;
+ /* this decodes regs->di, regs->si and regs->dx on its own */
+ ret = sys_getcpu(regs);
+ regs->dx = orig_dx;
+#else
ret = sys_getcpu((unsigned __user *)regs->di,
(unsigned __user *)regs->si,
NULL);
+#endif /* CONFIG_SYSCALL_PTREGS */
break;
}
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index 03eedc21246d..17c62373a6f9 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -20,9 +20,13 @@
#include <asm/thread_info.h> /* for TS_COMPAT */
#include <asm/unistd.h>
+#ifdef CONFIG_SYSCALL_PTREGS
+typedef asmlinkage long (*sys_call_ptr_t)(const struct pt_regs *);
+#else
typedef asmlinkage long (*sys_call_ptr_t)(unsigned long, unsigned long,
unsigned long, unsigned long,
unsigned long, unsigned long);
+#endif /* CONFIG_SYSCALL_PTREGS */
extern const sys_call_ptr_t sys_call_table[];
#if defined(CONFIG_X86_32)
diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
new file mode 100644
index 000000000000..702bdee377af
--- /dev/null
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * syscall_wrapper.h - x86 specific wrappers to syscall definitions
+ */
+
+#ifndef _ASM_X86_SYSCALL_WRAPPER_H
+#define _ASM_X86_SYSCALL_WRAPPER_H
+
+/*
+ * Instead of the generic __SYSCALL_DEFINEx() definition, this macro takes
+ * struct pt_regs *regs as the only argument of the syscall stub named
+ * sys_*(). It decodes just the registers it needs and passes them on to
+ * the SyS_*() wrapper and then to the SYSC_*() function doing the actual job.
+ * These wrappers and functions are inlined, meaning that the assembly looks
+ * as follows (slightly re-ordered):
+ *
+ * <sys_recv>: <-- syscall with 4 parameters
+ * callq <__fentry__>
+ *
+ * mov 0x70(%rdi),%rdi <-- decode regs->di
+ * mov 0x68(%rdi),%rsi <-- decode regs->si
+ * mov 0x60(%rdi),%rdx <-- decode regs->dx
+ * mov 0x38(%rdi),%rcx <-- decode regs->r10
+ *
+ * xor %r9d,%r9d <-- clear %r9
+ * xor %r8d,%r8d <-- clear %r8
+ *
+ * callq __sys_recvfrom <-- do the actual work in __sys_recvfrom()
+ * which takes 6 arguments
+ *
+ * cltq <-- extend return value to 64-bit
+ * retq <-- return
+ *
+ * This approach avoids leaking random user-provided register content down
+ * the call chain.
+ *
+ * As the generic SYSCALL_DEFINE0() macro does not decode any parameters for
+ * obvious reasons, and passing struct pt_regs *regs to it in %rdi does not
+ * hurt, there is no need to override it.
+ */
+#define __SYSCALL_DEFINEx(x, name, ...) \
+ asmlinkage long sys##name(const struct pt_regs *regs); \
+ ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
+ static long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
+ static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
+ asmlinkage long sys##name(const struct pt_regs *regs) \
+ { \
+ return SyS##name(__MAP(x,__SC_ARGS \
+ ,,regs->di,,regs->si,,regs->dx \
+ ,,regs->r10,,regs->r8,,regs->r9)); \
+ } \
+ static long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
+ { \
+ long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
+ __MAP(x,__SC_TEST,__VA_ARGS__); \
+ __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
+ return ret; \
+ } \
+ static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+
+/*
+ * For VSYSCALLS, we need to declare these three syscalls with the new
+ * pt_regs-based calling convention for in-kernel use.
+ */
+struct pt_regs;
+asmlinkage long sys_getcpu(const struct pt_regs *regs); /* di,si,dx */
+asmlinkage long sys_gettimeofday(const struct pt_regs *regs); /* di,si */
+asmlinkage long sys_time(const struct pt_regs *regs); /* di */
+
+#endif /* _ASM_X86_SYSCALL_WRAPPER_H */
diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index ae6e05fdc24b..e4ad93c05f02 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -18,6 +18,12 @@
/* Common in X86_32 and X86_64 */
/* kernel/ioport.c */
long ksys_ioperm(unsigned long from, unsigned long num, int turn_on);
+
+#ifndef CONFIG_SYSCALL_PTREGS
+/*
+ * If CONFIG_SYSCALL_PTREGS is enabled, a different syscall calling convention
+ * is used. Do not include these -- invalid -- prototypes then
+ */
asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
asmlinkage long sys_iopl(unsigned int);
@@ -53,4 +59,5 @@ asmlinkage long sys_mmap(unsigned long, unsigned long, unsigned long,
unsigned long, unsigned long, unsigned long);
#endif /* CONFIG_X86_32 */
+#endif /* CONFIG_SYSCALL_PTREGS */
#endif /* _ASM_X86_SYSCALLS_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 503ab245d4ce..d7168b3a4b4c 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -102,7 +102,7 @@ union bpf_attr;
* for SYSCALL_DEFINE<n>/COMPAT_SYSCALL_DEFINE<n>
*/
#define __MAP0(m,...)
-#define __MAP1(m,t,a) m(t,a)
+#define __MAP1(m,t,a,...) m(t,a)
#define __MAP2(m,t,a,...) m(t,a), __MAP1(m,__VA_ARGS__)
#define __MAP3(m,t,a,...) m(t,a), __MAP2(m,__VA_ARGS__)
#define __MAP4(m,t,a,...) m(t,a), __MAP3(m,__VA_ARGS__)
--
2.16.3
* [tip:x86/asm] syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls
2018-04-05 9:53 ` [PATCH 3/8] syscalls/x86: use struct pt_regs based syscall calling for 64-bit syscalls Dominik Brodowski
@ 2018-04-06 17:11 ` tip-bot for Dominik Brodowski
0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Dominik Brodowski @ 2018-04-06 17:11 UTC
To: linux-tip-commits
Cc: peterz, linux-kernel, jpoimboe, hpa, linux, viro, akpm, dvlasenk,
mingo, luto, brgerst, bp, torvalds, tglx
Commit-ID: fa697140f9a20119a9ec8fd7460cc4314fbdaff3
Gitweb: https://git.kernel.org/tip/fa697140f9a20119a9ec8fd7460cc4314fbdaff3
Author: Dominik Brodowski <linux@dominikbrodowski.net>
AuthorDate: Thu, 5 Apr 2018 11:53:02 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 5 Apr 2018 16:59:26 +0200
syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls
Let's make use of ARCH_HAS_SYSCALL_WRAPPER=y on pure 64-bit x86-64 systems:
Each syscall defines a stub which takes struct pt_regs as its only
argument. It decodes just those parameters it needs, e.g.:
	asmlinkage long sys_xyzzy(const struct pt_regs *regs)
	{
		return SyS_xyzzy(regs->di, regs->si, regs->dx);
	}
This approach avoids leaking random user-provided register content down
the call chain.
For example, for sys_recv() which is a 4-parameter syscall, the assembly
now is (in slightly reordered fashion):
<sys_recv>:
callq <__fentry__>
/* decode regs->di, ->si, ->dx and ->r10 */
mov 0x70(%rdi),%rdi
mov 0x68(%rdi),%rsi
mov 0x60(%rdi),%rdx
mov 0x38(%rdi),%rcx
[ SyS_recv() is automatically inlined by the compiler,
as it is not [yet] used anywhere else ]
/* clear %r9 and %r8, the 5th and 6th args */
xor %r9d,%r9d
xor %r8d,%r8d
/* do the actual work */
callq __sys_recvfrom
/* cleanup and return */
cltq
retq
The only valid place in an x86-64 kernel which rightfully calls
a syscall function on its own -- vsyscall -- needs to be modified
to pass struct pt_regs onwards as well.
To keep the syscall table generation working independent of
SYSCALL_PTREGS being enabled, the stubs are named the same as the
"original" syscall stubs, i.e. sys_*().
This patch is based on an original proof-of-concept
| From: Linus Torvalds <torvalds@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
and was split up and heavily modified by me, in particular to base it on
ARCH_HAS_SYSCALL_WRAPPER, to limit it to 64-bit-only for the time being,
and to update the vsyscall to the new calling convention.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180405095307.3730-4-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/Kconfig | 5 +++
arch/x86/entry/common.c | 4 ++
arch/x86/entry/syscall_64.c | 9 ++++-
arch/x86/entry/vsyscall/vsyscall_64.c | 22 +++++++++++
arch/x86/include/asm/syscall.h | 4 ++
arch/x86/include/asm/syscall_wrapper.h | 70 ++++++++++++++++++++++++++++++++++
arch/x86/include/asm/syscalls.h | 7 ++++
include/linux/syscalls.h | 2 +-
8 files changed, 120 insertions(+), 3 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 27fede438959..67348efc2540 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2954,3 +2954,8 @@ source "crypto/Kconfig"
source "arch/x86/kvm/Kconfig"
source "lib/Kconfig"
+
+config SYSCALL_PTREGS
+ def_bool y
+ depends on X86_64 && !COMPAT
+ select ARCH_HAS_SYSCALL_WRAPPER
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index a8b066dbbf48..e1b91bffa988 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -284,9 +284,13 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
nr &= __SYSCALL_MASK;
if (likely(nr < NR_syscalls)) {
nr = array_index_nospec(nr, NR_syscalls);
+#ifdef CONFIG_SYSCALL_PTREGS
+ regs->ax = sys_call_table[nr](regs);
+#else
regs->ax = sys_call_table[nr](
regs->di, regs->si, regs->dx,
regs->r10, regs->r8, regs->r9);
+#endif
}
syscall_return_slowpath(regs);
diff --git a/arch/x86/entry/syscall_64.c b/arch/x86/entry/syscall_64.c
index c176d2fab1da..6197850adf91 100644
--- a/arch/x86/entry/syscall_64.c
+++ b/arch/x86/entry/syscall_64.c
@@ -7,14 +7,19 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
+#ifdef CONFIG_SYSCALL_PTREGS
+/* this is a lie, but it does not hurt as sys_ni_syscall just returns -ENOSYS */
+extern asmlinkage long sys_ni_syscall(const struct pt_regs *);
+#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *);
+#else /* CONFIG_SYSCALL_PTREGS */
+extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
+#endif /* CONFIG_SYSCALL_PTREGS */
#include <asm/syscalls_64.h>
#undef __SYSCALL_64
#define __SYSCALL_64(nr, sym, qual) [nr] = sym,
-extern long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-
asmlinkage const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
/*
* Smells like a compiler bug -- it doesn't work
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 317be365bce3..05eebbf9b989 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -127,6 +127,9 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
int vsyscall_nr, syscall_nr, tmp;
int prev_sig_on_uaccess_err;
long ret;
+#ifdef CONFIG_SYSCALL_PTREGS
+ unsigned long orig_dx;
+#endif
/*
* No point in checking CS -- the only way to get here is a user mode
@@ -227,19 +230,38 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
ret = -EFAULT;
switch (vsyscall_nr) {
case 0:
+#ifdef CONFIG_SYSCALL_PTREGS
+ /* this decodes regs->di and regs->si on its own */
+ ret = sys_gettimeofday(regs);
+#else
ret = sys_gettimeofday(
(struct timeval __user *)regs->di,
(struct timezone __user *)regs->si);
+#endif /* CONFIG_SYSCALL_PTREGS */
break;
case 1:
+#ifdef CONFIG_SYSCALL_PTREGS
+ /* this decodes regs->di on its own */
+ ret = sys_time(regs);
+#else
ret = sys_time((time_t __user *)regs->di);
+#endif /* CONFIG_SYSCALL_PTREGS */
break;
case 2:
+#ifdef CONFIG_SYSCALL_PTREGS
+ /* while we could clobber regs->dx, we didn't in the past... */
+ orig_dx = regs->dx;
+ regs->dx = 0;
+ /* this decodes regs->di, regs->si and regs->dx on its own */
+ ret = sys_getcpu(regs);
+ regs->dx = orig_dx;
+#else
ret = sys_getcpu((unsigned __user *)regs->di,
(unsigned __user *)regs->si,
NULL);
+#endif /* CONFIG_SYSCALL_PTREGS */
break;
}
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index 03eedc21246d..17c62373a6f9 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -20,9 +20,13 @@
#include <asm/thread_info.h> /* for TS_COMPAT */
#include <asm/unistd.h>
+#ifdef CONFIG_SYSCALL_PTREGS
+typedef asmlinkage long (*sys_call_ptr_t)(const struct pt_regs *);
+#else
typedef asmlinkage long (*sys_call_ptr_t)(unsigned long, unsigned long,
unsigned long, unsigned long,
unsigned long, unsigned long);
+#endif /* CONFIG_SYSCALL_PTREGS */
extern const sys_call_ptr_t sys_call_table[];
#if defined(CONFIG_X86_32)
diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
new file mode 100644
index 000000000000..702bdee377af
--- /dev/null
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * syscall_wrapper.h - x86 specific wrappers to syscall definitions
+ */
+
+#ifndef _ASM_X86_SYSCALL_WRAPPER_H
+#define _ASM_X86_SYSCALL_WRAPPER_H
+
+/*
+ * Instead of the generic __SYSCALL_DEFINEx() definition, this macro takes
+ * struct pt_regs *regs as the only argument of the syscall stub named
+ * sys_*(). It decodes just the registers it needs and passes them on to
+ * the SyS_*() wrapper and then to the SYSC_*() function doing the actual job.
+ * These wrappers and functions are inlined, meaning that the assembly looks
+ * as follows (slightly re-ordered):
+ *
+ * <sys_recv>: <-- syscall with 4 parameters
+ * callq <__fentry__>
+ *
+ * mov 0x70(%rdi),%rdi <-- decode regs->di
+ * mov 0x68(%rdi),%rsi <-- decode regs->si
+ * mov 0x60(%rdi),%rdx <-- decode regs->dx
+ * mov 0x38(%rdi),%rcx <-- decode regs->r10
+ *
+ * xor %r9d,%r9d <-- clear %r9
+ * xor %r8d,%r8d <-- clear %r8
+ *
+ * callq __sys_recvfrom <-- do the actual work in __sys_recvfrom()
+ * which takes 6 arguments
+ *
+ * cltq <-- extend return value to 64-bit
+ * retq <-- return
+ *
+ * This approach avoids leaking random user-provided register content down
+ * the call chain.
+ *
+ * As the generic SYSCALL_DEFINE0() macro does not decode any parameters for
+ * obvious reasons, and passing struct pt_regs *regs to it in %rdi does not
+ * hurt, there is no need to override it.
+ */
+#define __SYSCALL_DEFINEx(x, name, ...) \
+ asmlinkage long sys##name(const struct pt_regs *regs); \
+ ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
+ static long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
+ static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
+ asmlinkage long sys##name(const struct pt_regs *regs) \
+ { \
+ return SyS##name(__MAP(x,__SC_ARGS \
+ ,,regs->di,,regs->si,,regs->dx \
+ ,,regs->r10,,regs->r8,,regs->r9)); \
+ } \
+ static long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
+ { \
+ long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
+ __MAP(x,__SC_TEST,__VA_ARGS__); \
+ __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
+ return ret; \
+ } \
+ static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+
+/*
+ * For VSYSCALLS, we need to declare these three syscalls with the new
+ * pt_regs-based calling convention for in-kernel use.
+ */
+struct pt_regs;
+asmlinkage long sys_getcpu(const struct pt_regs *regs); /* di,si,dx */
+asmlinkage long sys_gettimeofday(const struct pt_regs *regs); /* di,si */
+asmlinkage long sys_time(const struct pt_regs *regs); /* di */
+
+#endif /* _ASM_X86_SYSCALL_WRAPPER_H */
diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index ae6e05fdc24b..e4ad93c05f02 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -18,6 +18,12 @@
/* Common in X86_32 and X86_64 */
/* kernel/ioport.c */
long ksys_ioperm(unsigned long from, unsigned long num, int turn_on);
+
+#ifndef CONFIG_SYSCALL_PTREGS
+/*
+ * If CONFIG_SYSCALL_PTREGS is enabled, a different syscall calling convention
+ * is used. Do not include these -- invalid -- prototypes then
+ */
asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
asmlinkage long sys_iopl(unsigned int);
@@ -53,4 +59,5 @@ asmlinkage long sys_mmap(unsigned long, unsigned long, unsigned long,
unsigned long, unsigned long, unsigned long);
#endif /* CONFIG_X86_32 */
+#endif /* CONFIG_SYSCALL_PTREGS */
#endif /* _ASM_X86_SYSCALLS_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 503ab245d4ce..d7168b3a4b4c 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -102,7 +102,7 @@ union bpf_attr;
* for SYSCALL_DEFINE<n>/COMPAT_SYSCALL_DEFINE<n>
*/
#define __MAP0(m,...)
-#define __MAP1(m,t,a) m(t,a)
+#define __MAP1(m,t,a,...) m(t,a)
#define __MAP2(m,t,a,...) m(t,a), __MAP1(m,__VA_ARGS__)
#define __MAP3(m,t,a,...) m(t,a), __MAP2(m,__VA_ARGS__)
#define __MAP4(m,t,a,...) m(t,a), __MAP3(m,__VA_ARGS__)
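The one-line __MAP1() change is needed because the x86 wrapper always passes all six register pairs to __MAP(), whatever the arity x is, so the innermost __MAP1() must tolerate and drop the surplus arguments. A small user-space sketch of that expansion is given below. The DEMO_ prefixed macros mirror the shape of the kernel's __MAP helpers but are renamed for illustration, and demo_pt_regs, demo_sum2 and demo_stub2 are hypothetical.

```c
#include <assert.h>

/* Renamed copies of the mapping helpers, with the variadic MAP1. */
#define DEMO_MAP1(m, t, a, ...) m(t, a)
#define DEMO_MAP2(m, t, a, ...) m(t, a), DEMO_MAP1(m, __VA_ARGS__)
#define DEMO_MAP3(m, t, a, ...) m(t, a), DEMO_MAP2(m, __VA_ARGS__)
#define DEMO_MAP(n, ...) DEMO_MAP##n(__VA_ARGS__)

/* Keeps the argument and drops the (here empty) type slot. */
#define DEMO_SC_ARGS(t, a) a

/* Hypothetical, cut-down stand-in for struct pt_regs. */
struct demo_pt_regs {
	unsigned long di, si, dx, r10, r8, r9;
};

static long demo_sum2(unsigned long a, unsigned long b)
{
	return (long)(a + b);
}

/*
 * Two-argument stub: six (type, value) pairs go in, with empty type
 * slots, and the expansion yields only "regs->di, regs->si".
 */
long demo_stub2(const struct demo_pt_regs *regs)
{
	return demo_sum2(DEMO_MAP(2, DEMO_SC_ARGS,
				  , regs->di, , regs->si, , regs->dx,
				  , regs->r10, , regs->r8, , regs->r9));
}

/* Usage: dx, r10, r8 and r9 are dropped by the preprocessor. */
void demo_usage(void)
{
	struct demo_pt_regs regs = { .di = 1, .si = 2, .dx = 100,
				     .r10 = 100, .r8 = 100, .r9 = 100 };

	assert(demo_stub2(&regs) == 3);
}
```

With a non-variadic DEMO_MAP1(m, t, a), the same invocation would fail to preprocess, because the innermost macro would receive more arguments than it declares.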
* [PATCH 4/8] syscalls: prepare ARCH_HAS_SYSCALL_WRAPPER for compat syscalls
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
` (2 preceding siblings ...)
2018-04-05 9:53 ` [PATCH 3/8] syscalls/x86: use struct pt_regs based syscall calling for 64-bit syscalls Dominik Brodowski
@ 2018-04-05 9:53 ` Dominik Brodowski
2018-04-06 17:11 ` [tip:x86/asm] syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y " tip-bot for Dominik Brodowski
2018-04-05 9:53 ` [PATCH 5/8] syscalls/x86: use struct pt_regs based syscall calling for IA32_EMULATION and x32 Dominik Brodowski
` (4 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 9:53 UTC (permalink / raw)
To: linux-kernel, mingo
Cc: Thomas Gleixner, Andi Kleen, Ingo Molnar, Andrew Morton, Al Viro,
H. Peter Anvin
It may be useful for an architecture to override the definitions of the
COMPAT_SYSCALL_DEFINE0() and __COMPAT_SYSCALL_DEFINEx() macros in
<linux/compat.h>, in particular to use a different calling convention
for syscalls. This patch provides a mechanism to do so, based on the
previously introduced CONFIG_ARCH_HAS_SYSCALL_WRAPPER. If it is enabled,
<asm/syscall_wrapper.h> is included in <linux/compat.h> and may be used
to define the macros mentioned above. Moreover, as the syscall calling
convention may be different if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is set,
the compat syscall function prototypes in <linux/compat.h> are #ifndef'd
out in that case.
As some of the syscalls and/or compat syscalls may not be present,
the COND_SYSCALL() and COND_SYSCALL_COMPAT() macros in kernel/sys_ni.c
as well as the SYS_NI() and COMPAT_SYS_NI() macros in
kernel/time/posix-stubs.c can be re-defined in <asm/syscall_wrapper.h> iff
CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
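The override mechanism described above is the usual "define first, guard the default with #ifndef" pattern. The following is a minimal user-space sketch of it, under loud assumptions: the DEMO_ prefixed macros are hypothetical, the "__arch_sys_" prefix is invented for illustration, and the macros expand to symbol-name strings instead of the weak aliases the real cond_syscall() produces.

```c
#include <assert.h>
#include <string.h>

/* What a hypothetical <asm/syscall_wrapper.h> might provide. */
#define DEMO_ARCH_HAS_SYSCALL_WRAPPER 1

#ifdef DEMO_ARCH_HAS_SYSCALL_WRAPPER
/* The architecture overrides only the native variant here. */
#define DEMO_COND_SYSCALL(name) "__arch_sys_" #name
#endif

/* Generic code: defaults apply only where nothing was overridden. */
#ifndef DEMO_COND_SYSCALL
#define DEMO_COND_SYSCALL(name) "sys_" #name
#endif

#ifndef DEMO_COND_SYSCALL_COMPAT
#define DEMO_COND_SYSCALL_COMPAT(name) "compat_sys_" #name
#endif

const char *demo_native_symbol(void)
{
	return DEMO_COND_SYSCALL(io_setup);
}

const char *demo_compat_symbol(void)
{
	return DEMO_COND_SYSCALL_COMPAT(io_setup);
}

/* Usage: the override wins for native, the default stays for compat. */
void demo_usage(void)
{
	assert(strcmp(demo_native_symbol(), "__arch_sys_io_setup") == 0);
	assert(strcmp(demo_compat_symbol(), "compat_sys_io_setup") == 0);
}
```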
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
include/linux/compat.h | 22 ++++++++++++++++++++++
init/Kconfig | 9 ++++++---
kernel/sys_ni.c | 10 ++++++++++
kernel/time/posix-stubs.c | 10 ++++++++++
4 files changed, 48 insertions(+), 3 deletions(-)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 9847c5a013c3..2d85ec5cfda2 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -24,6 +24,17 @@
#include <asm/siginfo.h>
#include <asm/signal.h>
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+/*
+ * It may be useful for an architecture to override the definitions of the
+ * COMPAT_SYSCALL_DEFINE0 and COMPAT_SYSCALL_DEFINEx() macros, in particular
+ * to use a different calling convention for syscalls. To allow for that,
+ * the prototypes for the compat_sys_*() functions below will *not* be included
+ * if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
+ */
+#include <asm/syscall_wrapper.h>
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
#ifndef COMPAT_USE_64BIT_TIME
#define COMPAT_USE_64BIT_TIME 0
#endif
@@ -32,10 +43,12 @@
#define __SC_DELOUSE(t,v) ((__force t)(unsigned long)(v))
#endif
+#ifndef COMPAT_SYSCALL_DEFINE0
#define COMPAT_SYSCALL_DEFINE0(name) \
asmlinkage long compat_sys_##name(void); \
ALLOW_ERROR_INJECTION(compat_sys_##name, ERRNO); \
asmlinkage long compat_sys_##name(void)
+#endif /* COMPAT_SYSCALL_DEFINE0 */
#define COMPAT_SYSCALL_DEFINE1(name, ...) \
COMPAT_SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
@@ -50,6 +63,7 @@
#define COMPAT_SYSCALL_DEFINE6(name, ...) \
COMPAT_SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
+#ifndef COMPAT_SYSCALL_DEFINEx
#define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\
@@ -62,6 +76,7 @@
return C_SYSC##name(__MAP(x,__SC_DELOUSE,__VA_ARGS__)); \
} \
static inline long C_SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+#endif /* COMPAT_SYSCALL_DEFINEx */
#ifndef compat_user_stack_pointer
#define compat_user_stack_pointer() current_user_stack_pointer()
@@ -517,7 +532,12 @@ int __compat_save_altstack(compat_stack_t __user *, unsigned long);
* Please note that these prototypes here are only provided for information
* purposes, for static analysis, and for linking from the syscall table.
* These functions should not be called elsewhere from kernel code.
+ *
+ * As the syscall calling convention may be different from the default
+ * for architectures overriding the syscall calling convention, do not
+ * include the prototypes if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
*/
+#ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
asmlinkage long compat_sys_io_setup(unsigned nr_reqs, u32 __user *ctx32p);
asmlinkage long compat_sys_io_submit(compat_aio_context_t ctx_id, int nr,
u32 __user *iocb);
@@ -955,6 +975,8 @@ asmlinkage long compat_sys_stime(compat_time_t __user *tptr);
/* obsolete: net/socket.c */
asmlinkage long compat_sys_socketcall(int call, u32 __user *args);
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
/*
* For most but not all architectures, "am I in a compat syscall?" and
diff --git a/init/Kconfig b/init/Kconfig
index 068eb6c3bbf7..2dbc88051bde 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1925,8 +1925,11 @@ config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
bool
# It may be useful for an architecture to override the definitions of the
-# SYSCALL_DEFINE() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>,
-# in particular to use a different calling convention for syscalls.
+# SYSCALL_DEFINE() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>
+# and the COMPAT_ variants in <linux/compat.h>, in particular to use a
+# different calling convention for syscalls. They can also override the
+# macros for not-implemented syscalls in kernel/sys_ni.c and
+# kernel/time/posix-stubs.c. All these overrides need to be available in
+# <asm/syscall_wrapper.h>.
config ARCH_HAS_SYSCALL_WRAPPER
def_bool n
- depends on !COMPAT
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 6cafc008f6db..9791364925dc 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -5,6 +5,11 @@
#include <asm/unistd.h>
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+/* Architectures may override COND_SYSCALL and COND_SYSCALL_COMPAT */
+#include <asm/syscall_wrapper.h>
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
/* we can't #include <linux/syscalls.h> here,
but tell gcc to not warn with -Wmissing-prototypes */
asmlinkage long sys_ni_syscall(void);
@@ -17,8 +22,13 @@ asmlinkage long sys_ni_syscall(void)
return -ENOSYS;
}
+#ifndef COND_SYSCALL
#define COND_SYSCALL(name) cond_syscall(sys_##name)
+#endif /* COND_SYSCALL */
+
+#ifndef COND_SYSCALL_COMPAT
#define COND_SYSCALL_COMPAT(name) cond_syscall(compat_sys_##name)
+#endif /* COND_SYSCALL_COMPAT */
/*
* This list is kept in the same order as include/uapi/asm-generic/unistd.h.
diff --git a/kernel/time/posix-stubs.c b/kernel/time/posix-stubs.c
index b258bee13b02..69a937c3cd81 100644
--- a/kernel/time/posix-stubs.c
+++ b/kernel/time/posix-stubs.c
@@ -19,6 +19,11 @@
#include <linux/posix-timers.h>
#include <linux/compat.h>
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+/* Architectures may override SYS_NI and COMPAT_SYS_NI */
+#include <asm/syscall_wrapper.h>
+#endif
+
asmlinkage long sys_ni_posix_timers(void)
{
pr_err_once("process %d (%s) attempted a POSIX timer syscall "
@@ -27,8 +32,13 @@ asmlinkage long sys_ni_posix_timers(void)
return -ENOSYS;
}
+#ifndef SYS_NI
#define SYS_NI(name) SYSCALL_ALIAS(sys_##name, sys_ni_posix_timers)
+#endif
+
+#ifndef COMPAT_SYS_NI
#define COMPAT_SYS_NI(name) SYSCALL_ALIAS(compat_sys_##name, sys_ni_posix_timers)
+#endif
SYS_NI(timer_create);
SYS_NI(timer_gettime);
--
2.16.3
* [tip:x86/asm] syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls
2018-04-05 9:53 ` [PATCH 4/8] syscalls: prepare ARCH_HAS_SYSCALL_WRAPPER for compat syscalls Dominik Brodowski
@ 2018-04-06 17:11 ` tip-bot for Dominik Brodowski
0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Dominik Brodowski @ 2018-04-06 17:11 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, linux, bp, jpoimboe, dvlasenk, tglx, viro, mingo,
peterz, akpm, hpa, luto, torvalds, brgerst
Commit-ID: 7303e30ec1d8fb5ca1f07c92d069241c32b2ee1b
Gitweb: https://git.kernel.org/tip/7303e30ec1d8fb5ca1f07c92d069241c32b2ee1b
Author: Dominik Brodowski <linux@dominikbrodowski.net>
AuthorDate: Thu, 5 Apr 2018 11:53:03 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 5 Apr 2018 16:59:38 +0200
syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls
It may be useful for an architecture to override the definitions of the
COMPAT_SYSCALL_DEFINE0() and __COMPAT_SYSCALL_DEFINEx() macros in
<linux/compat.h>, in particular to use a different calling convention
for syscalls. This patch provides a mechanism to do so, based on the
previously introduced CONFIG_ARCH_HAS_SYSCALL_WRAPPER. If it is enabled,
<asm/syscall_wrapper.h> is included in <linux/compat.h> and may be used
to define the macros mentioned above. Moreover, as the syscall calling
convention may be different if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is set,
the compat syscall function prototypes in <linux/compat.h> are #ifndef'd
out in that case.
As some of the syscalls and/or compat syscalls may not be present,
the COND_SYSCALL() and COND_SYSCALL_COMPAT() macros in kernel/sys_ni.c
as well as the SYS_NI() and COMPAT_SYS_NI() macros in
kernel/time/posix-stubs.c can be re-defined in <asm/syscall_wrapper.h> iff
CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180405095307.3730-5-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
include/linux/compat.h | 22 ++++++++++++++++++++++
init/Kconfig | 9 ++++++---
kernel/sys_ni.c | 10 ++++++++++
kernel/time/posix-stubs.c | 10 ++++++++++
4 files changed, 48 insertions(+), 3 deletions(-)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 9847c5a013c3..2d85ec5cfda2 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -24,6 +24,17 @@
#include <asm/siginfo.h>
#include <asm/signal.h>
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+/*
+ * It may be useful for an architecture to override the definitions of the
+ * COMPAT_SYSCALL_DEFINE0 and COMPAT_SYSCALL_DEFINEx() macros, in particular
+ * to use a different calling convention for syscalls. To allow for that,
+ * the prototypes for the compat_sys_*() functions below will *not* be included
+ * if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
+ */
+#include <asm/syscall_wrapper.h>
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
#ifndef COMPAT_USE_64BIT_TIME
#define COMPAT_USE_64BIT_TIME 0
#endif
@@ -32,10 +43,12 @@
#define __SC_DELOUSE(t,v) ((__force t)(unsigned long)(v))
#endif
+#ifndef COMPAT_SYSCALL_DEFINE0
#define COMPAT_SYSCALL_DEFINE0(name) \
asmlinkage long compat_sys_##name(void); \
ALLOW_ERROR_INJECTION(compat_sys_##name, ERRNO); \
asmlinkage long compat_sys_##name(void)
+#endif /* COMPAT_SYSCALL_DEFINE0 */
#define COMPAT_SYSCALL_DEFINE1(name, ...) \
COMPAT_SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
@@ -50,6 +63,7 @@
#define COMPAT_SYSCALL_DEFINE6(name, ...) \
COMPAT_SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
+#ifndef COMPAT_SYSCALL_DEFINEx
#define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\
@@ -62,6 +76,7 @@
return C_SYSC##name(__MAP(x,__SC_DELOUSE,__VA_ARGS__)); \
} \
static inline long C_SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+#endif /* COMPAT_SYSCALL_DEFINEx */
#ifndef compat_user_stack_pointer
#define compat_user_stack_pointer() current_user_stack_pointer()
@@ -517,7 +532,12 @@ int __compat_save_altstack(compat_stack_t __user *, unsigned long);
* Please note that these prototypes here are only provided for information
* purposes, for static analysis, and for linking from the syscall table.
* These functions should not be called elsewhere from kernel code.
+ *
+ * As the syscall calling convention may be different from the default
+ * for architectures overriding the syscall calling convention, do not
+ * include the prototypes if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled.
*/
+#ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
asmlinkage long compat_sys_io_setup(unsigned nr_reqs, u32 __user *ctx32p);
asmlinkage long compat_sys_io_submit(compat_aio_context_t ctx_id, int nr,
u32 __user *iocb);
@@ -955,6 +975,8 @@ asmlinkage long compat_sys_stime(compat_time_t __user *tptr);
/* obsolete: net/socket.c */
asmlinkage long compat_sys_socketcall(int call, u32 __user *args);
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
/*
* For most but not all architectures, "am I in a compat syscall?" and
diff --git a/init/Kconfig b/init/Kconfig
index 068eb6c3bbf7..2dbc88051bde 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1925,8 +1925,11 @@ config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
bool
# It may be useful for an architecture to override the definitions of the
-# SYSCALL_DEFINE() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>,
-# in particular to use a different calling convention for syscalls.
+# SYSCALL_DEFINE() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>
+# and the COMPAT_ variants in <linux/compat.h>, in particular to use a
+# different calling convention for syscalls. They can also override the
+# macros for not-implemented syscalls in kernel/sys_ni.c and
+# kernel/time/posix-stubs.c. All these overrides need to be available in
+# <asm/syscall_wrapper.h>.
config ARCH_HAS_SYSCALL_WRAPPER
def_bool n
- depends on !COMPAT
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 6cafc008f6db..9791364925dc 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -5,6 +5,11 @@
#include <asm/unistd.h>
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+/* Architectures may override COND_SYSCALL and COND_SYSCALL_COMPAT */
+#include <asm/syscall_wrapper.h>
+#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */
+
/* we can't #include <linux/syscalls.h> here,
but tell gcc to not warn with -Wmissing-prototypes */
asmlinkage long sys_ni_syscall(void);
@@ -17,8 +22,13 @@ asmlinkage long sys_ni_syscall(void)
return -ENOSYS;
}
+#ifndef COND_SYSCALL
#define COND_SYSCALL(name) cond_syscall(sys_##name)
+#endif /* COND_SYSCALL */
+
+#ifndef COND_SYSCALL_COMPAT
#define COND_SYSCALL_COMPAT(name) cond_syscall(compat_sys_##name)
+#endif /* COND_SYSCALL_COMPAT */
/*
* This list is kept in the same order as include/uapi/asm-generic/unistd.h.
diff --git a/kernel/time/posix-stubs.c b/kernel/time/posix-stubs.c
index b258bee13b02..69a937c3cd81 100644
--- a/kernel/time/posix-stubs.c
+++ b/kernel/time/posix-stubs.c
@@ -19,6 +19,11 @@
#include <linux/posix-timers.h>
#include <linux/compat.h>
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+/* Architectures may override SYS_NI and COMPAT_SYS_NI */
+#include <asm/syscall_wrapper.h>
+#endif
+
asmlinkage long sys_ni_posix_timers(void)
{
pr_err_once("process %d (%s) attempted a POSIX timer syscall "
@@ -27,8 +32,13 @@ asmlinkage long sys_ni_posix_timers(void)
return -ENOSYS;
}
+#ifndef SYS_NI
#define SYS_NI(name) SYSCALL_ALIAS(sys_##name, sys_ni_posix_timers)
+#endif
+
+#ifndef COMPAT_SYS_NI
#define COMPAT_SYS_NI(name) SYSCALL_ALIAS(compat_sys_##name, sys_ni_posix_timers)
+#endif
SYS_NI(timer_create);
SYS_NI(timer_gettime);
* [PATCH 5/8] syscalls/x86: use struct pt_regs based syscall calling for IA32_EMULATION and x32
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
` (3 preceding siblings ...)
2018-04-05 9:53 ` [PATCH 4/8] syscalls: prepare ARCH_HAS_SYSCALL_WRAPPER for compat syscalls Dominik Brodowski
@ 2018-04-05 9:53 ` Dominik Brodowski
2018-04-06 17:12 ` [tip:x86/asm] syscalls/x86: Use 'struct pt_regs' " tip-bot for Dominik Brodowski
2018-04-05 9:53 ` [PATCH 6/8] syscalls/x86: unconditionally enable struct pt_regs based syscalls on x86_64 Dominik Brodowski
` (3 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 9:53 UTC (permalink / raw)
To: linux-kernel, mingo
Cc: Thomas Gleixner, Andi Kleen, Ingo Molnar, Andrew Morton, Al Viro,
Andy Lutomirski, Denys Vlasenko, Brian Gerst, Peter Zijlstra,
Linus Torvalds, x86, H. Peter Anvin
Extend ARCH_HAS_SYSCALL_WRAPPER for i386 emulation and for x32 on 64-bit
x86.
For x32, all we need to do is to create an additional stub for each
compat syscall which decodes the parameters in x86-64 ordering, e.g.:
asmlinkage long __compat_sys_x32_xyzzy(struct pt_regs *regs)
{
return c_SyS_xyzzy(regs->di, regs->si, regs->dx);
}
For i386 emulation, we need to teach compat_sys_*() to take struct
pt_regs as its only argument, e.g.:
asmlinkage long __compat_sys_ia32_xyzzy(struct pt_regs *regs)
{
return c_SyS_xyzzy(regs->bx, regs->cx, regs->dx);
}
In addition, we need to create additional stubs for common syscalls
(that is, for syscalls which have the same parameters on 32-bit and
64-bit), e.g.:
asmlinkage long __sys_ia32_xyzzy(struct pt_regs *regs)
{
return c_sys_xyzzy(regs->bx, regs->cx, regs->dx);
}
This approach avoids leaking random user-provided register content down
the call chain.
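The difference between the x32 and i386 stubs is only which registers carry the arguments. A minimal user-space sketch of the two decodings feeding one shared worker is shown below. All demo_ names and the cut-down register struct are hypothetical, and the worker simply encodes argument order into its return value so the mapping is visible.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for struct pt_regs. */
struct demo_pt_regs {
	unsigned long bx, cx, dx, si, di;
};

/* Shared worker: the return value makes the argument order visible. */
static long demo_c_SyS_xyzzy(unsigned long a, unsigned long b, unsigned long c)
{
	return (long)(a * 100 + b * 10 + c);
}

/* i386 emulation: arguments arrive in bx, cx, dx. */
long demo_compat_sys_ia32_xyzzy(const struct demo_pt_regs *regs)
{
	return demo_c_SyS_xyzzy(regs->bx, regs->cx, regs->dx);
}

/* x32: arguments arrive in the x86-64 ordering di, si, dx. */
long demo_compat_sys_x32_xyzzy(const struct demo_pt_regs *regs)
{
	return demo_c_SyS_xyzzy(regs->di, regs->si, regs->dx);
}

/* Usage: same register block, two different decodings. */
void demo_usage(void)
{
	struct demo_pt_regs regs = { .bx = 1, .cx = 2, .dx = 3,
				     .si = 4, .di = 5 };

	assert(demo_compat_sys_ia32_xyzzy(&regs) == 123);
	assert(demo_compat_sys_x32_xyzzy(&regs) == 543);
}
```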
This patch is based on an original proof-of-concept
From: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
and was split up and heavily modified by me, in particular to base it on
ARCH_HAS_SYSCALL_WRAPPER.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
arch/x86/Kconfig | 2 +-
arch/x86/entry/common.c | 4 +
arch/x86/entry/syscall_32.c | 15 +-
arch/x86/entry/syscalls/syscall_32.tbl | 677 +++++++++++++++++----------------
arch/x86/entry/syscalls/syscall_64.tbl | 74 ++--
arch/x86/include/asm/syscall_wrapper.h | 117 +++++-
6 files changed, 509 insertions(+), 380 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 67348efc2540..7bbd6a174722 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2957,5 +2957,5 @@ source "lib/Kconfig"
config SYSCALL_PTREGS
def_bool y
- depends on X86_64 && !COMPAT
+ depends on X86_64
select ARCH_HAS_SYSCALL_WRAPPER
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index e1b91bffa988..425f798b39e3 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -325,6 +325,9 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
if (likely(nr < IA32_NR_syscalls)) {
nr = array_index_nospec(nr, IA32_NR_syscalls);
+#ifdef CONFIG_SYSCALL_PTREGS
+ regs->ax = ia32_sys_call_table[nr](regs);
+#else
/*
* It's possible that a 32-bit syscall implementation
* takes a 64-bit parameter but nonetheless assumes that
@@ -335,6 +338,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
(unsigned int)regs->bx, (unsigned int)regs->cx,
(unsigned int)regs->dx, (unsigned int)regs->si,
(unsigned int)regs->di, (unsigned int)regs->bp);
+#endif /* CONFIG_SYSCALL_PTREGS */
}
syscall_return_slowpath(regs);
diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
index 95c294963612..47060dd8efb1 100644
--- a/arch/x86/entry/syscall_32.c
+++ b/arch/x86/entry/syscall_32.c
@@ -7,14 +7,23 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
-#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) ;
+#ifdef CONFIG_SYSCALL_PTREGS
+/* On X86_64, we use struct pt_regs * to pass parameters to syscalls */
+#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *);
+
+/* this is a lie, but it does not hurt as sys_ni_syscall just returns -ENOSYS */
+extern asmlinkage long sys_ni_syscall(const struct pt_regs *);
+
+#else /* CONFIG_SYSCALL_PTREGS */
+#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
+extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
+#endif /* CONFIG_SYSCALL_PTREGS */
+
#include <asm/syscalls_32.h>
#undef __SYSCALL_I386
#define __SYSCALL_I386(nr, sym, qual) [nr] = sym,
-extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-
__visible const sys_call_ptr_t ia32_sys_call_table[__NR_syscall_compat_max+1] = {
/*
* Smells like a compiler bug -- it doesn't work
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index c58f75b088c5..7f09a3da0b3d 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -4,390 +4,395 @@
# The format is:
# <number> <abi> <name> <entry point> <compat entry point>
#
+# The __sys_ia32 and __compat_sys_ia32 stubs are created on-the-fly for
+# sys_*() system calls and compat_sys_*() compat system calls if
+# IA32_EMULATION is defined, and expect struct pt_regs *regs as their only
+# parameter.
+#
# The abi is always "i386" for this file.
#
0 i386 restart_syscall sys_restart_syscall
-1 i386 exit sys_exit
+1 i386 exit sys_exit __sys_ia32_exit
2 i386 fork sys_fork
-3 i386 read sys_read
-4 i386 write sys_write
-5 i386 open sys_open compat_sys_open
-6 i386 close sys_close
-7 i386 waitpid sys_waitpid
-8 i386 creat sys_creat
-9 i386 link sys_link
-10 i386 unlink sys_unlink
-11 i386 execve sys_execve compat_sys_execve
-12 i386 chdir sys_chdir
-13 i386 time sys_time compat_sys_time
-14 i386 mknod sys_mknod
-15 i386 chmod sys_chmod
-16 i386 lchown sys_lchown16
+3 i386 read sys_read __sys_ia32_read
+4 i386 write sys_write __sys_ia32_write
+5 i386 open sys_open __compat_sys_ia32_open
+6 i386 close sys_close __sys_ia32_close
+7 i386 waitpid sys_waitpid __sys_ia32_waitpid
+8 i386 creat sys_creat __sys_ia32_creat
+9 i386 link sys_link __sys_ia32_link
+10 i386 unlink sys_unlink __sys_ia32_unlink
+11 i386 execve sys_execve __compat_sys_ia32_execve
+12 i386 chdir sys_chdir __sys_ia32_chdir
+13 i386 time sys_time __compat_sys_ia32_time
+14 i386 mknod sys_mknod __sys_ia32_mknod
+15 i386 chmod sys_chmod __sys_ia32_chmod
+16 i386 lchown sys_lchown16 __sys_ia32_lchown16
17 i386 break
-18 i386 oldstat sys_stat
-19 i386 lseek sys_lseek compat_sys_lseek
+18 i386 oldstat sys_stat __sys_ia32_stat
+19 i386 lseek sys_lseek __compat_sys_ia32_lseek
20 i386 getpid sys_getpid
-21 i386 mount sys_mount compat_sys_mount
-22 i386 umount sys_oldumount
-23 i386 setuid sys_setuid16
+21 i386 mount sys_mount __compat_sys_ia32_mount
+22 i386 umount sys_oldumount __sys_ia32_oldumount
+23 i386 setuid sys_setuid16 __sys_ia32_setuid16
24 i386 getuid sys_getuid16
-25 i386 stime sys_stime compat_sys_stime
-26 i386 ptrace sys_ptrace compat_sys_ptrace
-27 i386 alarm sys_alarm
-28 i386 oldfstat sys_fstat
+25 i386 stime sys_stime __compat_sys_ia32_stime
+26 i386 ptrace sys_ptrace __compat_sys_ia32_ptrace
+27 i386 alarm sys_alarm __sys_ia32_alarm
+28 i386 oldfstat sys_fstat __sys_ia32_fstat
29 i386 pause sys_pause
-30 i386 utime sys_utime compat_sys_utime
+30 i386 utime sys_utime __compat_sys_ia32_utime
31 i386 stty
32 i386 gtty
-33 i386 access sys_access
-34 i386 nice sys_nice
+33 i386 access sys_access __sys_ia32_access
+34 i386 nice sys_nice __sys_ia32_nice
35 i386 ftime
36 i386 sync sys_sync
-37 i386 kill sys_kill
-38 i386 rename sys_rename
-39 i386 mkdir sys_mkdir
-40 i386 rmdir sys_rmdir
-41 i386 dup sys_dup
-42 i386 pipe sys_pipe
-43 i386 times sys_times compat_sys_times
+37 i386 kill sys_kill __sys_ia32_kill
+38 i386 rename sys_rename __sys_ia32_rename
+39 i386 mkdir sys_mkdir __sys_ia32_mkdir
+40 i386 rmdir sys_rmdir __sys_ia32_rmdir
+41 i386 dup sys_dup __sys_ia32_dup
+42 i386 pipe sys_pipe __sys_ia32_pipe
+43 i386 times sys_times __compat_sys_ia32_times
44 i386 prof
-45 i386 brk sys_brk
-46 i386 setgid sys_setgid16
+45 i386 brk sys_brk __sys_ia32_brk
+46 i386 setgid sys_setgid16 __sys_ia32_setgid16
47 i386 getgid sys_getgid16
-48 i386 signal sys_signal
+48 i386 signal sys_signal __sys_ia32_signal
49 i386 geteuid sys_geteuid16
50 i386 getegid sys_getegid16
-51 i386 acct sys_acct
-52 i386 umount2 sys_umount
+51 i386 acct sys_acct __sys_ia32_acct
+52 i386 umount2 sys_umount __sys_ia32_umount
53 i386 lock
-54 i386 ioctl sys_ioctl compat_sys_ioctl
-55 i386 fcntl sys_fcntl compat_sys_fcntl64
+54 i386 ioctl sys_ioctl __compat_sys_ia32_ioctl
+55 i386 fcntl sys_fcntl __compat_sys_ia32_fcntl64
56 i386 mpx
-57 i386 setpgid sys_setpgid
+57 i386 setpgid sys_setpgid __sys_ia32_setpgid
58 i386 ulimit
-59 i386 oldolduname sys_olduname
-60 i386 umask sys_umask
-61 i386 chroot sys_chroot
-62 i386 ustat sys_ustat compat_sys_ustat
-63 i386 dup2 sys_dup2
+59 i386 oldolduname sys_olduname __sys_ia32_olduname
+60 i386 umask sys_umask __sys_ia32_umask
+61 i386 chroot sys_chroot __sys_ia32_chroot
+62 i386 ustat sys_ustat __compat_sys_ia32_ustat
+63 i386 dup2 sys_dup2 __sys_ia32_dup2
64 i386 getppid sys_getppid
65 i386 getpgrp sys_getpgrp
66 i386 setsid sys_setsid
-67 i386 sigaction sys_sigaction compat_sys_sigaction
+67 i386 sigaction sys_sigaction __compat_sys_ia32_sigaction
68 i386 sgetmask sys_sgetmask
-69 i386 ssetmask sys_ssetmask
-70 i386 setreuid sys_setreuid16
-71 i386 setregid sys_setregid16
-72 i386 sigsuspend sys_sigsuspend
-73 i386 sigpending sys_sigpending compat_sys_sigpending
-74 i386 sethostname sys_sethostname
-75 i386 setrlimit sys_setrlimit compat_sys_setrlimit
-76 i386 getrlimit sys_old_getrlimit compat_sys_old_getrlimit
-77 i386 getrusage sys_getrusage compat_sys_getrusage
-78 i386 gettimeofday sys_gettimeofday compat_sys_gettimeofday
-79 i386 settimeofday sys_settimeofday compat_sys_settimeofday
-80 i386 getgroups sys_getgroups16
-81 i386 setgroups sys_setgroups16
-82 i386 select sys_old_select compat_sys_old_select
-83 i386 symlink sys_symlink
-84 i386 oldlstat sys_lstat
-85 i386 readlink sys_readlink
-86 i386 uselib sys_uselib
-87 i386 swapon sys_swapon
-88 i386 reboot sys_reboot
-89 i386 readdir sys_old_readdir compat_sys_old_readdir
-90 i386 mmap sys_old_mmap compat_sys_x86_mmap
-91 i386 munmap sys_munmap
-92 i386 truncate sys_truncate compat_sys_truncate
-93 i386 ftruncate sys_ftruncate compat_sys_ftruncate
-94 i386 fchmod sys_fchmod
-95 i386 fchown sys_fchown16
-96 i386 getpriority sys_getpriority
-97 i386 setpriority sys_setpriority
+69 i386 ssetmask sys_ssetmask __sys_ia32_ssetmask
+70 i386 setreuid sys_setreuid16 __sys_ia32_setreuid16
+71 i386 setregid sys_setregid16 __sys_ia32_setregid16
+72 i386 sigsuspend sys_sigsuspend __sys_ia32_sigsuspend
+73 i386 sigpending sys_sigpending __compat_sys_ia32_sigpending
+74 i386 sethostname sys_sethostname __sys_ia32_sethostname
+75 i386 setrlimit sys_setrlimit __compat_sys_ia32_setrlimit
+76 i386 getrlimit sys_old_getrlimit __compat_sys_ia32_old_getrlimit
+77 i386 getrusage sys_getrusage __compat_sys_ia32_getrusage
+78 i386 gettimeofday sys_gettimeofday __compat_sys_ia32_gettimeofday
+79 i386 settimeofday sys_settimeofday __compat_sys_ia32_settimeofday
+80 i386 getgroups sys_getgroups16 __sys_ia32_getgroups16
+81 i386 setgroups sys_setgroups16 __sys_ia32_setgroups16
+82 i386 select sys_old_select __compat_sys_ia32_old_select
+83 i386 symlink sys_symlink __sys_ia32_symlink
+84 i386 oldlstat sys_lstat __sys_ia32_lstat
+85 i386 readlink sys_readlink __sys_ia32_readlink
+86 i386 uselib sys_uselib __sys_ia32_uselib
+87 i386 swapon sys_swapon __sys_ia32_swapon
+88 i386 reboot sys_reboot __sys_ia32_reboot
+89 i386 readdir sys_old_readdir __compat_sys_ia32_old_readdir
+90 i386 mmap sys_old_mmap __compat_sys_ia32_x86_mmap
+91 i386 munmap sys_munmap __sys_ia32_munmap
+92 i386 truncate sys_truncate __compat_sys_ia32_truncate
+93 i386 ftruncate sys_ftruncate __compat_sys_ia32_ftruncate
+94 i386 fchmod sys_fchmod __sys_ia32_fchmod
+95 i386 fchown sys_fchown16 __sys_ia32_fchown16
+96 i386 getpriority sys_getpriority __sys_ia32_getpriority
+97 i386 setpriority sys_setpriority __sys_ia32_setpriority
98 i386 profil
-99 i386 statfs sys_statfs compat_sys_statfs
-100 i386 fstatfs sys_fstatfs compat_sys_fstatfs
-101 i386 ioperm sys_ioperm
-102 i386 socketcall sys_socketcall compat_sys_socketcall
-103 i386 syslog sys_syslog
-104 i386 setitimer sys_setitimer compat_sys_setitimer
-105 i386 getitimer sys_getitimer compat_sys_getitimer
-106 i386 stat sys_newstat compat_sys_newstat
-107 i386 lstat sys_newlstat compat_sys_newlstat
-108 i386 fstat sys_newfstat compat_sys_newfstat
-109 i386 olduname sys_uname
-110 i386 iopl sys_iopl
+99 i386 statfs sys_statfs __compat_sys_ia32_statfs
+100 i386 fstatfs sys_fstatfs __compat_sys_ia32_fstatfs
+101 i386 ioperm sys_ioperm __sys_ia32_ioperm
+102 i386 socketcall sys_socketcall __compat_sys_ia32_socketcall
+103 i386 syslog sys_syslog __sys_ia32_syslog
+104 i386 setitimer sys_setitimer __compat_sys_ia32_setitimer
+105 i386 getitimer sys_getitimer __compat_sys_ia32_getitimer
+106 i386 stat sys_newstat __compat_sys_ia32_newstat
+107 i386 lstat sys_newlstat __compat_sys_ia32_newlstat
+108 i386 fstat sys_newfstat __compat_sys_ia32_newfstat
+109 i386 olduname sys_uname __sys_ia32_uname
+110 i386 iopl sys_iopl __sys_ia32_iopl
111 i386 vhangup sys_vhangup
112 i386 idle
113 i386 vm86old sys_vm86old sys_ni_syscall
-114 i386 wait4 sys_wait4 compat_sys_wait4
-115 i386 swapoff sys_swapoff
-116 i386 sysinfo sys_sysinfo compat_sys_sysinfo
-117 i386 ipc sys_ipc compat_sys_ipc
-118 i386 fsync sys_fsync
+114 i386 wait4 sys_wait4 __compat_sys_ia32_wait4
+115 i386 swapoff sys_swapoff __sys_ia32_swapoff
+116 i386 sysinfo sys_sysinfo __compat_sys_ia32_sysinfo
+117 i386 ipc sys_ipc __compat_sys_ia32_ipc
+118 i386 fsync sys_fsync __sys_ia32_fsync
119 i386 sigreturn sys_sigreturn sys32_sigreturn
-120 i386 clone sys_clone compat_sys_x86_clone
-121 i386 setdomainname sys_setdomainname
-122 i386 uname sys_newuname
-123 i386 modify_ldt sys_modify_ldt
-124 i386 adjtimex sys_adjtimex compat_sys_adjtimex
-125 i386 mprotect sys_mprotect
-126 i386 sigprocmask sys_sigprocmask compat_sys_sigprocmask
+120 i386 clone sys_clone __compat_sys_ia32_x86_clone
+121 i386 setdomainname sys_setdomainname __sys_ia32_setdomainname
+122 i386 uname sys_newuname __sys_ia32_newuname
+123 i386 modify_ldt sys_modify_ldt __sys_ia32_modify_ldt
+124 i386 adjtimex sys_adjtimex __compat_sys_ia32_adjtimex
+125 i386 mprotect sys_mprotect __sys_ia32_mprotect
+126 i386 sigprocmask sys_sigprocmask __compat_sys_ia32_sigprocmask
127 i386 create_module
-128 i386 init_module sys_init_module
-129 i386 delete_module sys_delete_module
+128 i386 init_module sys_init_module __sys_ia32_init_module
+129 i386 delete_module sys_delete_module __sys_ia32_delete_module
130 i386 get_kernel_syms
-131 i386 quotactl sys_quotactl compat_sys_quotactl32
-132 i386 getpgid sys_getpgid
-133 i386 fchdir sys_fchdir
-134 i386 bdflush sys_bdflush
-135 i386 sysfs sys_sysfs
-136 i386 personality sys_personality
+131 i386 quotactl sys_quotactl __compat_sys_ia32_quotactl32
+132 i386 getpgid sys_getpgid __sys_ia32_getpgid
+133 i386 fchdir sys_fchdir __sys_ia32_fchdir
+134 i386 bdflush sys_bdflush __sys_ia32_bdflush
+135 i386 sysfs sys_sysfs __sys_ia32_sysfs
+136 i386 personality sys_personality __sys_ia32_personality
137 i386 afs_syscall
-138 i386 setfsuid sys_setfsuid16
-139 i386 setfsgid sys_setfsgid16
-140 i386 _llseek sys_llseek
-141 i386 getdents sys_getdents compat_sys_getdents
-142 i386 _newselect sys_select compat_sys_select
-143 i386 flock sys_flock
-144 i386 msync sys_msync
-145 i386 readv sys_readv compat_sys_readv
-146 i386 writev sys_writev compat_sys_writev
-147 i386 getsid sys_getsid
-148 i386 fdatasync sys_fdatasync
-149 i386 _sysctl sys_sysctl compat_sys_sysctl
-150 i386 mlock sys_mlock
-151 i386 munlock sys_munlock
-152 i386 mlockall sys_mlockall
+138 i386 setfsuid sys_setfsuid16 __sys_ia32_setfsuid16
+139 i386 setfsgid sys_setfsgid16 __sys_ia32_setfsgid16
+140 i386 _llseek sys_llseek __sys_ia32_llseek
+141 i386 getdents sys_getdents __compat_sys_ia32_getdents
+142 i386 _newselect sys_select __compat_sys_ia32_select
+143 i386 flock sys_flock __sys_ia32_flock
+144 i386 msync sys_msync __sys_ia32_msync
+145 i386 readv sys_readv __compat_sys_ia32_readv
+146 i386 writev sys_writev __compat_sys_ia32_writev
+147 i386 getsid sys_getsid __sys_ia32_getsid
+148 i386 fdatasync sys_fdatasync __sys_ia32_fdatasync
+149 i386 _sysctl sys_sysctl __compat_sys_ia32_sysctl
+150 i386 mlock sys_mlock __sys_ia32_mlock
+151 i386 munlock sys_munlock __sys_ia32_munlock
+152 i386 mlockall sys_mlockall __sys_ia32_mlockall
153 i386 munlockall sys_munlockall
-154 i386 sched_setparam sys_sched_setparam
-155 i386 sched_getparam sys_sched_getparam
-156 i386 sched_setscheduler sys_sched_setscheduler
-157 i386 sched_getscheduler sys_sched_getscheduler
+154 i386 sched_setparam sys_sched_setparam __sys_ia32_sched_setparam
+155 i386 sched_getparam sys_sched_getparam __sys_ia32_sched_getparam
+156 i386 sched_setscheduler sys_sched_setscheduler __sys_ia32_sched_setscheduler
+157 i386 sched_getscheduler sys_sched_getscheduler __sys_ia32_sched_getscheduler
158 i386 sched_yield sys_sched_yield
-159 i386 sched_get_priority_max sys_sched_get_priority_max
-160 i386 sched_get_priority_min sys_sched_get_priority_min
-161 i386 sched_rr_get_interval sys_sched_rr_get_interval compat_sys_sched_rr_get_interval
-162 i386 nanosleep sys_nanosleep compat_sys_nanosleep
-163 i386 mremap sys_mremap
-164 i386 setresuid sys_setresuid16
-165 i386 getresuid sys_getresuid16
+159 i386 sched_get_priority_max sys_sched_get_priority_max __sys_ia32_sched_get_priority_max
+160 i386 sched_get_priority_min sys_sched_get_priority_min __sys_ia32_sched_get_priority_min
+161 i386 sched_rr_get_interval sys_sched_rr_get_interval __compat_sys_ia32_sched_rr_get_interval
+162 i386 nanosleep sys_nanosleep __compat_sys_ia32_nanosleep
+163 i386 mremap sys_mremap __sys_ia32_mremap
+164 i386 setresuid sys_setresuid16 __sys_ia32_setresuid16
+165 i386 getresuid sys_getresuid16 __sys_ia32_getresuid16
166 i386 vm86 sys_vm86 sys_ni_syscall
167 i386 query_module
-168 i386 poll sys_poll
+168 i386 poll sys_poll __sys_ia32_poll
169 i386 nfsservctl
-170 i386 setresgid sys_setresgid16
-171 i386 getresgid sys_getresgid16
-172 i386 prctl sys_prctl
+170 i386 setresgid sys_setresgid16 __sys_ia32_setresgid16
+171 i386 getresgid sys_getresgid16 __sys_ia32_getresgid16
+172 i386 prctl sys_prctl __sys_ia32_prctl
173 i386 rt_sigreturn sys_rt_sigreturn sys32_rt_sigreturn
-174 i386 rt_sigaction sys_rt_sigaction compat_sys_rt_sigaction
-175 i386 rt_sigprocmask sys_rt_sigprocmask
-176 i386 rt_sigpending sys_rt_sigpending compat_sys_rt_sigpending
-177 i386 rt_sigtimedwait sys_rt_sigtimedwait compat_sys_rt_sigtimedwait
-178 i386 rt_sigqueueinfo sys_rt_sigqueueinfo compat_sys_rt_sigqueueinfo
-179 i386 rt_sigsuspend sys_rt_sigsuspend
-180 i386 pread64 sys_pread64 compat_sys_x86_pread
-181 i386 pwrite64 sys_pwrite64 compat_sys_x86_pwrite
-182 i386 chown sys_chown16
-183 i386 getcwd sys_getcwd
-184 i386 capget sys_capget
-185 i386 capset sys_capset
-186 i386 sigaltstack sys_sigaltstack compat_sys_sigaltstack
-187 i386 sendfile sys_sendfile compat_sys_sendfile
+174 i386 rt_sigaction sys_rt_sigaction __compat_sys_ia32_rt_sigaction
+175 i386 rt_sigprocmask sys_rt_sigprocmask __sys_ia32_rt_sigprocmask
+176 i386 rt_sigpending sys_rt_sigpending __compat_sys_ia32_rt_sigpending
+177 i386 rt_sigtimedwait sys_rt_sigtimedwait __compat_sys_ia32_rt_sigtimedwait
+178 i386 rt_sigqueueinfo sys_rt_sigqueueinfo __compat_sys_ia32_rt_sigqueueinfo
+179 i386 rt_sigsuspend sys_rt_sigsuspend __sys_ia32_rt_sigsuspend
+180 i386 pread64 sys_pread64 __compat_sys_ia32_x86_pread
+181 i386 pwrite64 sys_pwrite64 __compat_sys_ia32_x86_pwrite
+182 i386 chown sys_chown16 __sys_ia32_chown16
+183 i386 getcwd sys_getcwd __sys_ia32_getcwd
+184 i386 capget sys_capget __sys_ia32_capget
+185 i386 capset sys_capset __sys_ia32_capset
+186 i386 sigaltstack sys_sigaltstack __compat_sys_ia32_sigaltstack
+187 i386 sendfile sys_sendfile __compat_sys_ia32_sendfile
188 i386 getpmsg
189 i386 putpmsg
190 i386 vfork sys_vfork
-191 i386 ugetrlimit sys_getrlimit compat_sys_getrlimit
-192 i386 mmap2 sys_mmap_pgoff
-193 i386 truncate64 sys_truncate64 compat_sys_x86_truncate64
-194 i386 ftruncate64 sys_ftruncate64 compat_sys_x86_ftruncate64
-195 i386 stat64 sys_stat64 compat_sys_x86_stat64
-196 i386 lstat64 sys_lstat64 compat_sys_x86_lstat64
-197 i386 fstat64 sys_fstat64 compat_sys_x86_fstat64
-198 i386 lchown32 sys_lchown
+191 i386 ugetrlimit sys_getrlimit __compat_sys_ia32_getrlimit
+192 i386 mmap2 sys_mmap_pgoff __sys_ia32_mmap_pgoff
+193 i386 truncate64 sys_truncate64 __compat_sys_ia32_x86_truncate64
+194 i386 ftruncate64 sys_ftruncate64 __compat_sys_ia32_x86_ftruncate64
+195 i386 stat64 sys_stat64 __compat_sys_ia32_x86_stat64
+196 i386 lstat64 sys_lstat64 __compat_sys_ia32_x86_lstat64
+197 i386 fstat64 sys_fstat64 __compat_sys_ia32_x86_fstat64
+198 i386 lchown32 sys_lchown __sys_ia32_lchown
199 i386 getuid32 sys_getuid
200 i386 getgid32 sys_getgid
201 i386 geteuid32 sys_geteuid
202 i386 getegid32 sys_getegid
-203 i386 setreuid32 sys_setreuid
-204 i386 setregid32 sys_setregid
-205 i386 getgroups32 sys_getgroups
-206 i386 setgroups32 sys_setgroups
-207 i386 fchown32 sys_fchown
-208 i386 setresuid32 sys_setresuid
-209 i386 getresuid32 sys_getresuid
-210 i386 setresgid32 sys_setresgid
-211 i386 getresgid32 sys_getresgid
-212 i386 chown32 sys_chown
-213 i386 setuid32 sys_setuid
-214 i386 setgid32 sys_setgid
-215 i386 setfsuid32 sys_setfsuid
-216 i386 setfsgid32 sys_setfsgid
-217 i386 pivot_root sys_pivot_root
-218 i386 mincore sys_mincore
-219 i386 madvise sys_madvise
-220 i386 getdents64 sys_getdents64
-221 i386 fcntl64 sys_fcntl64 compat_sys_fcntl64
+203 i386 setreuid32 sys_setreuid __sys_ia32_setreuid
+204 i386 setregid32 sys_setregid __sys_ia32_setregid
+205 i386 getgroups32 sys_getgroups __sys_ia32_getgroups
+206 i386 setgroups32 sys_setgroups __sys_ia32_setgroups
+207 i386 fchown32 sys_fchown __sys_ia32_fchown
+208 i386 setresuid32 sys_setresuid __sys_ia32_setresuid
+209 i386 getresuid32 sys_getresuid __sys_ia32_getresuid
+210 i386 setresgid32 sys_setresgid __sys_ia32_setresgid
+211 i386 getresgid32 sys_getresgid __sys_ia32_getresgid
+212 i386 chown32 sys_chown __sys_ia32_chown
+213 i386 setuid32 sys_setuid __sys_ia32_setuid
+214 i386 setgid32 sys_setgid __sys_ia32_setgid
+215 i386 setfsuid32 sys_setfsuid __sys_ia32_setfsuid
+216 i386 setfsgid32 sys_setfsgid __sys_ia32_setfsgid
+217 i386 pivot_root sys_pivot_root __sys_ia32_pivot_root
+218 i386 mincore sys_mincore __sys_ia32_mincore
+219 i386 madvise sys_madvise __sys_ia32_madvise
+220 i386 getdents64 sys_getdents64 __sys_ia32_getdents64
+221 i386 fcntl64 sys_fcntl64 __compat_sys_ia32_fcntl64
# 222 is unused
# 223 is unused
224 i386 gettid sys_gettid
-225 i386 readahead sys_readahead compat_sys_x86_readahead
-226 i386 setxattr sys_setxattr
-227 i386 lsetxattr sys_lsetxattr
-228 i386 fsetxattr sys_fsetxattr
-229 i386 getxattr sys_getxattr
-230 i386 lgetxattr sys_lgetxattr
-231 i386 fgetxattr sys_fgetxattr
-232 i386 listxattr sys_listxattr
-233 i386 llistxattr sys_llistxattr
-234 i386 flistxattr sys_flistxattr
-235 i386 removexattr sys_removexattr
-236 i386 lremovexattr sys_lremovexattr
-237 i386 fremovexattr sys_fremovexattr
-238 i386 tkill sys_tkill
-239 i386 sendfile64 sys_sendfile64
-240 i386 futex sys_futex compat_sys_futex
-241 i386 sched_setaffinity sys_sched_setaffinity compat_sys_sched_setaffinity
-242 i386 sched_getaffinity sys_sched_getaffinity compat_sys_sched_getaffinity
-243 i386 set_thread_area sys_set_thread_area
-244 i386 get_thread_area sys_get_thread_area
-245 i386 io_setup sys_io_setup compat_sys_io_setup
-246 i386 io_destroy sys_io_destroy
-247 i386 io_getevents sys_io_getevents compat_sys_io_getevents
-248 i386 io_submit sys_io_submit compat_sys_io_submit
-249 i386 io_cancel sys_io_cancel
-250 i386 fadvise64 sys_fadvise64 compat_sys_x86_fadvise64
+225 i386 readahead sys_readahead __compat_sys_ia32_x86_readahead
+226 i386 setxattr sys_setxattr __sys_ia32_setxattr
+227 i386 lsetxattr sys_lsetxattr __sys_ia32_lsetxattr
+228 i386 fsetxattr sys_fsetxattr __sys_ia32_fsetxattr
+229 i386 getxattr sys_getxattr __sys_ia32_getxattr
+230 i386 lgetxattr sys_lgetxattr __sys_ia32_lgetxattr
+231 i386 fgetxattr sys_fgetxattr __sys_ia32_fgetxattr
+232 i386 listxattr sys_listxattr __sys_ia32_listxattr
+233 i386 llistxattr sys_llistxattr __sys_ia32_llistxattr
+234 i386 flistxattr sys_flistxattr __sys_ia32_flistxattr
+235 i386 removexattr sys_removexattr __sys_ia32_removexattr
+236 i386 lremovexattr sys_lremovexattr __sys_ia32_lremovexattr
+237 i386 fremovexattr sys_fremovexattr __sys_ia32_fremovexattr
+238 i386 tkill sys_tkill __sys_ia32_tkill
+239 i386 sendfile64 sys_sendfile64 __sys_ia32_sendfile64
+240 i386 futex sys_futex __compat_sys_ia32_futex
+241 i386 sched_setaffinity sys_sched_setaffinity __compat_sys_ia32_sched_setaffinity
+242 i386 sched_getaffinity sys_sched_getaffinity __compat_sys_ia32_sched_getaffinity
+243 i386 set_thread_area sys_set_thread_area __sys_ia32_set_thread_area
+244 i386 get_thread_area sys_get_thread_area __sys_ia32_get_thread_area
+245 i386 io_setup sys_io_setup __compat_sys_ia32_io_setup
+246 i386 io_destroy sys_io_destroy __sys_ia32_io_destroy
+247 i386 io_getevents sys_io_getevents __compat_sys_ia32_io_getevents
+248 i386 io_submit sys_io_submit __compat_sys_ia32_io_submit
+249 i386 io_cancel sys_io_cancel __sys_ia32_io_cancel
+250 i386 fadvise64 sys_fadvise64 __compat_sys_ia32_x86_fadvise64
# 251 is available for reuse (was briefly sys_set_zone_reclaim)
-252 i386 exit_group sys_exit_group
-253 i386 lookup_dcookie sys_lookup_dcookie compat_sys_lookup_dcookie
-254 i386 epoll_create sys_epoll_create
-255 i386 epoll_ctl sys_epoll_ctl
-256 i386 epoll_wait sys_epoll_wait
-257 i386 remap_file_pages sys_remap_file_pages
-258 i386 set_tid_address sys_set_tid_address
-259 i386 timer_create sys_timer_create compat_sys_timer_create
-260 i386 timer_settime sys_timer_settime compat_sys_timer_settime
-261 i386 timer_gettime sys_timer_gettime compat_sys_timer_gettime
-262 i386 timer_getoverrun sys_timer_getoverrun
-263 i386 timer_delete sys_timer_delete
-264 i386 clock_settime sys_clock_settime compat_sys_clock_settime
-265 i386 clock_gettime sys_clock_gettime compat_sys_clock_gettime
-266 i386 clock_getres sys_clock_getres compat_sys_clock_getres
-267 i386 clock_nanosleep sys_clock_nanosleep compat_sys_clock_nanosleep
-268 i386 statfs64 sys_statfs64 compat_sys_statfs64
-269 i386 fstatfs64 sys_fstatfs64 compat_sys_fstatfs64
-270 i386 tgkill sys_tgkill
-271 i386 utimes sys_utimes compat_sys_utimes
-272 i386 fadvise64_64 sys_fadvise64_64 compat_sys_x86_fadvise64_64
+252 i386 exit_group sys_exit_group __sys_ia32_exit_group
+253 i386 lookup_dcookie sys_lookup_dcookie __compat_sys_ia32_lookup_dcookie
+254 i386 epoll_create sys_epoll_create __sys_ia32_epoll_create
+255 i386 epoll_ctl sys_epoll_ctl __sys_ia32_epoll_ctl
+256 i386 epoll_wait sys_epoll_wait __sys_ia32_epoll_wait
+257 i386 remap_file_pages sys_remap_file_pages __sys_ia32_remap_file_pages
+258 i386 set_tid_address sys_set_tid_address __sys_ia32_set_tid_address
+259 i386 timer_create sys_timer_create __compat_sys_ia32_timer_create
+260 i386 timer_settime sys_timer_settime __compat_sys_ia32_timer_settime
+261 i386 timer_gettime sys_timer_gettime __compat_sys_ia32_timer_gettime
+262 i386 timer_getoverrun sys_timer_getoverrun __sys_ia32_timer_getoverrun
+263 i386 timer_delete sys_timer_delete __sys_ia32_timer_delete
+264 i386 clock_settime sys_clock_settime __compat_sys_ia32_clock_settime
+265 i386 clock_gettime sys_clock_gettime __compat_sys_ia32_clock_gettime
+266 i386 clock_getres sys_clock_getres __compat_sys_ia32_clock_getres
+267 i386 clock_nanosleep sys_clock_nanosleep __compat_sys_ia32_clock_nanosleep
+268 i386 statfs64 sys_statfs64 __compat_sys_ia32_statfs64
+269 i386 fstatfs64 sys_fstatfs64 __compat_sys_ia32_fstatfs64
+270 i386 tgkill sys_tgkill __sys_ia32_tgkill
+271 i386 utimes sys_utimes __compat_sys_ia32_utimes
+272 i386 fadvise64_64 sys_fadvise64_64 __compat_sys_ia32_x86_fadvise64_64
273 i386 vserver
-274 i386 mbind sys_mbind
-275 i386 get_mempolicy sys_get_mempolicy compat_sys_get_mempolicy
-276 i386 set_mempolicy sys_set_mempolicy
-277 i386 mq_open sys_mq_open compat_sys_mq_open
-278 i386 mq_unlink sys_mq_unlink
-279 i386 mq_timedsend sys_mq_timedsend compat_sys_mq_timedsend
-280 i386 mq_timedreceive sys_mq_timedreceive compat_sys_mq_timedreceive
-281 i386 mq_notify sys_mq_notify compat_sys_mq_notify
-282 i386 mq_getsetattr sys_mq_getsetattr compat_sys_mq_getsetattr
-283 i386 kexec_load sys_kexec_load compat_sys_kexec_load
-284 i386 waitid sys_waitid compat_sys_waitid
+274 i386 mbind sys_mbind __sys_ia32_mbind
+275 i386 get_mempolicy sys_get_mempolicy __compat_sys_ia32_get_mempolicy
+276 i386 set_mempolicy sys_set_mempolicy __sys_ia32_set_mempolicy
+277 i386 mq_open sys_mq_open __compat_sys_ia32_mq_open
+278 i386 mq_unlink sys_mq_unlink __sys_ia32_mq_unlink
+279 i386 mq_timedsend sys_mq_timedsend __compat_sys_ia32_mq_timedsend
+280 i386 mq_timedreceive sys_mq_timedreceive __compat_sys_ia32_mq_timedreceive
+281 i386 mq_notify sys_mq_notify __compat_sys_ia32_mq_notify
+282 i386 mq_getsetattr sys_mq_getsetattr __compat_sys_ia32_mq_getsetattr
+283 i386 kexec_load sys_kexec_load __compat_sys_ia32_kexec_load
+284 i386 waitid sys_waitid __compat_sys_ia32_waitid
# 285 sys_setaltroot
-286 i386 add_key sys_add_key
-287 i386 request_key sys_request_key
-288 i386 keyctl sys_keyctl compat_sys_keyctl
-289 i386 ioprio_set sys_ioprio_set
-290 i386 ioprio_get sys_ioprio_get
+286 i386 add_key sys_add_key __sys_ia32_add_key
+287 i386 request_key sys_request_key __sys_ia32_request_key
+288 i386 keyctl sys_keyctl __compat_sys_ia32_keyctl
+289 i386 ioprio_set sys_ioprio_set __sys_ia32_ioprio_set
+290 i386 ioprio_get sys_ioprio_get __sys_ia32_ioprio_get
291 i386 inotify_init sys_inotify_init
-292 i386 inotify_add_watch sys_inotify_add_watch
-293 i386 inotify_rm_watch sys_inotify_rm_watch
-294 i386 migrate_pages sys_migrate_pages
-295 i386 openat sys_openat compat_sys_openat
-296 i386 mkdirat sys_mkdirat
-297 i386 mknodat sys_mknodat
-298 i386 fchownat sys_fchownat
-299 i386 futimesat sys_futimesat compat_sys_futimesat
-300 i386 fstatat64 sys_fstatat64 compat_sys_x86_fstatat
-301 i386 unlinkat sys_unlinkat
-302 i386 renameat sys_renameat
-303 i386 linkat sys_linkat
-304 i386 symlinkat sys_symlinkat
-305 i386 readlinkat sys_readlinkat
-306 i386 fchmodat sys_fchmodat
-307 i386 faccessat sys_faccessat
-308 i386 pselect6 sys_pselect6 compat_sys_pselect6
-309 i386 ppoll sys_ppoll compat_sys_ppoll
-310 i386 unshare sys_unshare
-311 i386 set_robust_list sys_set_robust_list compat_sys_set_robust_list
-312 i386 get_robust_list sys_get_robust_list compat_sys_get_robust_list
-313 i386 splice sys_splice
-314 i386 sync_file_range sys_sync_file_range compat_sys_x86_sync_file_range
-315 i386 tee sys_tee
-316 i386 vmsplice sys_vmsplice compat_sys_vmsplice
-317 i386 move_pages sys_move_pages compat_sys_move_pages
-318 i386 getcpu sys_getcpu
-319 i386 epoll_pwait sys_epoll_pwait
-320 i386 utimensat sys_utimensat compat_sys_utimensat
-321 i386 signalfd sys_signalfd compat_sys_signalfd
-322 i386 timerfd_create sys_timerfd_create
-323 i386 eventfd sys_eventfd
-324 i386 fallocate sys_fallocate compat_sys_x86_fallocate
-325 i386 timerfd_settime sys_timerfd_settime compat_sys_timerfd_settime
-326 i386 timerfd_gettime sys_timerfd_gettime compat_sys_timerfd_gettime
-327 i386 signalfd4 sys_signalfd4 compat_sys_signalfd4
-328 i386 eventfd2 sys_eventfd2
-329 i386 epoll_create1 sys_epoll_create1
-330 i386 dup3 sys_dup3
-331 i386 pipe2 sys_pipe2
-332 i386 inotify_init1 sys_inotify_init1
-333 i386 preadv sys_preadv compat_sys_preadv
-334 i386 pwritev sys_pwritev compat_sys_pwritev
-335 i386 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
-336 i386 perf_event_open sys_perf_event_open
-337 i386 recvmmsg sys_recvmmsg compat_sys_recvmmsg
-338 i386 fanotify_init sys_fanotify_init
-339 i386 fanotify_mark sys_fanotify_mark compat_sys_fanotify_mark
-340 i386 prlimit64 sys_prlimit64
-341 i386 name_to_handle_at sys_name_to_handle_at
-342 i386 open_by_handle_at sys_open_by_handle_at compat_sys_open_by_handle_at
-343 i386 clock_adjtime sys_clock_adjtime compat_sys_clock_adjtime
-344 i386 syncfs sys_syncfs
-345 i386 sendmmsg sys_sendmmsg compat_sys_sendmmsg
-346 i386 setns sys_setns
-347 i386 process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-348 i386 process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
-349 i386 kcmp sys_kcmp
-350 i386 finit_module sys_finit_module
-351 i386 sched_setattr sys_sched_setattr
-352 i386 sched_getattr sys_sched_getattr
-353 i386 renameat2 sys_renameat2
-354 i386 seccomp sys_seccomp
-355 i386 getrandom sys_getrandom
-356 i386 memfd_create sys_memfd_create
-357 i386 bpf sys_bpf
-358 i386 execveat sys_execveat compat_sys_execveat
-359 i386 socket sys_socket
-360 i386 socketpair sys_socketpair
-361 i386 bind sys_bind
-362 i386 connect sys_connect
-363 i386 listen sys_listen
-364 i386 accept4 sys_accept4
-365 i386 getsockopt sys_getsockopt compat_sys_getsockopt
-366 i386 setsockopt sys_setsockopt compat_sys_setsockopt
-367 i386 getsockname sys_getsockname
-368 i386 getpeername sys_getpeername
-369 i386 sendto sys_sendto
-370 i386 sendmsg sys_sendmsg compat_sys_sendmsg
-371 i386 recvfrom sys_recvfrom compat_sys_recvfrom
-372 i386 recvmsg sys_recvmsg compat_sys_recvmsg
-373 i386 shutdown sys_shutdown
-374 i386 userfaultfd sys_userfaultfd
-375 i386 membarrier sys_membarrier
-376 i386 mlock2 sys_mlock2
-377 i386 copy_file_range sys_copy_file_range
-378 i386 preadv2 sys_preadv2 compat_sys_preadv2
-379 i386 pwritev2 sys_pwritev2 compat_sys_pwritev2
-380 i386 pkey_mprotect sys_pkey_mprotect
-381 i386 pkey_alloc sys_pkey_alloc
-382 i386 pkey_free sys_pkey_free
-383 i386 statx sys_statx
-384 i386 arch_prctl sys_arch_prctl compat_sys_arch_prctl
+292 i386 inotify_add_watch sys_inotify_add_watch __sys_ia32_inotify_add_watch
+293 i386 inotify_rm_watch sys_inotify_rm_watch __sys_ia32_inotify_rm_watch
+294 i386 migrate_pages sys_migrate_pages __sys_ia32_migrate_pages
+295 i386 openat sys_openat __compat_sys_ia32_openat
+296 i386 mkdirat sys_mkdirat __sys_ia32_mkdirat
+297 i386 mknodat sys_mknodat __sys_ia32_mknodat
+298 i386 fchownat sys_fchownat __sys_ia32_fchownat
+299 i386 futimesat sys_futimesat __compat_sys_ia32_futimesat
+300 i386 fstatat64 sys_fstatat64 __compat_sys_ia32_x86_fstatat
+301 i386 unlinkat sys_unlinkat __sys_ia32_unlinkat
+302 i386 renameat sys_renameat __sys_ia32_renameat
+303 i386 linkat sys_linkat __sys_ia32_linkat
+304 i386 symlinkat sys_symlinkat __sys_ia32_symlinkat
+305 i386 readlinkat sys_readlinkat __sys_ia32_readlinkat
+306 i386 fchmodat sys_fchmodat __sys_ia32_fchmodat
+307 i386 faccessat sys_faccessat __sys_ia32_faccessat
+308 i386 pselect6 sys_pselect6 __compat_sys_ia32_pselect6
+309 i386 ppoll sys_ppoll __compat_sys_ia32_ppoll
+310 i386 unshare sys_unshare __sys_ia32_unshare
+311 i386 set_robust_list sys_set_robust_list __compat_sys_ia32_set_robust_list
+312 i386 get_robust_list sys_get_robust_list __compat_sys_ia32_get_robust_list
+313 i386 splice sys_splice __sys_ia32_splice
+314 i386 sync_file_range sys_sync_file_range __compat_sys_ia32_x86_sync_file_range
+315 i386 tee sys_tee __sys_ia32_tee
+316 i386 vmsplice sys_vmsplice __compat_sys_ia32_vmsplice
+317 i386 move_pages sys_move_pages __compat_sys_ia32_move_pages
+318 i386 getcpu sys_getcpu __sys_ia32_getcpu
+319 i386 epoll_pwait sys_epoll_pwait __sys_ia32_epoll_pwait
+320 i386 utimensat sys_utimensat __compat_sys_ia32_utimensat
+321 i386 signalfd sys_signalfd __compat_sys_ia32_signalfd
+322 i386 timerfd_create sys_timerfd_create __sys_ia32_timerfd_create
+323 i386 eventfd sys_eventfd __sys_ia32_eventfd
+324 i386 fallocate sys_fallocate __compat_sys_ia32_x86_fallocate
+325 i386 timerfd_settime sys_timerfd_settime __compat_sys_ia32_timerfd_settime
+326 i386 timerfd_gettime sys_timerfd_gettime __compat_sys_ia32_timerfd_gettime
+327 i386 signalfd4 sys_signalfd4 __compat_sys_ia32_signalfd4
+328 i386 eventfd2 sys_eventfd2 __sys_ia32_eventfd2
+329 i386 epoll_create1 sys_epoll_create1 __sys_ia32_epoll_create1
+330 i386 dup3 sys_dup3 __sys_ia32_dup3
+331 i386 pipe2 sys_pipe2 __sys_ia32_pipe2
+332 i386 inotify_init1 sys_inotify_init1 __sys_ia32_inotify_init1
+333 i386 preadv sys_preadv __compat_sys_ia32_preadv
+334 i386 pwritev sys_pwritev __compat_sys_ia32_pwritev
+335 i386 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo __compat_sys_ia32_rt_tgsigqueueinfo
+336 i386 perf_event_open sys_perf_event_open __sys_ia32_perf_event_open
+337 i386 recvmmsg sys_recvmmsg __compat_sys_ia32_recvmmsg
+338 i386 fanotify_init sys_fanotify_init __sys_ia32_fanotify_init
+339 i386 fanotify_mark sys_fanotify_mark __compat_sys_ia32_fanotify_mark
+340 i386 prlimit64 sys_prlimit64 __sys_ia32_prlimit64
+341 i386 name_to_handle_at sys_name_to_handle_at __sys_ia32_name_to_handle_at
+342 i386 open_by_handle_at sys_open_by_handle_at __compat_sys_ia32_open_by_handle_at
+343 i386 clock_adjtime sys_clock_adjtime __compat_sys_ia32_clock_adjtime
+344 i386 syncfs sys_syncfs __sys_ia32_syncfs
+345 i386 sendmmsg sys_sendmmsg __compat_sys_ia32_sendmmsg
+346 i386 setns sys_setns __sys_ia32_setns
+347 i386 process_vm_readv sys_process_vm_readv __compat_sys_ia32_process_vm_readv
+348 i386 process_vm_writev sys_process_vm_writev __compat_sys_ia32_process_vm_writev
+349 i386 kcmp sys_kcmp __sys_ia32_kcmp
+350 i386 finit_module sys_finit_module __sys_ia32_finit_module
+351 i386 sched_setattr sys_sched_setattr __sys_ia32_sched_setattr
+352 i386 sched_getattr sys_sched_getattr __sys_ia32_sched_getattr
+353 i386 renameat2 sys_renameat2 __sys_ia32_renameat2
+354 i386 seccomp sys_seccomp __sys_ia32_seccomp
+355 i386 getrandom sys_getrandom __sys_ia32_getrandom
+356 i386 memfd_create sys_memfd_create __sys_ia32_memfd_create
+357 i386 bpf sys_bpf __sys_ia32_bpf
+358 i386 execveat sys_execveat __compat_sys_ia32_execveat
+359 i386 socket sys_socket __sys_ia32_socket
+360 i386 socketpair sys_socketpair __sys_ia32_socketpair
+361 i386 bind sys_bind __sys_ia32_bind
+362 i386 connect sys_connect __sys_ia32_connect
+363 i386 listen sys_listen __sys_ia32_listen
+364 i386 accept4 sys_accept4 __sys_ia32_accept4
+365 i386 getsockopt sys_getsockopt __compat_sys_ia32_getsockopt
+366 i386 setsockopt sys_setsockopt __compat_sys_ia32_setsockopt
+367 i386 getsockname sys_getsockname __sys_ia32_getsockname
+368 i386 getpeername sys_getpeername __sys_ia32_getpeername
+369 i386 sendto sys_sendto __sys_ia32_sendto
+370 i386 sendmsg sys_sendmsg __compat_sys_ia32_sendmsg
+371 i386 recvfrom sys_recvfrom __compat_sys_ia32_recvfrom
+372 i386 recvmsg sys_recvmsg __compat_sys_ia32_recvmsg
+373 i386 shutdown sys_shutdown __sys_ia32_shutdown
+374 i386 userfaultfd sys_userfaultfd __sys_ia32_userfaultfd
+375 i386 membarrier sys_membarrier __sys_ia32_membarrier
+376 i386 mlock2 sys_mlock2 __sys_ia32_mlock2
+377 i386 copy_file_range sys_copy_file_range __sys_ia32_copy_file_range
+378 i386 preadv2 sys_preadv2 __compat_sys_ia32_preadv2
+379 i386 pwritev2 sys_pwritev2 __compat_sys_ia32_pwritev2
+380 i386 pkey_mprotect sys_pkey_mprotect __sys_ia32_pkey_mprotect
+381 i386 pkey_alloc sys_pkey_alloc __sys_ia32_pkey_alloc
+382 i386 pkey_free sys_pkey_free __sys_ia32_pkey_free
+383 i386 statx sys_statx __sys_ia32_statx
+384 i386 arch_prctl sys_arch_prctl __compat_sys_ia32_arch_prctl
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 5aef183e2f85..a83c0f7f462f 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -342,41 +342,43 @@
#
# x32-specific system call numbers start at 512 to avoid cache impact
-# for native 64-bit operation.
+# for native 64-bit operation. The __compat_sys_x32 stubs are created
+# on-the-fly for compat_sys_*() compatibility system calls if X86_X32
+# is defined.
#
-512 x32 rt_sigaction compat_sys_rt_sigaction
+512 x32 rt_sigaction __compat_sys_x32_rt_sigaction
513 x32 rt_sigreturn sys32_x32_rt_sigreturn
-514 x32 ioctl compat_sys_ioctl
-515 x32 readv compat_sys_readv
-516 x32 writev compat_sys_writev
-517 x32 recvfrom compat_sys_recvfrom
-518 x32 sendmsg compat_sys_sendmsg
-519 x32 recvmsg compat_sys_recvmsg
-520 x32 execve compat_sys_execve/ptregs
-521 x32 ptrace compat_sys_ptrace
-522 x32 rt_sigpending compat_sys_rt_sigpending
-523 x32 rt_sigtimedwait compat_sys_rt_sigtimedwait
-524 x32 rt_sigqueueinfo compat_sys_rt_sigqueueinfo
-525 x32 sigaltstack compat_sys_sigaltstack
-526 x32 timer_create compat_sys_timer_create
-527 x32 mq_notify compat_sys_mq_notify
-528 x32 kexec_load compat_sys_kexec_load
-529 x32 waitid compat_sys_waitid
-530 x32 set_robust_list compat_sys_set_robust_list
-531 x32 get_robust_list compat_sys_get_robust_list
-532 x32 vmsplice compat_sys_vmsplice
-533 x32 move_pages compat_sys_move_pages
-534 x32 preadv compat_sys_preadv64
-535 x32 pwritev compat_sys_pwritev64
-536 x32 rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
-537 x32 recvmmsg compat_sys_recvmmsg
-538 x32 sendmmsg compat_sys_sendmmsg
-539 x32 process_vm_readv compat_sys_process_vm_readv
-540 x32 process_vm_writev compat_sys_process_vm_writev
-541 x32 setsockopt compat_sys_setsockopt
-542 x32 getsockopt compat_sys_getsockopt
-543 x32 io_setup compat_sys_io_setup
-544 x32 io_submit compat_sys_io_submit
-545 x32 execveat compat_sys_execveat/ptregs
-546 x32 preadv2 compat_sys_preadv64v2
-547 x32 pwritev2 compat_sys_pwritev64v2
+514 x32 ioctl __compat_sys_x32_ioctl
+515 x32 readv __compat_sys_x32_readv
+516 x32 writev __compat_sys_x32_writev
+517 x32 recvfrom __compat_sys_x32_recvfrom
+518 x32 sendmsg __compat_sys_x32_sendmsg
+519 x32 recvmsg __compat_sys_x32_recvmsg
+520 x32 execve __compat_sys_x32_execve/ptregs
+521 x32 ptrace __compat_sys_x32_ptrace
+522 x32 rt_sigpending __compat_sys_x32_rt_sigpending
+523 x32 rt_sigtimedwait __compat_sys_x32_rt_sigtimedwait
+524 x32 rt_sigqueueinfo __compat_sys_x32_rt_sigqueueinfo
+525 x32 sigaltstack __compat_sys_x32_sigaltstack
+526 x32 timer_create __compat_sys_x32_timer_create
+527 x32 mq_notify __compat_sys_x32_mq_notify
+528 x32 kexec_load __compat_sys_x32_kexec_load
+529 x32 waitid __compat_sys_x32_waitid
+530 x32 set_robust_list __compat_sys_x32_set_robust_list
+531 x32 get_robust_list __compat_sys_x32_get_robust_list
+532 x32 vmsplice __compat_sys_x32_vmsplice
+533 x32 move_pages __compat_sys_x32_move_pages
+534 x32 preadv __compat_sys_x32_preadv64
+535 x32 pwritev __compat_sys_x32_pwritev64
+536 x32 rt_tgsigqueueinfo __compat_sys_x32_rt_tgsigqueueinfo
+537 x32 recvmmsg __compat_sys_x32_recvmmsg
+538 x32 sendmmsg __compat_sys_x32_sendmmsg
+539 x32 process_vm_readv __compat_sys_x32_process_vm_readv
+540 x32 process_vm_writev __compat_sys_x32_process_vm_writev
+541 x32 setsockopt __compat_sys_x32_setsockopt
+542 x32 getsockopt __compat_sys_x32_getsockopt
+543 x32 io_setup __compat_sys_x32_io_setup
+544 x32 io_submit __compat_sys_x32_io_submit
+545 x32 execveat __compat_sys_x32_execveat/ptregs
+546 x32 preadv2 __compat_sys_x32_preadv64v2
+547 x32 pwritev2 __compat_sys_x32_pwritev64v2
diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
index 702bdee377af..49d7e4970110 100644
--- a/arch/x86/include/asm/syscall_wrapper.h
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -6,6 +6,111 @@
#ifndef _ASM_X86_SYSCALL_WRAPPER_H
#define _ASM_X86_SYSCALL_WRAPPER_H
+/* Mapping of registers to parameters for syscalls on x86-64 and x32 */
+#define SC_X86_64_REGS_TO_ARGS(x, ...) \
+ __MAP(x,__SC_ARGS \
+ ,,regs->di,,regs->si,,regs->dx \
+ ,,regs->r10,,regs->r8,,regs->r9) \
+
+/* Mapping of registers to parameters for syscalls on i386 */
+#define SC_IA32_REGS_TO_ARGS(x, ...) \
+ __MAP(x,__SC_ARGS \
+ ,,(unsigned int)regs->bx,,(unsigned int)regs->cx \
+ ,,(unsigned int)regs->dx,,(unsigned int)regs->si \
+ ,,(unsigned int)regs->di,,(unsigned int)regs->bp)
+
+#ifdef CONFIG_IA32_EMULATION
+/*
+ * For IA32 emulation, we need to handle "compat" syscalls *and* create
+ * additional wrappers (aptly named __sys_ia32_sys_xyzzy) which decode the
+ * ia32 regs in the proper order for shared or "common" syscalls. As some
+ * syscalls may not be implemented, we need to expand COND_SYSCALL in
+ * kernel/sys_ni.c and SYS_NI in kernel/time/posix-stubs.c to cover this
+ * case as well.
+ */
+#define COMPAT_SC_IA32_STUBx(x, name, ...) \
+ asmlinkage long __compat_sys_ia32##name(const struct pt_regs *regs);\
+ ALLOW_ERROR_INJECTION(__compat_sys_ia32##name, ERRNO); \
+ asmlinkage long __compat_sys_ia32##name(const struct pt_regs *regs)\
+ { \
+ return c_SyS##name(SC_IA32_REGS_TO_ARGS(x,__VA_ARGS__));\
+ } \
+
+#define SC_IA32_WRAPPERx(x, name, ...) \
+ asmlinkage long __sys_ia32##name(const struct pt_regs *regs); \
+ ALLOW_ERROR_INJECTION(__sys_ia32##name, ERRNO); \
+ asmlinkage long __sys_ia32##name(const struct pt_regs *regs) \
+ { \
+ return SyS##name(SC_IA32_REGS_TO_ARGS(x,__VA_ARGS__)); \
+ }
+
+#define COND_SYSCALL(name) \
+ cond_syscall(sys_##name); \
+ cond_syscall(__sys_ia32_##name)
+
+#define SYS_NI(name) \
+ SYSCALL_ALIAS(sys_##name, sys_ni_posix_timers); \
+ SYSCALL_ALIAS(__sys_ia32_##name, sys_ni_posix_timers)
+
+#else /* CONFIG_IA32_EMULATION */
+#define COMPAT_SC_IA32_STUBx(x, name, ...)
+#define SC_IA32_WRAPPERx(x, fullname, name, ...)
+#endif /* CONFIG_IA32_EMULATION */
+
+
+#ifdef CONFIG_X86_X32
+/*
+ * For the x32 ABI, we need to create a stub for compat_sys_*() which is aware
+ * of the x86-64-style parameter ordering of x32 syscalls. The syscalls common
+ * with x86_64 obviously do not need such care.
+ */
+#define COMPAT_SC_X32_STUBx(x, name, ...) \
+ asmlinkage long __compat_sys_x32##name(const struct pt_regs *regs);\
+ ALLOW_ERROR_INJECTION(__compat_sys_x32##name, ERRNO); \
+ asmlinkage long __compat_sys_x32##name(const struct pt_regs *regs)\
+ { \
+ return c_SyS##name(SC_X86_64_REGS_TO_ARGS(x,__VA_ARGS__));\
+ } \
+
+#else /* CONFIG_X86_X32 */
+#define COMPAT_SC_X32_STUBx(x, name, ...)
+#endif /* CONFIG_X86_X32 */
+
+
+#ifdef CONFIG_COMPAT
+/*
+ * Compat means IA32_EMULATION and/or X86_X32. As they use a different
+ * mapping of registers to parameters, we need to generate stubs for each
+ * of them. There is no need to implement COMPAT_SYSCALL_DEFINE0, as it is
+ * unused on x86.
+ */
+#define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
+ static long c_SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
+ static inline long C_SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
+ COMPAT_SC_IA32_STUBx(x, name, __VA_ARGS__) \
+ COMPAT_SC_X32_STUBx(x, name, __VA_ARGS__) \
+ static long c_SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
+ { \
+ return C_SYSC##name(__MAP(x,__SC_DELOUSE,__VA_ARGS__)); \
+ } \
+ static inline long C_SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+
+/*
+ * As some compat syscalls may not be implemented, we need to expand
+ * COND_SYSCALL_COMPAT in kernel/sys_ni.c and COMPAT_SYS_NI in
+ * kernel/time/posix-stubs.c to cover this case as well.
+ */
+#define COND_SYSCALL_COMPAT(name) \
+ cond_syscall(__compat_sys_ia32_##name); \
+ cond_syscall(__compat_sys_x32_##name)
+
+#define COMPAT_SYS_NI(name) \
+ SYSCALL_ALIAS(__compat_sys_ia32_##name, sys_ni_posix_timers); \
+ SYSCALL_ALIAS(__compat_sys_x32_##name, sys_ni_posix_timers)
+
+#endif /* CONFIG_COMPAT */
+
+
/*
* Instead of the generic __SYSCALL_DEFINEx() definition, this macro takes
* struct pt_regs *regs as the only argument of the syscall stub named
@@ -34,9 +139,14 @@
* This approach avoids leaking random user-provided register content down
* the call chain.
*
+ * If IA32_EMULATION is enabled, this macro generates an additional wrapper
+ * named __sys_ia32_*() which decodes the struct pt_regs *regs according
+ * to the i386 calling convention (bx, cx, dx, si, di, bp).
+ *
* As the generic SYSCALL_DEFINE0() macro does not decode any parameters for
* obvious reasons, and passing struct pt_regs *regs to it in %rdi does not
- * hurt, there is no need to override it.
+ * hurt, there is no need to override it, or to define it differently for
+ * IA32_EMULATION.
*/
#define __SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long sys##name(const struct pt_regs *regs); \
@@ -45,10 +155,9 @@
static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
asmlinkage long sys##name(const struct pt_regs *regs) \
{ \
- return SyS##name(__MAP(x,__SC_ARGS \
- ,,regs->di,,regs->si,,regs->dx \
- ,,regs->r10,,regs->r8,,regs->r9)); \
+ return SyS##name(SC_X86_64_REGS_TO_ARGS(x,__VA_ARGS__));\
} \
+ SC_IA32_WRAPPERx(x, name, __VA_ARGS__) \
static long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
{ \
long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
--
2.16.3
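For readers following the syscall_wrapper.h changes above, here is a minimal user-space sketch of the same idea: one macro emits a native stub and an ia32 stub around a shared body, and each stub decodes its arguments from a register structure in its own order. Everything prefixed mock_ or MOCK_ is hypothetical and exists only for this illustration; the sketch is fixed at three arguments and does not reproduce the kernel's __MAP() machinery, ALLOW_ERROR_INJECTION() or the SyS/SYSC split.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the kernel's struct pt_regs. */
struct mock_pt_regs {
	uint64_t bx, cx, dx, si, di, bp;
};

/* x86-64 and x32 take their first three arguments from di, si, dx. */
#define MOCK_X86_64_ARGS3(regs) \
	(regs)->di, (regs)->si, (regs)->dx

/* i386 uses bx, cx, dx; each value is truncated to 32 bits first. */
#define MOCK_IA32_ARGS3(regs) \
	(unsigned int)(regs)->bx, (unsigned int)(regs)->cx, \
	(unsigned int)(regs)->dx

/*
 * Emit a native stub and an ia32 stub around one shared body, loosely
 * following the shape of __SYSCALL_DEFINEx() and SC_IA32_WRAPPERx().
 */
#define MOCK_SYSCALL_DEFINE3(name)					\
	static long long mock_impl_##name(uint64_t a, uint64_t b,	\
					  uint64_t c);			\
	static long long mock_sys_##name(const struct mock_pt_regs *regs) \
	{								\
		return mock_impl_##name(MOCK_X86_64_ARGS3(regs));	\
	}								\
	static long long mock_sys_ia32_##name(const struct mock_pt_regs *regs) \
	{								\
		return mock_impl_##name(MOCK_IA32_ARGS3(regs));		\
	}								\
	static long long mock_impl_##name(uint64_t a, uint64_t b, uint64_t c)

/* Hypothetical syscall body: just adds its three arguments. */
MOCK_SYSCALL_DEFINE3(xyzzy)
{
	return (long long)(a + b + c);
}

/* Decode one register file per convention and check both results. */
static int mock_demo(void)
{
	struct mock_pt_regs native = { .di = 10, .si = 20, .dx = 3 };
	/* The upper 32 bits of bx are stale and are dropped by the cast. */
	struct mock_pt_regs ia32 = {
		.bx = 0xdeadbeef00000001ULL, .cx = 2, .dx = 3,
	};

	assert(mock_sys_xyzzy(&native) == 33);
	assert(mock_sys_ia32_xyzzy(&ia32) == 6);
	return 0;
}
```

The (unsigned int) casts in the ia32 mapping mirror SC_IA32_REGS_TO_ARGS(): a 32-bit caller can only have set the low half of each register, so the sketch discards the upper half before the shared body sees the value.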
* [tip:x86/asm] syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32
2018-04-05 9:53 ` [PATCH 5/8] syscalls/x86: use struct pt_regs based syscall calling for IA32_EMULATION and x32 Dominik Brodowski
@ 2018-04-06 17:12 ` tip-bot for Dominik Brodowski
0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Dominik Brodowski @ 2018-04-06 17:12 UTC (permalink / raw)
To: linux-tip-commits
Cc: peterz, mingo, dvlasenk, linux-kernel, viro, luto, hpa, bp,
jpoimboe, tglx, akpm, brgerst, linux, torvalds
Commit-ID: ebeb8c82ffaf94435806ff0b686fffd41dd410b5
Gitweb: https://git.kernel.org/tip/ebeb8c82ffaf94435806ff0b686fffd41dd410b5
Author: Dominik Brodowski <linux@dominikbrodowski.net>
AuthorDate: Thu, 5 Apr 2018 11:53:04 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 5 Apr 2018 16:59:38 +0200
syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32
Extend ARCH_HAS_SYSCALL_WRAPPER for i386 emulation and for x32 on 64-bit
x86.
For x32, all we need to do is to create an additional stub for each
compat syscall which decodes the parameters in x86-64 ordering, e.g.:
	asmlinkage long __compat_sys_x32_xyzzy(struct pt_regs *regs)
	{
		return c_SyS_xyzzy(regs->di, regs->si, regs->dx);
	}
For i386 emulation, we need to teach compat_sys_*() to take struct
pt_regs as its only argument, e.g.:
	asmlinkage long __compat_sys_ia32_xyzzy(struct pt_regs *regs)
	{
		return c_SyS_xyzzy(regs->bx, regs->cx, regs->dx);
	}
In addition, we need to create additional stubs for common syscalls
(that is, for syscalls which have the same parameters on 32-bit and
64-bit), e.g.:
	asmlinkage long __sys_ia32_xyzzy(struct pt_regs *regs)
	{
		return SyS_xyzzy(regs->bx, regs->cx, regs->dx);
	}
This approach avoids leaking random user-provided register content down
the call chain.
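As a rough illustration of that last point, the following user-space sketch dispatches through a table whose entries all take only a register structure, so each stub decides which registers are forwarded. All names prefixed mock_ or MOCK_ are hypothetical, the syscall number and the close-like body are invented for the example, and the out-of-range fallback is a simplification made here, not a description of what the kernel entry code does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the kernel's struct pt_regs. */
struct mock_pt_regs {
	uint64_t ax, bx, cx, dx;
};

/* Every table entry has the same shape: it receives only the regs. */
typedef long long (*mock_sys_call_ptr_t)(const struct mock_pt_regs *);

/* Hypothetical one-argument syscall body. */
static long long mock_do_close(unsigned int fd)
{
	return fd == 3 ? 0 : -EBADF;
}

/* The stub forwards bx only; cx and dx never reach mock_do_close(). */
static long long mock_sys_ia32_close(const struct mock_pt_regs *regs)
{
	return mock_do_close((unsigned int)regs->bx);
}

/* Placeholder entry for an unimplemented slot. */
static long long mock_sys_ni_syscall(const struct mock_pt_regs *regs)
{
	(void)regs;
	return -ENOSYS;
}

static const mock_sys_call_ptr_t mock_ia32_table[] = {
	[0] = mock_sys_ni_syscall,
	[1] = mock_sys_ia32_close,
};

#define MOCK_NR_SYSCALLS \
	(sizeof(mock_ia32_table) / sizeof(mock_ia32_table[0]))

/* Look up the stub by number and store its result back into ax. */
static void mock_do_syscall_32(struct mock_pt_regs *regs)
{
	uint64_t nr = regs->ax;

	if (nr < MOCK_NR_SYSCALLS)
		regs->ax = (uint64_t)mock_ia32_table[nr](regs);
	else
		regs->ax = (uint64_t)(long long)-ENOSYS; /* sketch-only fallback */
}

/* Issue the hypothetical "close" with leftover values in other registers. */
static int mock_demo(void)
{
	struct mock_pt_regs regs = {
		.ax = 1,			/* hypothetical number for "close" */
		.bx = 0xffffffff00000003ULL,	/* fd 3 plus stale upper bits */
		.cx = 0x1234, .dx = 0x5678,	/* leftovers, never forwarded */
	};

	mock_do_syscall_32(&regs);
	assert(regs.ax == 0);
	return 0;
}
```

With the older six-argument calling style the dispatcher would have passed bx, cx, dx, si, di and bp to every entry regardless of how many arguments it takes; here the one-argument stub reads bx and nothing else.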
This patch is based on an original proof-of-concept
| From: Linus Torvalds <torvalds@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
and was split up and heavily modified by me, in particular to base it on
ARCH_HAS_SYSCALL_WRAPPER.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180405095307.3730-6-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/Kconfig | 2 +-
arch/x86/entry/common.c | 4 +
arch/x86/entry/syscall_32.c | 15 +-
arch/x86/entry/syscalls/syscall_32.tbl | 677 +++++++++++++++++----------------
arch/x86/entry/syscalls/syscall_64.tbl | 74 ++--
arch/x86/include/asm/syscall_wrapper.h | 117 +++++-
6 files changed, 509 insertions(+), 380 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 67348efc2540..7bbd6a174722 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2957,5 +2957,5 @@ source "lib/Kconfig"
config SYSCALL_PTREGS
def_bool y
- depends on X86_64 && !COMPAT
+ depends on X86_64
select ARCH_HAS_SYSCALL_WRAPPER
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index e1b91bffa988..425f798b39e3 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -325,6 +325,9 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
if (likely(nr < IA32_NR_syscalls)) {
nr = array_index_nospec(nr, IA32_NR_syscalls);
+#ifdef CONFIG_SYSCALL_PTREGS
+ regs->ax = ia32_sys_call_table[nr](regs);
+#else
/*
* It's possible that a 32-bit syscall implementation
* takes a 64-bit parameter but nonetheless assumes that
@@ -335,6 +338,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
(unsigned int)regs->bx, (unsigned int)regs->cx,
(unsigned int)regs->dx, (unsigned int)regs->si,
(unsigned int)regs->di, (unsigned int)regs->bp);
+#endif /* CONFIG_SYSCALL_PTREGS */
}
syscall_return_slowpath(regs);
diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
index 95c294963612..47060dd8efb1 100644
--- a/arch/x86/entry/syscall_32.c
+++ b/arch/x86/entry/syscall_32.c
@@ -7,14 +7,23 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
-#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) ;
+#ifdef CONFIG_SYSCALL_PTREGS
+/* On X86_64, we use struct pt_regs * to pass parameters to syscalls */
+#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *);
+
+/* this is a lie, but it does not hurt as sys_ni_syscall just returns -EINVAL */
+extern asmlinkage long sys_ni_syscall(const struct pt_regs *);
+
+#else /* CONFIG_SYSCALL_PTREGS */
+#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
+extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
+#endif /* CONFIG_SYSCALL_PTREGS */
+
#include <asm/syscalls_32.h>
#undef __SYSCALL_I386
#define __SYSCALL_I386(nr, sym, qual) [nr] = sym,
-extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-
__visible const sys_call_ptr_t ia32_sys_call_table[__NR_syscall_compat_max+1] = {
/*
* Smells like a compiler bug -- it doesn't work
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index c58f75b088c5..7f09a3da0b3d 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -4,390 +4,395 @@
# The format is:
# <number> <abi> <name> <entry point> <compat entry point>
#
+# The __sys_ia32 and __compat_sys_ia32 stubs are created on-the-fly for
+# sys_*() system calls and compat_sys_*() compat system calls if
+# IA32_EMULATION is defined, and expect struct pt_regs *regs as their only
+# parameter.
+#
# The abi is always "i386" for this file.
#
0 i386 restart_syscall sys_restart_syscall
-1 i386 exit sys_exit
+1 i386 exit sys_exit __sys_ia32_exit
2 i386 fork sys_fork
-3 i386 read sys_read
-4 i386 write sys_write
-5 i386 open sys_open compat_sys_open
-6 i386 close sys_close
-7 i386 waitpid sys_waitpid
-8 i386 creat sys_creat
-9 i386 link sys_link
-10 i386 unlink sys_unlink
-11 i386 execve sys_execve compat_sys_execve
-12 i386 chdir sys_chdir
-13 i386 time sys_time compat_sys_time
-14 i386 mknod sys_mknod
-15 i386 chmod sys_chmod
-16 i386 lchown sys_lchown16
+3 i386 read sys_read __sys_ia32_read
+4 i386 write sys_write __sys_ia32_write
+5 i386 open sys_open __compat_sys_ia32_open
+6 i386 close sys_close __sys_ia32_close
+7 i386 waitpid sys_waitpid __sys_ia32_waitpid
+8 i386 creat sys_creat __sys_ia32_creat
+9 i386 link sys_link __sys_ia32_link
+10 i386 unlink sys_unlink __sys_ia32_unlink
+11 i386 execve sys_execve __compat_sys_ia32_execve
+12 i386 chdir sys_chdir __sys_ia32_chdir
+13 i386 time sys_time __compat_sys_ia32_time
+14 i386 mknod sys_mknod __sys_ia32_mknod
+15 i386 chmod sys_chmod __sys_ia32_chmod
+16 i386 lchown sys_lchown16 __sys_ia32_lchown16
17 i386 break
-18 i386 oldstat sys_stat
-19 i386 lseek sys_lseek compat_sys_lseek
+18 i386 oldstat sys_stat __sys_ia32_stat
+19 i386 lseek sys_lseek __compat_sys_ia32_lseek
20 i386 getpid sys_getpid
-21 i386 mount sys_mount compat_sys_mount
-22 i386 umount sys_oldumount
-23 i386 setuid sys_setuid16
+21 i386 mount sys_mount __compat_sys_ia32_mount
+22 i386 umount sys_oldumount __sys_ia32_oldumount
+23 i386 setuid sys_setuid16 __sys_ia32_setuid16
24 i386 getuid sys_getuid16
-25 i386 stime sys_stime compat_sys_stime
-26 i386 ptrace sys_ptrace compat_sys_ptrace
-27 i386 alarm sys_alarm
-28 i386 oldfstat sys_fstat
+25 i386 stime sys_stime __compat_sys_ia32_stime
+26 i386 ptrace sys_ptrace __compat_sys_ia32_ptrace
+27 i386 alarm sys_alarm __sys_ia32_alarm
+28 i386 oldfstat sys_fstat __sys_ia32_fstat
29 i386 pause sys_pause
-30 i386 utime sys_utime compat_sys_utime
+30 i386 utime sys_utime __compat_sys_ia32_utime
31 i386 stty
32 i386 gtty
-33 i386 access sys_access
-34 i386 nice sys_nice
+33 i386 access sys_access __sys_ia32_access
+34 i386 nice sys_nice __sys_ia32_nice
35 i386 ftime
36 i386 sync sys_sync
-37 i386 kill sys_kill
-38 i386 rename sys_rename
-39 i386 mkdir sys_mkdir
-40 i386 rmdir sys_rmdir
-41 i386 dup sys_dup
-42 i386 pipe sys_pipe
-43 i386 times sys_times compat_sys_times
+37 i386 kill sys_kill __sys_ia32_kill
+38 i386 rename sys_rename __sys_ia32_rename
+39 i386 mkdir sys_mkdir __sys_ia32_mkdir
+40 i386 rmdir sys_rmdir __sys_ia32_rmdir
+41 i386 dup sys_dup __sys_ia32_dup
+42 i386 pipe sys_pipe __sys_ia32_pipe
+43 i386 times sys_times __compat_sys_ia32_times
44 i386 prof
-45 i386 brk sys_brk
-46 i386 setgid sys_setgid16
+45 i386 brk sys_brk __sys_ia32_brk
+46 i386 setgid sys_setgid16 __sys_ia32_setgid16
47 i386 getgid sys_getgid16
-48 i386 signal sys_signal
+48 i386 signal sys_signal __sys_ia32_signal
49 i386 geteuid sys_geteuid16
50 i386 getegid sys_getegid16
-51 i386 acct sys_acct
-52 i386 umount2 sys_umount
+51 i386 acct sys_acct __sys_ia32_acct
+52 i386 umount2 sys_umount __sys_ia32_umount
53 i386 lock
-54 i386 ioctl sys_ioctl compat_sys_ioctl
-55 i386 fcntl sys_fcntl compat_sys_fcntl64
+54 i386 ioctl sys_ioctl __compat_sys_ia32_ioctl
+55 i386 fcntl sys_fcntl __compat_sys_ia32_fcntl64
56 i386 mpx
-57 i386 setpgid sys_setpgid
+57 i386 setpgid sys_setpgid __sys_ia32_setpgid
58 i386 ulimit
-59 i386 oldolduname sys_olduname
-60 i386 umask sys_umask
-61 i386 chroot sys_chroot
-62 i386 ustat sys_ustat compat_sys_ustat
-63 i386 dup2 sys_dup2
+59 i386 oldolduname sys_olduname __sys_ia32_olduname
+60 i386 umask sys_umask __sys_ia32_umask
+61 i386 chroot sys_chroot __sys_ia32_chroot
+62 i386 ustat sys_ustat __compat_sys_ia32_ustat
+63 i386 dup2 sys_dup2 __sys_ia32_dup2
64 i386 getppid sys_getppid
65 i386 getpgrp sys_getpgrp
66 i386 setsid sys_setsid
-67 i386 sigaction sys_sigaction compat_sys_sigaction
+67 i386 sigaction sys_sigaction __compat_sys_ia32_sigaction
68 i386 sgetmask sys_sgetmask
-69 i386 ssetmask sys_ssetmask
-70 i386 setreuid sys_setreuid16
-71 i386 setregid sys_setregid16
-72 i386 sigsuspend sys_sigsuspend
-73 i386 sigpending sys_sigpending compat_sys_sigpending
-74 i386 sethostname sys_sethostname
-75 i386 setrlimit sys_setrlimit compat_sys_setrlimit
-76 i386 getrlimit sys_old_getrlimit compat_sys_old_getrlimit
-77 i386 getrusage sys_getrusage compat_sys_getrusage
-78 i386 gettimeofday sys_gettimeofday compat_sys_gettimeofday
-79 i386 settimeofday sys_settimeofday compat_sys_settimeofday
-80 i386 getgroups sys_getgroups16
-81 i386 setgroups sys_setgroups16
-82 i386 select sys_old_select compat_sys_old_select
-83 i386 symlink sys_symlink
-84 i386 oldlstat sys_lstat
-85 i386 readlink sys_readlink
-86 i386 uselib sys_uselib
-87 i386 swapon sys_swapon
-88 i386 reboot sys_reboot
-89 i386 readdir sys_old_readdir compat_sys_old_readdir
-90 i386 mmap sys_old_mmap compat_sys_x86_mmap
-91 i386 munmap sys_munmap
-92 i386 truncate sys_truncate compat_sys_truncate
-93 i386 ftruncate sys_ftruncate compat_sys_ftruncate
-94 i386 fchmod sys_fchmod
-95 i386 fchown sys_fchown16
-96 i386 getpriority sys_getpriority
-97 i386 setpriority sys_setpriority
+69 i386 ssetmask sys_ssetmask __sys_ia32_ssetmask
+70 i386 setreuid sys_setreuid16 __sys_ia32_setreuid16
+71 i386 setregid sys_setregid16 __sys_ia32_setregid16
+72 i386 sigsuspend sys_sigsuspend __sys_ia32_sigsuspend
+73 i386 sigpending sys_sigpending __compat_sys_ia32_sigpending
+74 i386 sethostname sys_sethostname __sys_ia32_sethostname
+75 i386 setrlimit sys_setrlimit __compat_sys_ia32_setrlimit
+76 i386 getrlimit sys_old_getrlimit __compat_sys_ia32_old_getrlimit
+77 i386 getrusage sys_getrusage __compat_sys_ia32_getrusage
+78 i386 gettimeofday sys_gettimeofday __compat_sys_ia32_gettimeofday
+79 i386 settimeofday sys_settimeofday __compat_sys_ia32_settimeofday
+80 i386 getgroups sys_getgroups16 __sys_ia32_getgroups16
+81 i386 setgroups sys_setgroups16 __sys_ia32_setgroups16
+82 i386 select sys_old_select __compat_sys_ia32_old_select
+83 i386 symlink sys_symlink __sys_ia32_symlink
+84 i386 oldlstat sys_lstat __sys_ia32_lstat
+85 i386 readlink sys_readlink __sys_ia32_readlink
+86 i386 uselib sys_uselib __sys_ia32_uselib
+87 i386 swapon sys_swapon __sys_ia32_swapon
+88 i386 reboot sys_reboot __sys_ia32_reboot
+89 i386 readdir sys_old_readdir __compat_sys_ia32_old_readdir
+90 i386 mmap sys_old_mmap __compat_sys_ia32_x86_mmap
+91 i386 munmap sys_munmap __sys_ia32_munmap
+92 i386 truncate sys_truncate __compat_sys_ia32_truncate
+93 i386 ftruncate sys_ftruncate __compat_sys_ia32_ftruncate
+94 i386 fchmod sys_fchmod __sys_ia32_fchmod
+95 i386 fchown sys_fchown16 __sys_ia32_fchown16
+96 i386 getpriority sys_getpriority __sys_ia32_getpriority
+97 i386 setpriority sys_setpriority __sys_ia32_setpriority
98 i386 profil
-99 i386 statfs sys_statfs compat_sys_statfs
-100 i386 fstatfs sys_fstatfs compat_sys_fstatfs
-101 i386 ioperm sys_ioperm
-102 i386 socketcall sys_socketcall compat_sys_socketcall
-103 i386 syslog sys_syslog
-104 i386 setitimer sys_setitimer compat_sys_setitimer
-105 i386 getitimer sys_getitimer compat_sys_getitimer
-106 i386 stat sys_newstat compat_sys_newstat
-107 i386 lstat sys_newlstat compat_sys_newlstat
-108 i386 fstat sys_newfstat compat_sys_newfstat
-109 i386 olduname sys_uname
-110 i386 iopl sys_iopl
+99 i386 statfs sys_statfs __compat_sys_ia32_statfs
+100 i386 fstatfs sys_fstatfs __compat_sys_ia32_fstatfs
+101 i386 ioperm sys_ioperm __sys_ia32_ioperm
+102 i386 socketcall sys_socketcall __compat_sys_ia32_socketcall
+103 i386 syslog sys_syslog __sys_ia32_syslog
+104 i386 setitimer sys_setitimer __compat_sys_ia32_setitimer
+105 i386 getitimer sys_getitimer __compat_sys_ia32_getitimer
+106 i386 stat sys_newstat __compat_sys_ia32_newstat
+107 i386 lstat sys_newlstat __compat_sys_ia32_newlstat
+108 i386 fstat sys_newfstat __compat_sys_ia32_newfstat
+109 i386 olduname sys_uname __sys_ia32_uname
+110 i386 iopl sys_iopl __sys_ia32_iopl
111 i386 vhangup sys_vhangup
112 i386 idle
113 i386 vm86old sys_vm86old sys_ni_syscall
-114 i386 wait4 sys_wait4 compat_sys_wait4
-115 i386 swapoff sys_swapoff
-116 i386 sysinfo sys_sysinfo compat_sys_sysinfo
-117 i386 ipc sys_ipc compat_sys_ipc
-118 i386 fsync sys_fsync
+114 i386 wait4 sys_wait4 __compat_sys_ia32_wait4
+115 i386 swapoff sys_swapoff __sys_ia32_swapoff
+116 i386 sysinfo sys_sysinfo __compat_sys_ia32_sysinfo
+117 i386 ipc sys_ipc __compat_sys_ia32_ipc
+118 i386 fsync sys_fsync __sys_ia32_fsync
119 i386 sigreturn sys_sigreturn sys32_sigreturn
-120 i386 clone sys_clone compat_sys_x86_clone
-121 i386 setdomainname sys_setdomainname
-122 i386 uname sys_newuname
-123 i386 modify_ldt sys_modify_ldt
-124 i386 adjtimex sys_adjtimex compat_sys_adjtimex
-125 i386 mprotect sys_mprotect
-126 i386 sigprocmask sys_sigprocmask compat_sys_sigprocmask
+120 i386 clone sys_clone __compat_sys_ia32_x86_clone
+121 i386 setdomainname sys_setdomainname __sys_ia32_setdomainname
+122 i386 uname sys_newuname __sys_ia32_newuname
+123 i386 modify_ldt sys_modify_ldt __sys_ia32_modify_ldt
+124 i386 adjtimex sys_adjtimex __compat_sys_ia32_adjtimex
+125 i386 mprotect sys_mprotect __sys_ia32_mprotect
+126 i386 sigprocmask sys_sigprocmask __compat_sys_ia32_sigprocmask
127 i386 create_module
-128 i386 init_module sys_init_module
-129 i386 delete_module sys_delete_module
+128 i386 init_module sys_init_module __sys_ia32_init_module
+129 i386 delete_module sys_delete_module __sys_ia32_delete_module
130 i386 get_kernel_syms
-131 i386 quotactl sys_quotactl compat_sys_quotactl32
-132 i386 getpgid sys_getpgid
-133 i386 fchdir sys_fchdir
-134 i386 bdflush sys_bdflush
-135 i386 sysfs sys_sysfs
-136 i386 personality sys_personality
+131 i386 quotactl sys_quotactl __compat_sys_ia32_quotactl32
+132 i386 getpgid sys_getpgid __sys_ia32_getpgid
+133 i386 fchdir sys_fchdir __sys_ia32_fchdir
+134 i386 bdflush sys_bdflush __sys_ia32_bdflush
+135 i386 sysfs sys_sysfs __sys_ia32_sysfs
+136 i386 personality sys_personality __sys_ia32_personality
137 i386 afs_syscall
-138 i386 setfsuid sys_setfsuid16
-139 i386 setfsgid sys_setfsgid16
-140 i386 _llseek sys_llseek
-141 i386 getdents sys_getdents compat_sys_getdents
-142 i386 _newselect sys_select compat_sys_select
-143 i386 flock sys_flock
-144 i386 msync sys_msync
-145 i386 readv sys_readv compat_sys_readv
-146 i386 writev sys_writev compat_sys_writev
-147 i386 getsid sys_getsid
-148 i386 fdatasync sys_fdatasync
-149 i386 _sysctl sys_sysctl compat_sys_sysctl
-150 i386 mlock sys_mlock
-151 i386 munlock sys_munlock
-152 i386 mlockall sys_mlockall
+138 i386 setfsuid sys_setfsuid16 __sys_ia32_setfsuid16
+139 i386 setfsgid sys_setfsgid16 __sys_ia32_setfsgid16
+140 i386 _llseek sys_llseek __sys_ia32_llseek
+141 i386 getdents sys_getdents __compat_sys_ia32_getdents
+142 i386 _newselect sys_select __compat_sys_ia32_select
+143 i386 flock sys_flock __sys_ia32_flock
+144 i386 msync sys_msync __sys_ia32_msync
+145 i386 readv sys_readv __compat_sys_ia32_readv
+146 i386 writev sys_writev __compat_sys_ia32_writev
+147 i386 getsid sys_getsid __sys_ia32_getsid
+148 i386 fdatasync sys_fdatasync __sys_ia32_fdatasync
+149 i386 _sysctl sys_sysctl __compat_sys_ia32_sysctl
+150 i386 mlock sys_mlock __sys_ia32_mlock
+151 i386 munlock sys_munlock __sys_ia32_munlock
+152 i386 mlockall sys_mlockall __sys_ia32_mlockall
153 i386 munlockall sys_munlockall
-154 i386 sched_setparam sys_sched_setparam
-155 i386 sched_getparam sys_sched_getparam
-156 i386 sched_setscheduler sys_sched_setscheduler
-157 i386 sched_getscheduler sys_sched_getscheduler
+154 i386 sched_setparam sys_sched_setparam __sys_ia32_sched_setparam
+155 i386 sched_getparam sys_sched_getparam __sys_ia32_sched_getparam
+156 i386 sched_setscheduler sys_sched_setscheduler __sys_ia32_sched_setscheduler
+157 i386 sched_getscheduler sys_sched_getscheduler __sys_ia32_sched_getscheduler
158 i386 sched_yield sys_sched_yield
-159 i386 sched_get_priority_max sys_sched_get_priority_max
-160 i386 sched_get_priority_min sys_sched_get_priority_min
-161 i386 sched_rr_get_interval sys_sched_rr_get_interval compat_sys_sched_rr_get_interval
-162 i386 nanosleep sys_nanosleep compat_sys_nanosleep
-163 i386 mremap sys_mremap
-164 i386 setresuid sys_setresuid16
-165 i386 getresuid sys_getresuid16
+159 i386 sched_get_priority_max sys_sched_get_priority_max __sys_ia32_sched_get_priority_max
+160 i386 sched_get_priority_min sys_sched_get_priority_min __sys_ia32_sched_get_priority_min
+161 i386 sched_rr_get_interval sys_sched_rr_get_interval __compat_sys_ia32_sched_rr_get_interval
+162 i386 nanosleep sys_nanosleep __compat_sys_ia32_nanosleep
+163 i386 mremap sys_mremap __sys_ia32_mremap
+164 i386 setresuid sys_setresuid16 __sys_ia32_setresuid16
+165 i386 getresuid sys_getresuid16 __sys_ia32_getresuid16
166 i386 vm86 sys_vm86 sys_ni_syscall
167 i386 query_module
-168 i386 poll sys_poll
+168 i386 poll sys_poll __sys_ia32_poll
169 i386 nfsservctl
-170 i386 setresgid sys_setresgid16
-171 i386 getresgid sys_getresgid16
-172 i386 prctl sys_prctl
+170 i386 setresgid sys_setresgid16 __sys_ia32_setresgid16
+171 i386 getresgid sys_getresgid16 __sys_ia32_getresgid16
+172 i386 prctl sys_prctl __sys_ia32_prctl
173 i386 rt_sigreturn sys_rt_sigreturn sys32_rt_sigreturn
-174 i386 rt_sigaction sys_rt_sigaction compat_sys_rt_sigaction
-175 i386 rt_sigprocmask sys_rt_sigprocmask
-176 i386 rt_sigpending sys_rt_sigpending compat_sys_rt_sigpending
-177 i386 rt_sigtimedwait sys_rt_sigtimedwait compat_sys_rt_sigtimedwait
-178 i386 rt_sigqueueinfo sys_rt_sigqueueinfo compat_sys_rt_sigqueueinfo
-179 i386 rt_sigsuspend sys_rt_sigsuspend
-180 i386 pread64 sys_pread64 compat_sys_x86_pread
-181 i386 pwrite64 sys_pwrite64 compat_sys_x86_pwrite
-182 i386 chown sys_chown16
-183 i386 getcwd sys_getcwd
-184 i386 capget sys_capget
-185 i386 capset sys_capset
-186 i386 sigaltstack sys_sigaltstack compat_sys_sigaltstack
-187 i386 sendfile sys_sendfile compat_sys_sendfile
+174 i386 rt_sigaction sys_rt_sigaction __compat_sys_ia32_rt_sigaction
+175 i386 rt_sigprocmask sys_rt_sigprocmask __sys_ia32_rt_sigprocmask
+176 i386 rt_sigpending sys_rt_sigpending __compat_sys_ia32_rt_sigpending
+177 i386 rt_sigtimedwait sys_rt_sigtimedwait __compat_sys_ia32_rt_sigtimedwait
+178 i386 rt_sigqueueinfo sys_rt_sigqueueinfo __compat_sys_ia32_rt_sigqueueinfo
+179 i386 rt_sigsuspend sys_rt_sigsuspend __sys_ia32_rt_sigsuspend
+180 i386 pread64 sys_pread64 __compat_sys_ia32_x86_pread
+181 i386 pwrite64 sys_pwrite64 __compat_sys_ia32_x86_pwrite
+182 i386 chown sys_chown16 __sys_ia32_chown16
+183 i386 getcwd sys_getcwd __sys_ia32_getcwd
+184 i386 capget sys_capget __sys_ia32_capget
+185 i386 capset sys_capset __sys_ia32_capset
+186 i386 sigaltstack sys_sigaltstack __compat_sys_ia32_sigaltstack
+187 i386 sendfile sys_sendfile __compat_sys_ia32_sendfile
188 i386 getpmsg
189 i386 putpmsg
190 i386 vfork sys_vfork
-191 i386 ugetrlimit sys_getrlimit compat_sys_getrlimit
-192 i386 mmap2 sys_mmap_pgoff
-193 i386 truncate64 sys_truncate64 compat_sys_x86_truncate64
-194 i386 ftruncate64 sys_ftruncate64 compat_sys_x86_ftruncate64
-195 i386 stat64 sys_stat64 compat_sys_x86_stat64
-196 i386 lstat64 sys_lstat64 compat_sys_x86_lstat64
-197 i386 fstat64 sys_fstat64 compat_sys_x86_fstat64
-198 i386 lchown32 sys_lchown
+191 i386 ugetrlimit sys_getrlimit __compat_sys_ia32_getrlimit
+192 i386 mmap2 sys_mmap_pgoff __sys_ia32_mmap_pgoff
+193 i386 truncate64 sys_truncate64 __compat_sys_ia32_x86_truncate64
+194 i386 ftruncate64 sys_ftruncate64 __compat_sys_ia32_x86_ftruncate64
+195 i386 stat64 sys_stat64 __compat_sys_ia32_x86_stat64
+196 i386 lstat64 sys_lstat64 __compat_sys_ia32_x86_lstat64
+197 i386 fstat64 sys_fstat64 __compat_sys_ia32_x86_fstat64
+198 i386 lchown32 sys_lchown __sys_ia32_lchown
199 i386 getuid32 sys_getuid
200 i386 getgid32 sys_getgid
201 i386 geteuid32 sys_geteuid
202 i386 getegid32 sys_getegid
-203 i386 setreuid32 sys_setreuid
-204 i386 setregid32 sys_setregid
-205 i386 getgroups32 sys_getgroups
-206 i386 setgroups32 sys_setgroups
-207 i386 fchown32 sys_fchown
-208 i386 setresuid32 sys_setresuid
-209 i386 getresuid32 sys_getresuid
-210 i386 setresgid32 sys_setresgid
-211 i386 getresgid32 sys_getresgid
-212 i386 chown32 sys_chown
-213 i386 setuid32 sys_setuid
-214 i386 setgid32 sys_setgid
-215 i386 setfsuid32 sys_setfsuid
-216 i386 setfsgid32 sys_setfsgid
-217 i386 pivot_root sys_pivot_root
-218 i386 mincore sys_mincore
-219 i386 madvise sys_madvise
-220 i386 getdents64 sys_getdents64
-221 i386 fcntl64 sys_fcntl64 compat_sys_fcntl64
+203 i386 setreuid32 sys_setreuid __sys_ia32_setreuid
+204 i386 setregid32 sys_setregid __sys_ia32_setregid
+205 i386 getgroups32 sys_getgroups __sys_ia32_getgroups
+206 i386 setgroups32 sys_setgroups __sys_ia32_setgroups
+207 i386 fchown32 sys_fchown __sys_ia32_fchown
+208 i386 setresuid32 sys_setresuid __sys_ia32_setresuid
+209 i386 getresuid32 sys_getresuid __sys_ia32_getresuid
+210 i386 setresgid32 sys_setresgid __sys_ia32_setresgid
+211 i386 getresgid32 sys_getresgid __sys_ia32_getresgid
+212 i386 chown32 sys_chown __sys_ia32_chown
+213 i386 setuid32 sys_setuid __sys_ia32_setuid
+214 i386 setgid32 sys_setgid __sys_ia32_setgid
+215 i386 setfsuid32 sys_setfsuid __sys_ia32_setfsuid
+216 i386 setfsgid32 sys_setfsgid __sys_ia32_setfsgid
+217 i386 pivot_root sys_pivot_root __sys_ia32_pivot_root
+218 i386 mincore sys_mincore __sys_ia32_mincore
+219 i386 madvise sys_madvise __sys_ia32_madvise
+220 i386 getdents64 sys_getdents64 __sys_ia32_getdents64
+221 i386 fcntl64 sys_fcntl64 __compat_sys_ia32_fcntl64
# 222 is unused
# 223 is unused
224 i386 gettid sys_gettid
-225 i386 readahead sys_readahead compat_sys_x86_readahead
-226 i386 setxattr sys_setxattr
-227 i386 lsetxattr sys_lsetxattr
-228 i386 fsetxattr sys_fsetxattr
-229 i386 getxattr sys_getxattr
-230 i386 lgetxattr sys_lgetxattr
-231 i386 fgetxattr sys_fgetxattr
-232 i386 listxattr sys_listxattr
-233 i386 llistxattr sys_llistxattr
-234 i386 flistxattr sys_flistxattr
-235 i386 removexattr sys_removexattr
-236 i386 lremovexattr sys_lremovexattr
-237 i386 fremovexattr sys_fremovexattr
-238 i386 tkill sys_tkill
-239 i386 sendfile64 sys_sendfile64
-240 i386 futex sys_futex compat_sys_futex
-241 i386 sched_setaffinity sys_sched_setaffinity compat_sys_sched_setaffinity
-242 i386 sched_getaffinity sys_sched_getaffinity compat_sys_sched_getaffinity
-243 i386 set_thread_area sys_set_thread_area
-244 i386 get_thread_area sys_get_thread_area
-245 i386 io_setup sys_io_setup compat_sys_io_setup
-246 i386 io_destroy sys_io_destroy
-247 i386 io_getevents sys_io_getevents compat_sys_io_getevents
-248 i386 io_submit sys_io_submit compat_sys_io_submit
-249 i386 io_cancel sys_io_cancel
-250 i386 fadvise64 sys_fadvise64 compat_sys_x86_fadvise64
+225 i386 readahead sys_readahead __compat_sys_ia32_x86_readahead
+226 i386 setxattr sys_setxattr __sys_ia32_setxattr
+227 i386 lsetxattr sys_lsetxattr __sys_ia32_lsetxattr
+228 i386 fsetxattr sys_fsetxattr __sys_ia32_fsetxattr
+229 i386 getxattr sys_getxattr __sys_ia32_getxattr
+230 i386 lgetxattr sys_lgetxattr __sys_ia32_lgetxattr
+231 i386 fgetxattr sys_fgetxattr __sys_ia32_fgetxattr
+232 i386 listxattr sys_listxattr __sys_ia32_listxattr
+233 i386 llistxattr sys_llistxattr __sys_ia32_llistxattr
+234 i386 flistxattr sys_flistxattr __sys_ia32_flistxattr
+235 i386 removexattr sys_removexattr __sys_ia32_removexattr
+236 i386 lremovexattr sys_lremovexattr __sys_ia32_lremovexattr
+237 i386 fremovexattr sys_fremovexattr __sys_ia32_fremovexattr
+238 i386 tkill sys_tkill __sys_ia32_tkill
+239 i386 sendfile64 sys_sendfile64 __sys_ia32_sendfile64
+240 i386 futex sys_futex __compat_sys_ia32_futex
+241 i386 sched_setaffinity sys_sched_setaffinity __compat_sys_ia32_sched_setaffinity
+242 i386 sched_getaffinity sys_sched_getaffinity __compat_sys_ia32_sched_getaffinity
+243 i386 set_thread_area sys_set_thread_area __sys_ia32_set_thread_area
+244 i386 get_thread_area sys_get_thread_area __sys_ia32_get_thread_area
+245 i386 io_setup sys_io_setup __compat_sys_ia32_io_setup
+246 i386 io_destroy sys_io_destroy __sys_ia32_io_destroy
+247 i386 io_getevents sys_io_getevents __compat_sys_ia32_io_getevents
+248 i386 io_submit sys_io_submit __compat_sys_ia32_io_submit
+249 i386 io_cancel sys_io_cancel __sys_ia32_io_cancel
+250 i386 fadvise64 sys_fadvise64 __compat_sys_ia32_x86_fadvise64
# 251 is available for reuse (was briefly sys_set_zone_reclaim)
-252 i386 exit_group sys_exit_group
-253 i386 lookup_dcookie sys_lookup_dcookie compat_sys_lookup_dcookie
-254 i386 epoll_create sys_epoll_create
-255 i386 epoll_ctl sys_epoll_ctl
-256 i386 epoll_wait sys_epoll_wait
-257 i386 remap_file_pages sys_remap_file_pages
-258 i386 set_tid_address sys_set_tid_address
-259 i386 timer_create sys_timer_create compat_sys_timer_create
-260 i386 timer_settime sys_timer_settime compat_sys_timer_settime
-261 i386 timer_gettime sys_timer_gettime compat_sys_timer_gettime
-262 i386 timer_getoverrun sys_timer_getoverrun
-263 i386 timer_delete sys_timer_delete
-264 i386 clock_settime sys_clock_settime compat_sys_clock_settime
-265 i386 clock_gettime sys_clock_gettime compat_sys_clock_gettime
-266 i386 clock_getres sys_clock_getres compat_sys_clock_getres
-267 i386 clock_nanosleep sys_clock_nanosleep compat_sys_clock_nanosleep
-268 i386 statfs64 sys_statfs64 compat_sys_statfs64
-269 i386 fstatfs64 sys_fstatfs64 compat_sys_fstatfs64
-270 i386 tgkill sys_tgkill
-271 i386 utimes sys_utimes compat_sys_utimes
-272 i386 fadvise64_64 sys_fadvise64_64 compat_sys_x86_fadvise64_64
+252 i386 exit_group sys_exit_group __sys_ia32_exit_group
+253 i386 lookup_dcookie sys_lookup_dcookie __compat_sys_ia32_lookup_dcookie
+254 i386 epoll_create sys_epoll_create __sys_ia32_epoll_create
+255 i386 epoll_ctl sys_epoll_ctl __sys_ia32_epoll_ctl
+256 i386 epoll_wait sys_epoll_wait __sys_ia32_epoll_wait
+257 i386 remap_file_pages sys_remap_file_pages __sys_ia32_remap_file_pages
+258 i386 set_tid_address sys_set_tid_address __sys_ia32_set_tid_address
+259 i386 timer_create sys_timer_create __compat_sys_ia32_timer_create
+260 i386 timer_settime sys_timer_settime __compat_sys_ia32_timer_settime
+261 i386 timer_gettime sys_timer_gettime __compat_sys_ia32_timer_gettime
+262 i386 timer_getoverrun sys_timer_getoverrun __sys_ia32_timer_getoverrun
+263 i386 timer_delete sys_timer_delete __sys_ia32_timer_delete
+264 i386 clock_settime sys_clock_settime __compat_sys_ia32_clock_settime
+265 i386 clock_gettime sys_clock_gettime __compat_sys_ia32_clock_gettime
+266 i386 clock_getres sys_clock_getres __compat_sys_ia32_clock_getres
+267 i386 clock_nanosleep sys_clock_nanosleep __compat_sys_ia32_clock_nanosleep
+268 i386 statfs64 sys_statfs64 __compat_sys_ia32_statfs64
+269 i386 fstatfs64 sys_fstatfs64 __compat_sys_ia32_fstatfs64
+270 i386 tgkill sys_tgkill __sys_ia32_tgkill
+271 i386 utimes sys_utimes __compat_sys_ia32_utimes
+272 i386 fadvise64_64 sys_fadvise64_64 __compat_sys_ia32_x86_fadvise64_64
273 i386 vserver
-274 i386 mbind sys_mbind
-275 i386 get_mempolicy sys_get_mempolicy compat_sys_get_mempolicy
-276 i386 set_mempolicy sys_set_mempolicy
-277 i386 mq_open sys_mq_open compat_sys_mq_open
-278 i386 mq_unlink sys_mq_unlink
-279 i386 mq_timedsend sys_mq_timedsend compat_sys_mq_timedsend
-280 i386 mq_timedreceive sys_mq_timedreceive compat_sys_mq_timedreceive
-281 i386 mq_notify sys_mq_notify compat_sys_mq_notify
-282 i386 mq_getsetattr sys_mq_getsetattr compat_sys_mq_getsetattr
-283 i386 kexec_load sys_kexec_load compat_sys_kexec_load
-284 i386 waitid sys_waitid compat_sys_waitid
+274 i386 mbind sys_mbind __sys_ia32_mbind
+275 i386 get_mempolicy sys_get_mempolicy __compat_sys_ia32_get_mempolicy
+276 i386 set_mempolicy sys_set_mempolicy __sys_ia32_set_mempolicy
+277 i386 mq_open sys_mq_open __compat_sys_ia32_mq_open
+278 i386 mq_unlink sys_mq_unlink __sys_ia32_mq_unlink
+279 i386 mq_timedsend sys_mq_timedsend __compat_sys_ia32_mq_timedsend
+280 i386 mq_timedreceive sys_mq_timedreceive __compat_sys_ia32_mq_timedreceive
+281 i386 mq_notify sys_mq_notify __compat_sys_ia32_mq_notify
+282 i386 mq_getsetattr sys_mq_getsetattr __compat_sys_ia32_mq_getsetattr
+283 i386 kexec_load sys_kexec_load __compat_sys_ia32_kexec_load
+284 i386 waitid sys_waitid __compat_sys_ia32_waitid
# 285 sys_setaltroot
-286 i386 add_key sys_add_key
-287 i386 request_key sys_request_key
-288 i386 keyctl sys_keyctl compat_sys_keyctl
-289 i386 ioprio_set sys_ioprio_set
-290 i386 ioprio_get sys_ioprio_get
+286 i386 add_key sys_add_key __sys_ia32_add_key
+287 i386 request_key sys_request_key __sys_ia32_request_key
+288 i386 keyctl sys_keyctl __compat_sys_ia32_keyctl
+289 i386 ioprio_set sys_ioprio_set __sys_ia32_ioprio_set
+290 i386 ioprio_get sys_ioprio_get __sys_ia32_ioprio_get
291 i386 inotify_init sys_inotify_init
-292 i386 inotify_add_watch sys_inotify_add_watch
-293 i386 inotify_rm_watch sys_inotify_rm_watch
-294 i386 migrate_pages sys_migrate_pages
-295 i386 openat sys_openat compat_sys_openat
-296 i386 mkdirat sys_mkdirat
-297 i386 mknodat sys_mknodat
-298 i386 fchownat sys_fchownat
-299 i386 futimesat sys_futimesat compat_sys_futimesat
-300 i386 fstatat64 sys_fstatat64 compat_sys_x86_fstatat
-301 i386 unlinkat sys_unlinkat
-302 i386 renameat sys_renameat
-303 i386 linkat sys_linkat
-304 i386 symlinkat sys_symlinkat
-305 i386 readlinkat sys_readlinkat
-306 i386 fchmodat sys_fchmodat
-307 i386 faccessat sys_faccessat
-308 i386 pselect6 sys_pselect6 compat_sys_pselect6
-309 i386 ppoll sys_ppoll compat_sys_ppoll
-310 i386 unshare sys_unshare
-311 i386 set_robust_list sys_set_robust_list compat_sys_set_robust_list
-312 i386 get_robust_list sys_get_robust_list compat_sys_get_robust_list
-313 i386 splice sys_splice
-314 i386 sync_file_range sys_sync_file_range compat_sys_x86_sync_file_range
-315 i386 tee sys_tee
-316 i386 vmsplice sys_vmsplice compat_sys_vmsplice
-317 i386 move_pages sys_move_pages compat_sys_move_pages
-318 i386 getcpu sys_getcpu
-319 i386 epoll_pwait sys_epoll_pwait
-320 i386 utimensat sys_utimensat compat_sys_utimensat
-321 i386 signalfd sys_signalfd compat_sys_signalfd
-322 i386 timerfd_create sys_timerfd_create
-323 i386 eventfd sys_eventfd
-324 i386 fallocate sys_fallocate compat_sys_x86_fallocate
-325 i386 timerfd_settime sys_timerfd_settime compat_sys_timerfd_settime
-326 i386 timerfd_gettime sys_timerfd_gettime compat_sys_timerfd_gettime
-327 i386 signalfd4 sys_signalfd4 compat_sys_signalfd4
-328 i386 eventfd2 sys_eventfd2
-329 i386 epoll_create1 sys_epoll_create1
-330 i386 dup3 sys_dup3
-331 i386 pipe2 sys_pipe2
-332 i386 inotify_init1 sys_inotify_init1
-333 i386 preadv sys_preadv compat_sys_preadv
-334 i386 pwritev sys_pwritev compat_sys_pwritev
-335 i386 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
-336 i386 perf_event_open sys_perf_event_open
-337 i386 recvmmsg sys_recvmmsg compat_sys_recvmmsg
-338 i386 fanotify_init sys_fanotify_init
-339 i386 fanotify_mark sys_fanotify_mark compat_sys_fanotify_mark
-340 i386 prlimit64 sys_prlimit64
-341 i386 name_to_handle_at sys_name_to_handle_at
-342 i386 open_by_handle_at sys_open_by_handle_at compat_sys_open_by_handle_at
-343 i386 clock_adjtime sys_clock_adjtime compat_sys_clock_adjtime
-344 i386 syncfs sys_syncfs
-345 i386 sendmmsg sys_sendmmsg compat_sys_sendmmsg
-346 i386 setns sys_setns
-347 i386 process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-348 i386 process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
-349 i386 kcmp sys_kcmp
-350 i386 finit_module sys_finit_module
-351 i386 sched_setattr sys_sched_setattr
-352 i386 sched_getattr sys_sched_getattr
-353 i386 renameat2 sys_renameat2
-354 i386 seccomp sys_seccomp
-355 i386 getrandom sys_getrandom
-356 i386 memfd_create sys_memfd_create
-357 i386 bpf sys_bpf
-358 i386 execveat sys_execveat compat_sys_execveat
-359 i386 socket sys_socket
-360 i386 socketpair sys_socketpair
-361 i386 bind sys_bind
-362 i386 connect sys_connect
-363 i386 listen sys_listen
-364 i386 accept4 sys_accept4
-365 i386 getsockopt sys_getsockopt compat_sys_getsockopt
-366 i386 setsockopt sys_setsockopt compat_sys_setsockopt
-367 i386 getsockname sys_getsockname
-368 i386 getpeername sys_getpeername
-369 i386 sendto sys_sendto
-370 i386 sendmsg sys_sendmsg compat_sys_sendmsg
-371 i386 recvfrom sys_recvfrom compat_sys_recvfrom
-372 i386 recvmsg sys_recvmsg compat_sys_recvmsg
-373 i386 shutdown sys_shutdown
-374 i386 userfaultfd sys_userfaultfd
-375 i386 membarrier sys_membarrier
-376 i386 mlock2 sys_mlock2
-377 i386 copy_file_range sys_copy_file_range
-378 i386 preadv2 sys_preadv2 compat_sys_preadv2
-379 i386 pwritev2 sys_pwritev2 compat_sys_pwritev2
-380 i386 pkey_mprotect sys_pkey_mprotect
-381 i386 pkey_alloc sys_pkey_alloc
-382 i386 pkey_free sys_pkey_free
-383 i386 statx sys_statx
-384 i386 arch_prctl sys_arch_prctl compat_sys_arch_prctl
+292 i386 inotify_add_watch sys_inotify_add_watch __sys_ia32_inotify_add_watch
+293 i386 inotify_rm_watch sys_inotify_rm_watch __sys_ia32_inotify_rm_watch
+294 i386 migrate_pages sys_migrate_pages __sys_ia32_migrate_pages
+295 i386 openat sys_openat __compat_sys_ia32_openat
+296 i386 mkdirat sys_mkdirat __sys_ia32_mkdirat
+297 i386 mknodat sys_mknodat __sys_ia32_mknodat
+298 i386 fchownat sys_fchownat __sys_ia32_fchownat
+299 i386 futimesat sys_futimesat __compat_sys_ia32_futimesat
+300 i386 fstatat64 sys_fstatat64 __compat_sys_ia32_x86_fstatat
+301 i386 unlinkat sys_unlinkat __sys_ia32_unlinkat
+302 i386 renameat sys_renameat __sys_ia32_renameat
+303 i386 linkat sys_linkat __sys_ia32_linkat
+304 i386 symlinkat sys_symlinkat __sys_ia32_symlinkat
+305 i386 readlinkat sys_readlinkat __sys_ia32_readlinkat
+306 i386 fchmodat sys_fchmodat __sys_ia32_fchmodat
+307 i386 faccessat sys_faccessat __sys_ia32_faccessat
+308 i386 pselect6 sys_pselect6 __compat_sys_ia32_pselect6
+309 i386 ppoll sys_ppoll __compat_sys_ia32_ppoll
+310 i386 unshare sys_unshare __sys_ia32_unshare
+311 i386 set_robust_list sys_set_robust_list __compat_sys_ia32_set_robust_list
+312 i386 get_robust_list sys_get_robust_list __compat_sys_ia32_get_robust_list
+313 i386 splice sys_splice __sys_ia32_splice
+314 i386 sync_file_range sys_sync_file_range __compat_sys_ia32_x86_sync_file_range
+315 i386 tee sys_tee __sys_ia32_tee
+316 i386 vmsplice sys_vmsplice __compat_sys_ia32_vmsplice
+317 i386 move_pages sys_move_pages __compat_sys_ia32_move_pages
+318 i386 getcpu sys_getcpu __sys_ia32_getcpu
+319 i386 epoll_pwait sys_epoll_pwait __sys_ia32_epoll_pwait
+320 i386 utimensat sys_utimensat __compat_sys_ia32_utimensat
+321 i386 signalfd sys_signalfd __compat_sys_ia32_signalfd
+322 i386 timerfd_create sys_timerfd_create __sys_ia32_timerfd_create
+323 i386 eventfd sys_eventfd __sys_ia32_eventfd
+324 i386 fallocate sys_fallocate __compat_sys_ia32_x86_fallocate
+325 i386 timerfd_settime sys_timerfd_settime __compat_sys_ia32_timerfd_settime
+326 i386 timerfd_gettime sys_timerfd_gettime __compat_sys_ia32_timerfd_gettime
+327 i386 signalfd4 sys_signalfd4 __compat_sys_ia32_signalfd4
+328 i386 eventfd2 sys_eventfd2 __sys_ia32_eventfd2
+329 i386 epoll_create1 sys_epoll_create1 __sys_ia32_epoll_create1
+330 i386 dup3 sys_dup3 __sys_ia32_dup3
+331 i386 pipe2 sys_pipe2 __sys_ia32_pipe2
+332 i386 inotify_init1 sys_inotify_init1 __sys_ia32_inotify_init1
+333 i386 preadv sys_preadv __compat_sys_ia32_preadv
+334 i386 pwritev sys_pwritev __compat_sys_ia32_pwritev
+335 i386 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo __compat_sys_ia32_rt_tgsigqueueinfo
+336 i386 perf_event_open sys_perf_event_open __sys_ia32_perf_event_open
+337 i386 recvmmsg sys_recvmmsg __compat_sys_ia32_recvmmsg
+338 i386 fanotify_init sys_fanotify_init __sys_ia32_fanotify_init
+339 i386 fanotify_mark sys_fanotify_mark __compat_sys_ia32_fanotify_mark
+340 i386 prlimit64 sys_prlimit64 __sys_ia32_prlimit64
+341 i386 name_to_handle_at sys_name_to_handle_at __sys_ia32_name_to_handle_at
+342 i386 open_by_handle_at sys_open_by_handle_at __compat_sys_ia32_open_by_handle_at
+343 i386 clock_adjtime sys_clock_adjtime __compat_sys_ia32_clock_adjtime
+344 i386 syncfs sys_syncfs __sys_ia32_syncfs
+345 i386 sendmmsg sys_sendmmsg __compat_sys_ia32_sendmmsg
+346 i386 setns sys_setns __sys_ia32_setns
+347 i386 process_vm_readv sys_process_vm_readv __compat_sys_ia32_process_vm_readv
+348 i386 process_vm_writev sys_process_vm_writev __compat_sys_ia32_process_vm_writev
+349 i386 kcmp sys_kcmp __sys_ia32_kcmp
+350 i386 finit_module sys_finit_module __sys_ia32_finit_module
+351 i386 sched_setattr sys_sched_setattr __sys_ia32_sched_setattr
+352 i386 sched_getattr sys_sched_getattr __sys_ia32_sched_getattr
+353 i386 renameat2 sys_renameat2 __sys_ia32_renameat2
+354 i386 seccomp sys_seccomp __sys_ia32_seccomp
+355 i386 getrandom sys_getrandom __sys_ia32_getrandom
+356 i386 memfd_create sys_memfd_create __sys_ia32_memfd_create
+357 i386 bpf sys_bpf __sys_ia32_bpf
+358 i386 execveat sys_execveat __compat_sys_ia32_execveat
+359 i386 socket sys_socket __sys_ia32_socket
+360 i386 socketpair sys_socketpair __sys_ia32_socketpair
+361 i386 bind sys_bind __sys_ia32_bind
+362 i386 connect sys_connect __sys_ia32_connect
+363 i386 listen sys_listen __sys_ia32_listen
+364 i386 accept4 sys_accept4 __sys_ia32_accept4
+365 i386 getsockopt sys_getsockopt __compat_sys_ia32_getsockopt
+366 i386 setsockopt sys_setsockopt __compat_sys_ia32_setsockopt
+367 i386 getsockname sys_getsockname __sys_ia32_getsockname
+368 i386 getpeername sys_getpeername __sys_ia32_getpeername
+369 i386 sendto sys_sendto __sys_ia32_sendto
+370 i386 sendmsg sys_sendmsg __compat_sys_ia32_sendmsg
+371 i386 recvfrom sys_recvfrom __compat_sys_ia32_recvfrom
+372 i386 recvmsg sys_recvmsg __compat_sys_ia32_recvmsg
+373 i386 shutdown sys_shutdown __sys_ia32_shutdown
+374 i386 userfaultfd sys_userfaultfd __sys_ia32_userfaultfd
+375 i386 membarrier sys_membarrier __sys_ia32_membarrier
+376 i386 mlock2 sys_mlock2 __sys_ia32_mlock2
+377 i386 copy_file_range sys_copy_file_range __sys_ia32_copy_file_range
+378 i386 preadv2 sys_preadv2 __compat_sys_ia32_preadv2
+379 i386 pwritev2 sys_pwritev2 __compat_sys_ia32_pwritev2
+380 i386 pkey_mprotect sys_pkey_mprotect __sys_ia32_pkey_mprotect
+381 i386 pkey_alloc sys_pkey_alloc __sys_ia32_pkey_alloc
+382 i386 pkey_free sys_pkey_free __sys_ia32_pkey_free
+383 i386 statx sys_statx __sys_ia32_statx
+384 i386 arch_prctl sys_arch_prctl __compat_sys_ia32_arch_prctl
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 5aef183e2f85..a83c0f7f462f 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -342,41 +342,43 @@
#
# x32-specific system call numbers start at 512 to avoid cache impact
-# for native 64-bit operation.
+# for native 64-bit operation. The __compat_sys_x32 stubs are created
+# on-the-fly for compat_sys_*() compatibility system calls if X86_X32
+# is defined.
#
-512 x32 rt_sigaction compat_sys_rt_sigaction
+512 x32 rt_sigaction __compat_sys_x32_rt_sigaction
513 x32 rt_sigreturn sys32_x32_rt_sigreturn
-514 x32 ioctl compat_sys_ioctl
-515 x32 readv compat_sys_readv
-516 x32 writev compat_sys_writev
-517 x32 recvfrom compat_sys_recvfrom
-518 x32 sendmsg compat_sys_sendmsg
-519 x32 recvmsg compat_sys_recvmsg
-520 x32 execve compat_sys_execve/ptregs
-521 x32 ptrace compat_sys_ptrace
-522 x32 rt_sigpending compat_sys_rt_sigpending
-523 x32 rt_sigtimedwait compat_sys_rt_sigtimedwait
-524 x32 rt_sigqueueinfo compat_sys_rt_sigqueueinfo
-525 x32 sigaltstack compat_sys_sigaltstack
-526 x32 timer_create compat_sys_timer_create
-527 x32 mq_notify compat_sys_mq_notify
-528 x32 kexec_load compat_sys_kexec_load
-529 x32 waitid compat_sys_waitid
-530 x32 set_robust_list compat_sys_set_robust_list
-531 x32 get_robust_list compat_sys_get_robust_list
-532 x32 vmsplice compat_sys_vmsplice
-533 x32 move_pages compat_sys_move_pages
-534 x32 preadv compat_sys_preadv64
-535 x32 pwritev compat_sys_pwritev64
-536 x32 rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
-537 x32 recvmmsg compat_sys_recvmmsg
-538 x32 sendmmsg compat_sys_sendmmsg
-539 x32 process_vm_readv compat_sys_process_vm_readv
-540 x32 process_vm_writev compat_sys_process_vm_writev
-541 x32 setsockopt compat_sys_setsockopt
-542 x32 getsockopt compat_sys_getsockopt
-543 x32 io_setup compat_sys_io_setup
-544 x32 io_submit compat_sys_io_submit
-545 x32 execveat compat_sys_execveat/ptregs
-546 x32 preadv2 compat_sys_preadv64v2
-547 x32 pwritev2 compat_sys_pwritev64v2
+514 x32 ioctl __compat_sys_x32_ioctl
+515 x32 readv __compat_sys_x32_readv
+516 x32 writev __compat_sys_x32_writev
+517 x32 recvfrom __compat_sys_x32_recvfrom
+518 x32 sendmsg __compat_sys_x32_sendmsg
+519 x32 recvmsg __compat_sys_x32_recvmsg
+520 x32 execve __compat_sys_x32_execve/ptregs
+521 x32 ptrace __compat_sys_x32_ptrace
+522 x32 rt_sigpending __compat_sys_x32_rt_sigpending
+523 x32 rt_sigtimedwait __compat_sys_x32_rt_sigtimedwait
+524 x32 rt_sigqueueinfo __compat_sys_x32_rt_sigqueueinfo
+525 x32 sigaltstack __compat_sys_x32_sigaltstack
+526 x32 timer_create __compat_sys_x32_timer_create
+527 x32 mq_notify __compat_sys_x32_mq_notify
+528 x32 kexec_load __compat_sys_x32_kexec_load
+529 x32 waitid __compat_sys_x32_waitid
+530 x32 set_robust_list __compat_sys_x32_set_robust_list
+531 x32 get_robust_list __compat_sys_x32_get_robust_list
+532 x32 vmsplice __compat_sys_x32_vmsplice
+533 x32 move_pages __compat_sys_x32_move_pages
+534 x32 preadv __compat_sys_x32_preadv64
+535 x32 pwritev __compat_sys_x32_pwritev64
+536 x32 rt_tgsigqueueinfo __compat_sys_x32_rt_tgsigqueueinfo
+537 x32 recvmmsg __compat_sys_x32_recvmmsg
+538 x32 sendmmsg __compat_sys_x32_sendmmsg
+539 x32 process_vm_readv __compat_sys_x32_process_vm_readv
+540 x32 process_vm_writev __compat_sys_x32_process_vm_writev
+541 x32 setsockopt __compat_sys_x32_setsockopt
+542 x32 getsockopt __compat_sys_x32_getsockopt
+543 x32 io_setup __compat_sys_x32_io_setup
+544 x32 io_submit __compat_sys_x32_io_submit
+545 x32 execveat __compat_sys_x32_execveat/ptregs
+546 x32 preadv2 __compat_sys_x32_preadv64v2
+547 x32 pwritev2 __compat_sys_x32_pwritev64v2
diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
index 702bdee377af..49d7e4970110 100644
--- a/arch/x86/include/asm/syscall_wrapper.h
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -6,6 +6,111 @@
#ifndef _ASM_X86_SYSCALL_WRAPPER_H
#define _ASM_X86_SYSCALL_WRAPPER_H
+/* Mapping of registers to parameters for syscalls on x86-64 and x32 */
+#define SC_X86_64_REGS_TO_ARGS(x, ...) \
+ __MAP(x,__SC_ARGS \
+ ,,regs->di,,regs->si,,regs->dx \
+ ,,regs->r10,,regs->r8,,regs->r9) \
+
+/* Mapping of registers to parameters for syscalls on i386 */
+#define SC_IA32_REGS_TO_ARGS(x, ...) \
+ __MAP(x,__SC_ARGS \
+ ,,(unsigned int)regs->bx,,(unsigned int)regs->cx \
+ ,,(unsigned int)regs->dx,,(unsigned int)regs->si \
+ ,,(unsigned int)regs->di,,(unsigned int)regs->bp)
+
+#ifdef CONFIG_IA32_EMULATION
+/*
+ * For IA32 emulation, we need to handle "compat" syscalls *and* create
+ * additional wrappers (aptly named __sys_ia32_xyzzy) which decode the
+ * ia32 regs in the proper order for shared or "common" syscalls. As some
+ * syscalls may not be implemented, we need to expand COND_SYSCALL in
+ * kernel/sys_ni.c and SYS_NI in kernel/time/posix-stubs.c to cover this
+ * case as well.
+ */
+#define COMPAT_SC_IA32_STUBx(x, name, ...) \
+ asmlinkage long __compat_sys_ia32##name(const struct pt_regs *regs);\
+ ALLOW_ERROR_INJECTION(__compat_sys_ia32##name, ERRNO); \
+ asmlinkage long __compat_sys_ia32##name(const struct pt_regs *regs)\
+ { \
+ return c_SyS##name(SC_IA32_REGS_TO_ARGS(x,__VA_ARGS__));\
+ } \
+
+#define SC_IA32_WRAPPERx(x, name, ...) \
+ asmlinkage long __sys_ia32##name(const struct pt_regs *regs); \
+ ALLOW_ERROR_INJECTION(__sys_ia32##name, ERRNO); \
+ asmlinkage long __sys_ia32##name(const struct pt_regs *regs) \
+ { \
+ return SyS##name(SC_IA32_REGS_TO_ARGS(x,__VA_ARGS__)); \
+ }
+
+#define COND_SYSCALL(name) \
+ cond_syscall(sys_##name); \
+ cond_syscall(__sys_ia32_##name)
+
+#define SYS_NI(name) \
+ SYSCALL_ALIAS(sys_##name, sys_ni_posix_timers); \
+ SYSCALL_ALIAS(__sys_ia32_##name, sys_ni_posix_timers)
+
+#else /* CONFIG_IA32_EMULATION */
+#define COMPAT_SC_IA32_STUBx(x, name, ...)
+#define SC_IA32_WRAPPERx(x, name, ...)
+#endif /* CONFIG_IA32_EMULATION */
+
+
+#ifdef CONFIG_X86_X32
+/*
+ * For the x32 ABI, we need to create a stub for compat_sys_*() which is aware
+ * of the x86-64-style parameter ordering of x32 syscalls. The syscalls common
+ * with x86_64 obviously do not need such care.
+ */
+#define COMPAT_SC_X32_STUBx(x, name, ...) \
+ asmlinkage long __compat_sys_x32##name(const struct pt_regs *regs);\
+ ALLOW_ERROR_INJECTION(__compat_sys_x32##name, ERRNO); \
+ asmlinkage long __compat_sys_x32##name(const struct pt_regs *regs)\
+ { \
+ return c_SyS##name(SC_X86_64_REGS_TO_ARGS(x,__VA_ARGS__));\
+ } \
+
+#else /* CONFIG_X86_X32 */
+#define COMPAT_SC_X32_STUBx(x, name, ...)
+#endif /* CONFIG_X86_X32 */
+
+
+#ifdef CONFIG_COMPAT
+/*
+ * Compat means IA32_EMULATION and/or X86_X32. As they use a different
+ * mapping of registers to parameters, we need to generate stubs for each
+ * of them. There is no need to implement COMPAT_SYSCALL_DEFINE0, as it is
+ * unused on x86.
+ */
+#define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
+ static long c_SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
+ static inline long C_SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
+ COMPAT_SC_IA32_STUBx(x, name, __VA_ARGS__) \
+ COMPAT_SC_X32_STUBx(x, name, __VA_ARGS__) \
+ static long c_SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
+ { \
+ return C_SYSC##name(__MAP(x,__SC_DELOUSE,__VA_ARGS__)); \
+ } \
+ static inline long C_SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+
+/*
+ * As some compat syscalls may not be implemented, we need to expand
+ * COND_SYSCALL_COMPAT in kernel/sys_ni.c and COMPAT_SYS_NI in
+ * kernel/time/posix-stubs.c to cover this case as well.
+ */
+#define COND_SYSCALL_COMPAT(name) \
+ cond_syscall(__compat_sys_ia32_##name); \
+ cond_syscall(__compat_sys_x32_##name)
+
+#define COMPAT_SYS_NI(name) \
+ SYSCALL_ALIAS(__compat_sys_ia32_##name, sys_ni_posix_timers); \
+ SYSCALL_ALIAS(__compat_sys_x32_##name, sys_ni_posix_timers)
+
+#endif /* CONFIG_COMPAT */
+
+
/*
* Instead of the generic __SYSCALL_DEFINEx() definition, this macro takes
* struct pt_regs *regs as the only argument of the syscall stub named
@@ -34,9 +139,14 @@
* This approach avoids leaking random user-provided register content down
* the call chain.
*
+ * If IA32_EMULATION is enabled, this macro generates an additional wrapper
+ * named __sys_ia32_*() which decodes the struct pt_regs *regs according
+ * to the i386 calling convention (bx, cx, dx, si, di, bp).
+ *
* As the generic SYSCALL_DEFINE0() macro does not decode any parameters for
* obvious reasons, and passing struct pt_regs *regs to it in %rdi does not
- * hurt, there is no need to override it.
+ * hurt, there is no need to override it, or to define it differently for
+ * IA32_EMULATION.
*/
#define __SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long sys##name(const struct pt_regs *regs); \
@@ -45,10 +155,9 @@
static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
asmlinkage long sys##name(const struct pt_regs *regs) \
{ \
- return SyS##name(__MAP(x,__SC_ARGS \
- ,,regs->di,,regs->si,,regs->dx \
- ,,regs->r10,,regs->r8,,regs->r9)); \
+ return SyS##name(SC_X86_64_REGS_TO_ARGS(x,__VA_ARGS__));\
} \
+ SC_IA32_WRAPPERx(x, name, __VA_ARGS__) \
static long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
{ \
long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
^ permalink raw reply related [flat|nested] 27+ messages in thread
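Illustration only, not part of the patch: the register decoding done by
SC_X86_64_REGS_TO_ARGS() and SC_IA32_REGS_TO_ARGS() in the patch above can
be modelled in plain user-space C. This is a minimal sketch; struct abi_regs,
abi_sum3() and both stub names are hypothetical stand-ins invented for this
example, and only three of the six argument registers are used.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs: argument registers only. */
struct abi_regs {
	uint64_t di, si, dx, r10, r8, r9;	/* x86-64 and x32 argument order */
	uint64_t bx, cx, bp;			/* additionally used by i386 */
};

/* The syscall body proper; it only ever sees decoded arguments. */
static long abi_sum3(long a, long b, long c)
{
	return a + b + c;
}

/* x86-64 style stub: arguments 1..3 are taken from di, si, dx. */
static long abi_x86_64_stub(const struct abi_regs *regs)
{
	return abi_sum3((long)regs->di, (long)regs->si, (long)regs->dx);
}

/*
 * i386 style stub: arguments 1..3 are taken from bx, cx, dx, each
 * truncated to 32 bits exactly as the (unsigned int) casts in
 * SC_IA32_REGS_TO_ARGS() do.
 */
static long abi_ia32_stub(const struct abi_regs *regs)
{
	return abi_sum3((unsigned int)regs->bx, (unsigned int)regs->cx,
			(unsigned int)regs->dx);
}
```

Both stubs share one signature and one body; only the choice of registers
differs. The i386 variant drops the upper 32 bits of each register, so stale
high bits left in a 64-bit register never reach the syscall body.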
* [PATCH 6/8] syscalls/x86: unconditionally enable struct pt_regs based syscalls on x86_64
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
` (4 preceding siblings ...)
2018-04-05 9:53 ` [PATCH 5/8] syscalls/x86: use struct pt_regs based syscall calling for IA32_EMULATION and x32 Dominik Brodowski
@ 2018-04-05 9:53 ` Dominik Brodowski
2018-04-06 17:12 ` [tip:x86/asm] syscalls/x86: Unconditionally enable 'struct pt_regs' " tip-bot for Dominik Brodowski
2018-04-05 9:53 ` [PATCH 7/8] x86/entry/64: extend register clearing on syscall entry to lower registers Dominik Brodowski
` (2 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 9:53 UTC (permalink / raw)
To: linux-kernel, mingo
Cc: Thomas Gleixner, Andi Kleen, Ingo Molnar, Andrew Morton, Al Viro,
Andy Lutomirski, Denys Vlasenko, Brian Gerst, Peter Zijlstra,
Linus Torvalds, H. Peter Anvin, x86
Removing CONFIG_SYSCALL_PTREGS from arch/x86/Kconfig and simply selecting
ARCH_HAS_SYSCALL_WRAPPER unconditionally on x86-64 allows us to simplify
several codepaths.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
arch/x86/Kconfig | 6 +-----
arch/x86/entry/common.c | 10 ++--------
arch/x86/entry/syscall_32.c | 6 +++---
arch/x86/entry/syscall_64.c | 5 -----
arch/x86/entry/vsyscall/vsyscall_64.c | 18 ------------------
arch/x86/include/asm/syscall.h | 4 ++--
arch/x86/include/asm/syscalls.h | 20 ++++----------------
7 files changed, 12 insertions(+), 57 deletions(-)
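Illustration only, not part of the patch: once every table entry takes a
struct pt_regs pointer, the dispatcher needs a single call form and the
#ifdef in do_syscall_64() can go away. The sketch below models that in
user space. All names (tbl_regs, tbl_sys_add, tbl_sys_neg, tbl_call_table,
tbl_do_syscall) are hypothetical. Setting ax to -ENOSYS inside the
dispatcher is a simplification for this demo, and array_index_nospec() is
left out.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs. */
struct tbl_regs {
	uint64_t ax, di, si, dx;
};

/* Every table entry has the same signature, whatever its argument count. */
typedef long (*tbl_sys_fn)(const struct tbl_regs *regs);

/* Two-argument demo syscall: decodes di and si itself. */
static long tbl_sys_add(const struct tbl_regs *regs)
{
	return (long)(regs->di + regs->si);
}

/* One-argument demo syscall: decodes di only, ignores the rest. */
static long tbl_sys_neg(const struct tbl_regs *regs)
{
	return -(long)regs->di;
}

static const tbl_sys_fn tbl_call_table[] = { tbl_sys_add, tbl_sys_neg };

#define TBL_NR_SYSCALLS (sizeof(tbl_call_table) / sizeof(tbl_call_table[0]))

/* One unconditional call form; the result is stored back into ax. */
static void tbl_do_syscall(unsigned long nr, struct tbl_regs *regs)
{
	regs->ax = (uint64_t)-ENOSYS;	/* demo default for unknown numbers */
	if (nr < TBL_NR_SYSCALLS)
		regs->ax = (uint64_t)tbl_call_table[nr](regs);
}
```

The dispatcher never looks at the argument registers; each entry reads only
what it needs, which is what lets the six-argument call form be dropped.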
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7bbd6a174722..bcdd3e0e2ef5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -29,6 +29,7 @@ config X86_64
select HAVE_ARCH_SOFT_DIRTY
select MODULES_USE_ELF_RELA
select X86_DEV_DMA_OPS
+ select ARCH_HAS_SYSCALL_WRAPPER
#
# Arch settings
@@ -2954,8 +2955,3 @@ source "crypto/Kconfig"
source "arch/x86/kvm/Kconfig"
source "lib/Kconfig"
-
-config SYSCALL_PTREGS
- def_bool y
- depends on X86_64
- select ARCH_HAS_SYSCALL_WRAPPER
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 425f798b39e3..fbf6a6c3fd2d 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -284,13 +284,7 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
nr &= __SYSCALL_MASK;
if (likely(nr < NR_syscalls)) {
nr = array_index_nospec(nr, NR_syscalls);
-#ifdef CONFIG_SYSCALL_PTREGS
regs->ax = sys_call_table[nr](regs);
-#else
- regs->ax = sys_call_table[nr](
- regs->di, regs->si, regs->dx,
- regs->r10, regs->r8, regs->r9);
-#endif
}
syscall_return_slowpath(regs);
@@ -325,7 +319,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
if (likely(nr < IA32_NR_syscalls)) {
nr = array_index_nospec(nr, IA32_NR_syscalls);
-#ifdef CONFIG_SYSCALL_PTREGS
+#ifdef CONFIG_IA32_EMULATION
regs->ax = ia32_sys_call_table[nr](regs);
#else
/*
@@ -338,7 +332,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
(unsigned int)regs->bx, (unsigned int)regs->cx,
(unsigned int)regs->dx, (unsigned int)regs->si,
(unsigned int)regs->di, (unsigned int)regs->bp);
-#endif /* CONFIG_SYSCALL_PTREGS */
+#endif /* CONFIG_IA32_EMULATION */
}
syscall_return_slowpath(regs);
diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
index 47060dd8efb1..aa3336a7cb15 100644
--- a/arch/x86/entry/syscall_32.c
+++ b/arch/x86/entry/syscall_32.c
@@ -7,17 +7,17 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
-#ifdef CONFIG_SYSCALL_PTREGS
+#ifdef CONFIG_IA32_EMULATION
/* On X86_64, we use struct pt_regs * to pass parameters to syscalls */
#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *);
/* this is a lie, but it does not hurt as sys_ni_syscall just returns -EINVAL */
extern asmlinkage long sys_ni_syscall(const struct pt_regs *);
-#else /* CONFIG_SYSCALL_PTREGS */
+#else /* CONFIG_IA32_EMULATION */
#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-#endif /* CONFIG_SYSCALL_PTREGS */
+#endif /* CONFIG_IA32_EMULATION */
#include <asm/syscalls_32.h>
#undef __SYSCALL_I386
diff --git a/arch/x86/entry/syscall_64.c b/arch/x86/entry/syscall_64.c
index 6197850adf91..d5252bc1e380 100644
--- a/arch/x86/entry/syscall_64.c
+++ b/arch/x86/entry/syscall_64.c
@@ -7,14 +7,9 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
-#ifdef CONFIG_SYSCALL_PTREGS
/* this is a lie, but it does not hurt as sys_ni_syscall just returns -EINVAL */
extern asmlinkage long sys_ni_syscall(const struct pt_regs *);
#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *);
-#else /* CONFIG_SYSCALL_PTREGS */
-extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-#endif /* CONFIG_SYSCALL_PTREGS */
#include <asm/syscalls_64.h>
#undef __SYSCALL_64
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 05eebbf9b989..20b3d4a88ee4 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -127,9 +127,7 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
int vsyscall_nr, syscall_nr, tmp;
int prev_sig_on_uaccess_err;
long ret;
-#ifdef CONFIG_SYSCALL_PTREGS
unsigned long orig_dx;
-#endif
/*
* No point in checking CS -- the only way to get here is a user mode
@@ -230,38 +228,22 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
ret = -EFAULT;
switch (vsyscall_nr) {
case 0:
-#ifdef CONFIG_SYSCALL_PTREGS
/* this decodes regs->di and regs->si on its own */
ret = sys_gettimeofday(regs);
-#else
- ret = sys_gettimeofday(
- (struct timeval __user *)regs->di,
- (struct timezone __user *)regs->si);
-#endif /* CONFIG_SYSCALL_PTREGS */
break;
case 1:
-#ifdef CONFIG_SYSCALL_PTREGS
/* this decodes regs->di on its own */
ret = sys_time(regs);
-#else
- ret = sys_time((time_t __user *)regs->di);
-#endif /* CONFIG_SYSCALL_PTREGS */
break;
case 2:
-#ifdef CONFIG_SYSCALL_PTREGS
/* while we could clobber regs->dx, we didn't in the past... */
orig_dx = regs->dx;
regs->dx = 0;
/* this decodes regs->di, regs->si and regs->dx on its own */
ret = sys_getcpu(regs);
regs->dx = orig_dx;
-#else
- ret = sys_getcpu((unsigned __user *)regs->di,
- (unsigned __user *)regs->si,
- NULL);
-#endif /* CONFIG_SYSCALL_PTREGS */
break;
}
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index 17c62373a6f9..d653139857af 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -20,13 +20,13 @@
#include <asm/thread_info.h> /* for TS_COMPAT */
#include <asm/unistd.h>
-#ifdef CONFIG_SYSCALL_PTREGS
+#ifdef CONFIG_X86_64
typedef asmlinkage long (*sys_call_ptr_t)(const struct pt_regs *);
#else
typedef asmlinkage long (*sys_call_ptr_t)(unsigned long, unsigned long,
unsigned long, unsigned long,
unsigned long, unsigned long);
-#endif /* CONFIG_SYSCALL_PTREGS */
+#endif /* CONFIG_X86_64 */
extern const sys_call_ptr_t sys_call_table[];
#if defined(CONFIG_X86_32)
diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index e4ad93c05f02..d4d18d94695c 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -19,10 +19,10 @@
/* kernel/ioport.c */
long ksys_ioperm(unsigned long from, unsigned long num, int turn_on);
-#ifndef CONFIG_SYSCALL_PTREGS
-/*
- * If CONFIG_SYSCALL_PTREGS is enabled, a different syscall calling convention
- * is used. Do not include these -- invalid -- prototypes then
+#ifdef CONFIG_X86_32
+/*
+ * These definitions are only valid on pure 32-bit systems; x86-64 uses a
+ * different syscall calling convention
*/
asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
asmlinkage long sys_iopl(unsigned int);
@@ -38,7 +38,6 @@ asmlinkage long sys_set_thread_area(struct user_desc __user *);
asmlinkage long sys_get_thread_area(struct user_desc __user *);
/* X86_32 only */
-#ifdef CONFIG_X86_32
/* kernel/signal.c */
asmlinkage long sys_sigreturn(void);
@@ -48,16 +47,5 @@ struct vm86_struct;
asmlinkage long sys_vm86old(struct vm86_struct __user *);
asmlinkage long sys_vm86(unsigned long, unsigned long);
-#else /* CONFIG_X86_32 */
-
-/* X86_64 only */
-/* kernel/process_64.c */
-asmlinkage long sys_arch_prctl(int, unsigned long);
-
-/* kernel/sys_x86_64.c */
-asmlinkage long sys_mmap(unsigned long, unsigned long, unsigned long,
- unsigned long, unsigned long, unsigned long);
-
#endif /* CONFIG_X86_32 */
-#endif /* CONFIG_SYSCALL_PTREGS */
#endif /* _ASM_X86_SYSCALLS_H */
--
2.16.3
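
As a rough user-space illustration of the calling convention this patch makes unconditional on x86-64, the sketch below dispatches through a table whose entries all take a single register-block pointer, and lets each stub decode only the registers it needs. Everything prefixed demo_ is hypothetical and is not kernel code; struct demo_regs merely stands in for struct pt_regs, and the speculation barrier (array_index_nospec) is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs: argument registers plus ax. */
struct demo_regs {
	unsigned long di, si, dx, r10, r8, r9;
	long ax;
};

/* Every table entry has the same one-pointer signature. */
typedef long (*demo_call_ptr_t)(const struct demo_regs *);

/* Typed implementation, unaware of the register block. */
static long demo_do_add(unsigned long a, unsigned long b)
{
	return (long)(a + b);
}

/* Stub: decodes regs->di and regs->si on its own, ignores the rest. */
static long demo_sys_add(const struct demo_regs *regs)
{
	return demo_do_add(regs->di, regs->si);
}

/* Placeholder for an unimplemented call. */
static long demo_sys_ni(const struct demo_regs *regs)
{
	(void)regs;
	return -ENOSYS;
}

static const demo_call_ptr_t demo_call_table[] = {
	demo_sys_add,
	demo_sys_ni,
};

#define DEMO_NR_CALLS (sizeof(demo_call_table) / sizeof(demo_call_table[0]))

/* Bounds-check the number, then hand the whole register block to the stub. */
static void demo_dispatch(unsigned long nr, struct demo_regs *regs)
{
	assert(regs != NULL);
	if (nr < DEMO_NR_CALLS)
		regs->ax = demo_call_table[nr](regs);
}
```

A caller fills in di and si, calls demo_dispatch(0, &regs), and reads the result from regs.ax. An out-of-range number leaves ax untouched, which mirrors how do_syscall_64() writes regs->ax only inside its bounds check.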
* [tip:x86/asm] syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64
2018-04-05 9:53 ` [PATCH 6/8] syscalls/x86: unconditionally enable struct pt_regs based syscalls on x86_64 Dominik Brodowski
@ 2018-04-06 17:12 ` tip-bot for Dominik Brodowski
0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Dominik Brodowski @ 2018-04-06 17:12 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, linux, luto, bp, torvalds, tglx, mingo, brgerst,
peterz, viro, akpm, dvlasenk, jpoimboe, hpa
Commit-ID: f8781c4a226319fe60e652118b90cf094ccfe747
Gitweb: https://git.kernel.org/tip/f8781c4a226319fe60e652118b90cf094ccfe747
Author: Dominik Brodowski <linux@dominikbrodowski.net>
AuthorDate: Thu, 5 Apr 2018 11:53:05 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 5 Apr 2018 16:59:38 +0200
syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64
Removing CONFIG_SYSCALL_PTREGS from arch/x86/Kconfig and simply selecting
ARCH_HAS_SYSCALL_WRAPPER unconditionally on x86-64 allows us to simplify
several codepaths.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180405095307.3730-7-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/Kconfig | 6 +-----
arch/x86/entry/common.c | 10 ++--------
arch/x86/entry/syscall_32.c | 6 +++---
arch/x86/entry/syscall_64.c | 5 -----
arch/x86/entry/vsyscall/vsyscall_64.c | 18 ------------------
arch/x86/include/asm/syscall.h | 4 ++--
arch/x86/include/asm/syscalls.h | 20 ++++----------------
7 files changed, 12 insertions(+), 57 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7bbd6a174722..bcdd3e0e2ef5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -29,6 +29,7 @@ config X86_64
select HAVE_ARCH_SOFT_DIRTY
select MODULES_USE_ELF_RELA
select X86_DEV_DMA_OPS
+ select ARCH_HAS_SYSCALL_WRAPPER
#
# Arch settings
@@ -2954,8 +2955,3 @@ source "crypto/Kconfig"
source "arch/x86/kvm/Kconfig"
source "lib/Kconfig"
-
-config SYSCALL_PTREGS
- def_bool y
- depends on X86_64
- select ARCH_HAS_SYSCALL_WRAPPER
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 425f798b39e3..fbf6a6c3fd2d 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -284,13 +284,7 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
nr &= __SYSCALL_MASK;
if (likely(nr < NR_syscalls)) {
nr = array_index_nospec(nr, NR_syscalls);
-#ifdef CONFIG_SYSCALL_PTREGS
regs->ax = sys_call_table[nr](regs);
-#else
- regs->ax = sys_call_table[nr](
- regs->di, regs->si, regs->dx,
- regs->r10, regs->r8, regs->r9);
-#endif
}
syscall_return_slowpath(regs);
@@ -325,7 +319,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
if (likely(nr < IA32_NR_syscalls)) {
nr = array_index_nospec(nr, IA32_NR_syscalls);
-#ifdef CONFIG_SYSCALL_PTREGS
+#ifdef CONFIG_IA32_EMULATION
regs->ax = ia32_sys_call_table[nr](regs);
#else
/*
@@ -338,7 +332,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
(unsigned int)regs->bx, (unsigned int)regs->cx,
(unsigned int)regs->dx, (unsigned int)regs->si,
(unsigned int)regs->di, (unsigned int)regs->bp);
-#endif /* CONFIG_SYSCALL_PTREGS */
+#endif /* CONFIG_IA32_EMULATION */
}
syscall_return_slowpath(regs);
diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
index 47060dd8efb1..aa3336a7cb15 100644
--- a/arch/x86/entry/syscall_32.c
+++ b/arch/x86/entry/syscall_32.c
@@ -7,17 +7,17 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
-#ifdef CONFIG_SYSCALL_PTREGS
+#ifdef CONFIG_IA32_EMULATION
/* On X86_64, we use struct pt_regs * to pass parameters to syscalls */
#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *);
/* this is a lie, but it does not hurt as sys_ni_syscall just returns -EINVAL */
extern asmlinkage long sys_ni_syscall(const struct pt_regs *);
-#else /* CONFIG_SYSCALL_PTREGS */
+#else /* CONFIG_IA32_EMULATION */
#define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-#endif /* CONFIG_SYSCALL_PTREGS */
+#endif /* CONFIG_IA32_EMULATION */
#include <asm/syscalls_32.h>
#undef __SYSCALL_I386
diff --git a/arch/x86/entry/syscall_64.c b/arch/x86/entry/syscall_64.c
index 6197850adf91..d5252bc1e380 100644
--- a/arch/x86/entry/syscall_64.c
+++ b/arch/x86/entry/syscall_64.c
@@ -7,14 +7,9 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
-#ifdef CONFIG_SYSCALL_PTREGS
/* this is a lie, but it does not hurt as sys_ni_syscall just returns -EINVAL */
extern asmlinkage long sys_ni_syscall(const struct pt_regs *);
#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *);
-#else /* CONFIG_SYSCALL_PTREGS */
-extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-#endif /* CONFIG_SYSCALL_PTREGS */
#include <asm/syscalls_64.h>
#undef __SYSCALL_64
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 05eebbf9b989..20b3d4a88ee4 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -127,9 +127,7 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
int vsyscall_nr, syscall_nr, tmp;
int prev_sig_on_uaccess_err;
long ret;
-#ifdef CONFIG_SYSCALL_PTREGS
unsigned long orig_dx;
-#endif
/*
* No point in checking CS -- the only way to get here is a user mode
@@ -230,38 +228,22 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
ret = -EFAULT;
switch (vsyscall_nr) {
case 0:
-#ifdef CONFIG_SYSCALL_PTREGS
/* this decodes regs->di and regs->si on its own */
ret = sys_gettimeofday(regs);
-#else
- ret = sys_gettimeofday(
- (struct timeval __user *)regs->di,
- (struct timezone __user *)regs->si);
-#endif /* CONFIG_SYSCALL_PTREGS */
break;
case 1:
-#ifdef CONFIG_SYSCALL_PTREGS
/* this decodes regs->di on its own */
ret = sys_time(regs);
-#else
- ret = sys_time((time_t __user *)regs->di);
-#endif /* CONFIG_SYSCALL_PTREGS */
break;
case 2:
-#ifdef CONFIG_SYSCALL_PTREGS
/* while we could clobber regs->dx, we didn't in the past... */
orig_dx = regs->dx;
regs->dx = 0;
/* this decodes regs->di, regs->si and regs->dx on its own */
ret = sys_getcpu(regs);
regs->dx = orig_dx;
-#else
- ret = sys_getcpu((unsigned __user *)regs->di,
- (unsigned __user *)regs->si,
- NULL);
-#endif /* CONFIG_SYSCALL_PTREGS */
break;
}
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index 17c62373a6f9..d653139857af 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -20,13 +20,13 @@
#include <asm/thread_info.h> /* for TS_COMPAT */
#include <asm/unistd.h>
-#ifdef CONFIG_SYSCALL_PTREGS
+#ifdef CONFIG_X86_64
typedef asmlinkage long (*sys_call_ptr_t)(const struct pt_regs *);
#else
typedef asmlinkage long (*sys_call_ptr_t)(unsigned long, unsigned long,
unsigned long, unsigned long,
unsigned long, unsigned long);
-#endif /* CONFIG_SYSCALL_PTREGS */
+#endif /* CONFIG_X86_64 */
extern const sys_call_ptr_t sys_call_table[];
#if defined(CONFIG_X86_32)
diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index e4ad93c05f02..d4d18d94695c 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -19,10 +19,10 @@
/* kernel/ioport.c */
long ksys_ioperm(unsigned long from, unsigned long num, int turn_on);
-#ifndef CONFIG_SYSCALL_PTREGS
-/*
- * If CONFIG_SYSCALL_PTREGS is enabled, a different syscall calling convention
- * is used. Do not include these -- invalid -- prototypes then
+#ifdef CONFIG_X86_32
+/*
+ * These definitions are only valid on pure 32-bit systems; x86-64 uses a
+ * different syscall calling convention
*/
asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
asmlinkage long sys_iopl(unsigned int);
@@ -38,7 +38,6 @@ asmlinkage long sys_set_thread_area(struct user_desc __user *);
asmlinkage long sys_get_thread_area(struct user_desc __user *);
/* X86_32 only */
-#ifdef CONFIG_X86_32
/* kernel/signal.c */
asmlinkage long sys_sigreturn(void);
@@ -48,16 +47,5 @@ struct vm86_struct;
asmlinkage long sys_vm86old(struct vm86_struct __user *);
asmlinkage long sys_vm86(unsigned long, unsigned long);
-#else /* CONFIG_X86_32 */
-
-/* X86_64 only */
-/* kernel/process_64.c */
-asmlinkage long sys_arch_prctl(int, unsigned long);
-
-/* kernel/sys_x86_64.c */
-asmlinkage long sys_mmap(unsigned long, unsigned long, unsigned long,
- unsigned long, unsigned long, unsigned long);
-
#endif /* CONFIG_X86_32 */
-#endif /* CONFIG_SYSCALL_PTREGS */
#endif /* _ASM_X86_SYSCALLS_H */
* [PATCH 7/8] x86/entry/64: extend register clearing on syscall entry to lower registers
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
` (5 preceding siblings ...)
2018-04-05 9:53 ` [PATCH 6/8] syscalls/x86: unconditionally enable struct pt_regs based syscalls on x86_64 Dominik Brodowski
@ 2018-04-05 9:53 ` Dominik Brodowski
2018-04-06 17:13 ` [tip:x86/asm] syscalls/x86: Extend " tip-bot for Dominik Brodowski
2018-04-05 9:53 ` [PATCH 8/8] syscalls/x86: rename struct pt_regs-based sys_*() to __sys_x86_*() Dominik Brodowski
2018-04-05 15:19 ` [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Ingo Molnar
8 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 9:53 UTC (permalink / raw)
To: linux-kernel, mingo
Cc: Thomas Gleixner, Andi Kleen, Ingo Molnar, Andrew Morton, Al Viro,
Andy Lutomirski, Denys Vlasenko, Brian Gerst, Peter Zijlstra,
Linus Torvalds, H. Peter Anvin, x86
To reduce the chance that random user space content leaks down the call
chain in registers, also clear lower registers on syscall entry:
For 64-bit syscalls, extend the register clearing in PUSH_AND_CLEAR_REGS
to %dx and %cx. This should be harmless for the other callers of that
macro as well. We do not need to clear %rdi and %rsi for syscall entry,
as those registers are used to pass the parameters to do_syscall_64().
For the 32-bit compat syscalls, do_int80_syscall_32() and
do_fast_syscall_32() each only take one parameter. Therefore, extend the
register clearing to %dx, %cx, and %si in entry_SYSCALL_compat and
entry_INT80_compat.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
arch/x86/entry/calling.h | 2 ++
arch/x86/entry/entry_64_compat.S | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index be63330c5511..352e70cd33e8 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -114,7 +114,9 @@ For 32-bit we have the following conventions - kernel is built with
pushq %rsi /* pt_regs->si */
.endif
pushq \rdx /* pt_regs->dx */
+ xorl %edx, %edx /* nospec dx */
pushq %rcx /* pt_regs->cx */
+ xorl %ecx, %ecx /* nospec cx */
pushq \rax /* pt_regs->ax */
pushq %r8 /* pt_regs->r8 */
xorl %r8d, %r8d /* nospec r8 */
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 08425c42f8b7..9af927e59d49 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -220,8 +220,11 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
pushq %rax /* pt_regs->orig_ax */
pushq %rdi /* pt_regs->di */
pushq %rsi /* pt_regs->si */
+ xorl %esi, %esi /* nospec si */
pushq %rdx /* pt_regs->dx */
+ xorl %edx, %edx /* nospec dx */
pushq %rbp /* pt_regs->cx (stashed in bp) */
+ xorl %ecx, %ecx /* nospec cx */
pushq $-ENOSYS /* pt_regs->ax */
pushq $0 /* pt_regs->r8 = 0 */
xorl %r8d, %r8d /* nospec r8 */
@@ -365,8 +368,11 @@ ENTRY(entry_INT80_compat)
pushq (%rdi) /* pt_regs->di */
pushq %rsi /* pt_regs->si */
+ xorl %esi, %esi /* nospec si */
pushq %rdx /* pt_regs->dx */
+ xorl %edx, %edx /* nospec dx */
pushq %rcx /* pt_regs->cx */
+ xorl %ecx, %ecx /* nospec cx */
pushq $-ENOSYS /* pt_regs->ax */
pushq $0 /* pt_regs->r8 = 0 */
xorl %r8d, %r8d /* nospec r8 */
--
2.16.3
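
The save-then-clear idea described in the commit message can be modelled in plain C, purely as an illustration. The sketch assumes a hypothetical struct demo_gprs holding only the registers visible in the hunks above; the real work is done in assembly, where each pushq is followed directly by its xorl rather than in two separate steps as here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a few general-purpose registers. */
struct demo_gprs {
	unsigned long di, si, dx, cx, r8;
};

/*
 * 64-bit entry model: save every user value into the frame, then zero the
 * live copies of dx, cx and r8. di and si stay untouched because they carry
 * the two parameters of do_syscall_64().
 */
static void demo_push_and_clear(struct demo_gprs *live, struct demo_gprs *saved)
{
	assert(live != NULL && saved != NULL);

	*saved = *live;		/* "push": user values survive in the frame */

	live->dx = 0;
	live->cx = 0;
	live->r8 = 0;
}

/*
 * Compat entry model: the C handlers take a single parameter, so si can be
 * cleared as well; only di is left alone.
 */
static void demo_compat_push_and_clear(struct demo_gprs *live,
				       struct demo_gprs *saved)
{
	assert(live != NULL && saved != NULL);

	*saved = *live;

	live->si = 0;
	live->dx = 0;
	live->cx = 0;
	live->r8 = 0;
}
```

In both variants the saved frame still holds the original user values, so the syscall itself can read its arguments from there; only the live register copies that would otherwise travel down the call chain are zeroed.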
* [tip:x86/asm] syscalls/x86: Extend register clearing on syscall entry to lower registers
2018-04-05 9:53 ` [PATCH 7/8] x86/entry/64: extend register clearing on syscall entry to lower registers Dominik Brodowski
@ 2018-04-06 17:13 ` tip-bot for Dominik Brodowski
0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Dominik Brodowski @ 2018-04-06 17:13 UTC (permalink / raw)
To: linux-tip-commits
Cc: akpm, viro, dvlasenk, luto, hpa, linux, bp, tglx, jpoimboe,
mingo, linux-kernel, torvalds, brgerst, peterz
Commit-ID: 6dc936f175cc6d12a8eb14d29b87e9238e460383
Gitweb: https://git.kernel.org/tip/6dc936f175cc6d12a8eb14d29b87e9238e460383
Author: Dominik Brodowski <linux@dominikbrodowski.net>
AuthorDate: Thu, 5 Apr 2018 11:53:06 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 5 Apr 2018 16:59:39 +0200
syscalls/x86: Extend register clearing on syscall entry to lower registers
To reduce the chance that random user space content leaks down the call
chain in registers, also clear lower registers on syscall entry:
For 64-bit syscalls, extend the register clearing in PUSH_AND_CLEAR_REGS
to %dx and %cx. This should be harmless for the other callers of that
macro as well. We do not need to clear %rdi and %rsi for syscall entry,
as those registers are used to pass the parameters to do_syscall_64().
For the 32-bit compat syscalls, do_int80_syscall_32() and
do_fast_syscall_32() each only take one parameter. Therefore, extend the
register clearing to %dx, %cx, and %si in entry_SYSCALL_compat and
entry_INT80_compat.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180405095307.3730-8-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/entry/calling.h | 2 ++
arch/x86/entry/entry_64_compat.S | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index be63330c5511..352e70cd33e8 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -114,7 +114,9 @@ For 32-bit we have the following conventions - kernel is built with
pushq %rsi /* pt_regs->si */
.endif
pushq \rdx /* pt_regs->dx */
+ xorl %edx, %edx /* nospec dx */
pushq %rcx /* pt_regs->cx */
+ xorl %ecx, %ecx /* nospec cx */
pushq \rax /* pt_regs->ax */
pushq %r8 /* pt_regs->r8 */
xorl %r8d, %r8d /* nospec r8 */
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 08425c42f8b7..9af927e59d49 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -220,8 +220,11 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
pushq %rax /* pt_regs->orig_ax */
pushq %rdi /* pt_regs->di */
pushq %rsi /* pt_regs->si */
+ xorl %esi, %esi /* nospec si */
pushq %rdx /* pt_regs->dx */
+ xorl %edx, %edx /* nospec dx */
pushq %rbp /* pt_regs->cx (stashed in bp) */
+ xorl %ecx, %ecx /* nospec cx */
pushq $-ENOSYS /* pt_regs->ax */
pushq $0 /* pt_regs->r8 = 0 */
xorl %r8d, %r8d /* nospec r8 */
@@ -365,8 +368,11 @@ ENTRY(entry_INT80_compat)
pushq (%rdi) /* pt_regs->di */
pushq %rsi /* pt_regs->si */
+ xorl %esi, %esi /* nospec si */
pushq %rdx /* pt_regs->dx */
+ xorl %edx, %edx /* nospec dx */
pushq %rcx /* pt_regs->cx */
+ xorl %ecx, %ecx /* nospec cx */
pushq $-ENOSYS /* pt_regs->ax */
pushq $0 /* pt_regs->r8 = 0 */
xorl %r8d, %r8d /* nospec r8 */
* [PATCH 8/8] syscalls/x86: rename struct pt_regs-based sys_*() to __sys_x86_*()
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
` (6 preceding siblings ...)
2018-04-05 9:53 ` [PATCH 7/8] x86/entry/64: extend register clearing on syscall entry to lower registers Dominik Brodowski
@ 2018-04-05 9:53 ` Dominik Brodowski
2018-04-05 18:35 ` kbuild test robot
2018-04-05 15:19 ` [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Ingo Molnar
8 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 9:53 UTC (permalink / raw)
To: linux-kernel, mingo
Cc: Linus Torvalds, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
Andi Kleen, x86
While it may make sense to name different things differently, I am not so
sure that the additional code is worth it here...
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
arch/x86/entry/syscalls/syscall_32.tbl | 49 +--
arch/x86/entry/syscalls/syscall_64.tbl | 638 +++++++++++++++++----------------
arch/x86/entry/vsyscall/vsyscall_64.c | 6 +-
arch/x86/include/asm/syscall_wrapper.h | 48 ++-
4 files changed, 381 insertions(+), 360 deletions(-)
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 7f09a3da0b3d..9b0973a20a65 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -7,13 +7,14 @@
# The __sys_ia32 and __compat_sys_ia32 stubs are created on-the-fly for
# sys_*() system calls and compat_sys_*() compat system calls if
# IA32_EMULATION is defined, and expect struct pt_regs *regs as their only
-# parameter.
+# parameter. The __sys_x86_ stubs below, which are common to x86-64, refer
+# solely to 0-parameter syscalls.
#
# The abi is always "i386" for this file.
#
-0 i386 restart_syscall sys_restart_syscall
+0 i386 restart_syscall sys_restart_syscall __sys_x86_restart_syscall
1 i386 exit sys_exit __sys_ia32_exit
-2 i386 fork sys_fork
+2 i386 fork sys_fork __sys_x86_fork
3 i386 read sys_read __sys_ia32_read
4 i386 write sys_write __sys_ia32_write
5 i386 open sys_open __compat_sys_ia32_open
@@ -31,23 +32,23 @@
17 i386 break
18 i386 oldstat sys_stat __sys_ia32_stat
19 i386 lseek sys_lseek __compat_sys_ia32_lseek
-20 i386 getpid sys_getpid
+20 i386 getpid sys_getpid __sys_x86_getpid
21 i386 mount sys_mount __compat_sys_ia32_mount
22 i386 umount sys_oldumount __sys_ia32_oldumount
23 i386 setuid sys_setuid16 __sys_ia32_setuid16
-24 i386 getuid sys_getuid16
+24 i386 getuid sys_getuid16 __sys_x86_getuid16
25 i386 stime sys_stime __compat_sys_ia32_stime
26 i386 ptrace sys_ptrace __compat_sys_ia32_ptrace
27 i386 alarm sys_alarm __sys_ia32_alarm
28 i386 oldfstat sys_fstat __sys_ia32_fstat
-29 i386 pause sys_pause
+29 i386 pause sys_pause __sys_x86_pause
30 i386 utime sys_utime __compat_sys_ia32_utime
31 i386 stty
32 i386 gtty
33 i386 access sys_access __sys_ia32_access
34 i386 nice sys_nice __sys_ia32_nice
35 i386 ftime
-36 i386 sync sys_sync
+36 i386 sync sys_sync __sys_x86_sync
37 i386 kill sys_kill __sys_ia32_kill
38 i386 rename sys_rename __sys_ia32_rename
39 i386 mkdir sys_mkdir __sys_ia32_mkdir
@@ -58,10 +59,10 @@
44 i386 prof
45 i386 brk sys_brk __sys_ia32_brk
46 i386 setgid sys_setgid16 __sys_ia32_setgid16
-47 i386 getgid sys_getgid16
+47 i386 getgid sys_getgid16 __sys_x86_getgid16
48 i386 signal sys_signal __sys_ia32_signal
-49 i386 geteuid sys_geteuid16
-50 i386 getegid sys_getegid16
+49 i386 geteuid sys_geteuid16 __sys_x86_geteuid16
+50 i386 getegid sys_getegid16 __sys_x86_getegid16
51 i386 acct sys_acct __sys_ia32_acct
52 i386 umount2 sys_umount __sys_ia32_umount
53 i386 lock
@@ -75,11 +76,11 @@
61 i386 chroot sys_chroot __sys_ia32_chroot
62 i386 ustat sys_ustat __compat_sys_ia32_ustat
63 i386 dup2 sys_dup2 __sys_ia32_dup2
-64 i386 getppid sys_getppid
-65 i386 getpgrp sys_getpgrp
-66 i386 setsid sys_setsid
+64 i386 getppid sys_getppid __sys_x86_getppid
+65 i386 getpgrp sys_getpgrp __sys_x86_getpgrp
+66 i386 setsid sys_setsid __sys_x86_setsid
67 i386 sigaction sys_sigaction __compat_sys_ia32_sigaction
-68 i386 sgetmask sys_sgetmask
+68 i386 sgetmask sys_sgetmask __sys_x86_sgetmask
69 i386 ssetmask sys_ssetmask __sys_ia32_ssetmask
70 i386 setreuid sys_setreuid16 __sys_ia32_setreuid16
71 i386 setregid sys_setregid16 __sys_ia32_setregid16
@@ -122,7 +123,7 @@
108 i386 fstat sys_newfstat __compat_sys_ia32_newfstat
109 i386 olduname sys_uname __sys_ia32_uname
110 i386 iopl sys_iopl __sys_ia32_iopl
-111 i386 vhangup sys_vhangup
+111 i386 vhangup sys_vhangup __sys_x86_vhangup
112 i386 idle
113 i386 vm86old sys_vm86old sys_ni_syscall
114 i386 wait4 sys_wait4 __compat_sys_ia32_wait4
@@ -164,12 +165,12 @@
150 i386 mlock sys_mlock __sys_ia32_mlock
151 i386 munlock sys_munlock __sys_ia32_munlock
152 i386 mlockall sys_mlockall __sys_ia32_mlockall
-153 i386 munlockall sys_munlockall
+153 i386 munlockall sys_munlockall __sys_x86_munlockall
154 i386 sched_setparam sys_sched_setparam __sys_ia32_sched_setparam
155 i386 sched_getparam sys_sched_getparam __sys_ia32_sched_getparam
156 i386 sched_setscheduler sys_sched_setscheduler __sys_ia32_sched_setscheduler
157 i386 sched_getscheduler sys_sched_getscheduler __sys_ia32_sched_getscheduler
-158 i386 sched_yield sys_sched_yield
+158 i386 sched_yield sys_sched_yield __sys_x86_sched_yield
159 i386 sched_get_priority_max sys_sched_get_priority_max __sys_ia32_sched_get_priority_max
160 i386 sched_get_priority_min sys_sched_get_priority_min __sys_ia32_sched_get_priority_min
161 i386 sched_rr_get_interval sys_sched_rr_get_interval __compat_sys_ia32_sched_rr_get_interval
@@ -201,7 +202,7 @@
187 i386 sendfile sys_sendfile __compat_sys_ia32_sendfile
188 i386 getpmsg
189 i386 putpmsg
-190 i386 vfork sys_vfork
+190 i386 vfork sys_vfork __sys_x86_vfork
191 i386 ugetrlimit sys_getrlimit __compat_sys_ia32_getrlimit
192 i386 mmap2 sys_mmap_pgoff __sys_ia32_mmap_pgoff
193 i386 truncate64 sys_truncate64 __compat_sys_ia32_x86_truncate64
@@ -210,10 +211,10 @@
196 i386 lstat64 sys_lstat64 __compat_sys_ia32_x86_lstat64
197 i386 fstat64 sys_fstat64 __compat_sys_ia32_x86_fstat64
198 i386 lchown32 sys_lchown __sys_ia32_lchown
-199 i386 getuid32 sys_getuid
-200 i386 getgid32 sys_getgid
-201 i386 geteuid32 sys_geteuid
-202 i386 getegid32 sys_getegid
+199 i386 getuid32 sys_getuid __sys_x86_getuid
+200 i386 getgid32 sys_getgid __sys_x86_getgid
+201 i386 geteuid32 sys_geteuid __sys_x86_geteuid
+202 i386 getegid32 sys_getegid __sys_x86_getegid
203 i386 setreuid32 sys_setreuid __sys_ia32_setreuid
204 i386 setregid32 sys_setregid __sys_ia32_setregid
205 i386 getgroups32 sys_getgroups __sys_ia32_getgroups
@@ -235,7 +236,7 @@
221 i386 fcntl64 sys_fcntl64 __compat_sys_ia32_fcntl64
# 222 is unused
# 223 is unused
-224 i386 gettid sys_gettid
+224 i386 gettid sys_gettid __sys_x86_gettid
225 i386 readahead sys_readahead __compat_sys_ia32_x86_readahead
226 i386 setxattr sys_setxattr __sys_ia32_setxattr
227 i386 lsetxattr sys_lsetxattr __sys_ia32_lsetxattr
@@ -302,7 +303,7 @@
288 i386 keyctl sys_keyctl __compat_sys_ia32_keyctl
289 i386 ioprio_set sys_ioprio_set __sys_ia32_ioprio_set
290 i386 ioprio_get sys_ioprio_get __sys_ia32_ioprio_get
-291 i386 inotify_init sys_inotify_init
+291 i386 inotify_init sys_inotify_init __sys_x86_inotify_init
292 i386 inotify_add_watch sys_inotify_add_watch __sys_ia32_inotify_add_watch
293 i386 inotify_rm_watch sys_inotify_rm_watch __sys_ia32_inotify_rm_watch
294 i386 migrate_pages sys_migrate_pages __sys_ia32_migrate_pages
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index a83c0f7f462f..58ff63bb55aa 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -4,341 +4,343 @@
# The format is:
# <number> <abi> <name> <entry point>
#
+# The __sys_x86_x86_*() stubs are created on-the-fly for sys_x86_*() system calls
+#
# The abi is "common", "64" or "x32" for this file.
#
-0 common read sys_read
-1 common write sys_write
-2 common open sys_open
-3 common close sys_close
-4 common stat sys_newstat
-5 common fstat sys_newfstat
-6 common lstat sys_newlstat
-7 common poll sys_poll
-8 common lseek sys_lseek
-9 common mmap sys_mmap
-10 common mprotect sys_mprotect
-11 common munmap sys_munmap
-12 common brk sys_brk
-13 64 rt_sigaction sys_rt_sigaction
-14 common rt_sigprocmask sys_rt_sigprocmask
-15 64 rt_sigreturn sys_rt_sigreturn/ptregs
-16 64 ioctl sys_ioctl
-17 common pread64 sys_pread64
-18 common pwrite64 sys_pwrite64
-19 64 readv sys_readv
-20 64 writev sys_writev
-21 common access sys_access
-22 common pipe sys_pipe
-23 common select sys_select
-24 common sched_yield sys_sched_yield
-25 common mremap sys_mremap
-26 common msync sys_msync
-27 common mincore sys_mincore
-28 common madvise sys_madvise
-29 common shmget sys_shmget
-30 common shmat sys_shmat
-31 common shmctl sys_shmctl
-32 common dup sys_dup
-33 common dup2 sys_dup2
-34 common pause sys_pause
-35 common nanosleep sys_nanosleep
-36 common getitimer sys_getitimer
-37 common alarm sys_alarm
-38 common setitimer sys_setitimer
-39 common getpid sys_getpid
-40 common sendfile sys_sendfile64
-41 common socket sys_socket
-42 common connect sys_connect
-43 common accept sys_accept
-44 common sendto sys_sendto
-45 64 recvfrom sys_recvfrom
-46 64 sendmsg sys_sendmsg
-47 64 recvmsg sys_recvmsg
-48 common shutdown sys_shutdown
-49 common bind sys_bind
-50 common listen sys_listen
-51 common getsockname sys_getsockname
-52 common getpeername sys_getpeername
-53 common socketpair sys_socketpair
-54 64 setsockopt sys_setsockopt
-55 64 getsockopt sys_getsockopt
-56 common clone sys_clone/ptregs
-57 common fork sys_fork/ptregs
-58 common vfork sys_vfork/ptregs
-59 64 execve sys_execve/ptregs
-60 common exit sys_exit
-61 common wait4 sys_wait4
-62 common kill sys_kill
-63 common uname sys_newuname
-64 common semget sys_semget
-65 common semop sys_semop
-66 common semctl sys_semctl
-67 common shmdt sys_shmdt
-68 common msgget sys_msgget
-69 common msgsnd sys_msgsnd
-70 common msgrcv sys_msgrcv
-71 common msgctl sys_msgctl
-72 common fcntl sys_fcntl
-73 common flock sys_flock
-74 common fsync sys_fsync
-75 common fdatasync sys_fdatasync
-76 common truncate sys_truncate
-77 common ftruncate sys_ftruncate
-78 common getdents sys_getdents
-79 common getcwd sys_getcwd
-80 common chdir sys_chdir
-81 common fchdir sys_fchdir
-82 common rename sys_rename
-83 common mkdir sys_mkdir
-84 common rmdir sys_rmdir
-85 common creat sys_creat
-86 common link sys_link
-87 common unlink sys_unlink
-88 common symlink sys_symlink
-89 common readlink sys_readlink
-90 common chmod sys_chmod
-91 common fchmod sys_fchmod
-92 common chown sys_chown
-93 common fchown sys_fchown
-94 common lchown sys_lchown
-95 common umask sys_umask
-96 common gettimeofday sys_gettimeofday
-97 common getrlimit sys_getrlimit
-98 common getrusage sys_getrusage
-99 common sysinfo sys_sysinfo
-100 common times sys_times
-101 64 ptrace sys_ptrace
-102 common getuid sys_getuid
-103 common syslog sys_syslog
-104 common getgid sys_getgid
-105 common setuid sys_setuid
-106 common setgid sys_setgid
-107 common geteuid sys_geteuid
-108 common getegid sys_getegid
-109 common setpgid sys_setpgid
-110 common getppid sys_getppid
-111 common getpgrp sys_getpgrp
-112 common setsid sys_setsid
-113 common setreuid sys_setreuid
-114 common setregid sys_setregid
-115 common getgroups sys_getgroups
-116 common setgroups sys_setgroups
-117 common setresuid sys_setresuid
-118 common getresuid sys_getresuid
-119 common setresgid sys_setresgid
-120 common getresgid sys_getresgid
-121 common getpgid sys_getpgid
-122 common setfsuid sys_setfsuid
-123 common setfsgid sys_setfsgid
-124 common getsid sys_getsid
-125 common capget sys_capget
-126 common capset sys_capset
-127 64 rt_sigpending sys_rt_sigpending
-128 64 rt_sigtimedwait sys_rt_sigtimedwait
-129 64 rt_sigqueueinfo sys_rt_sigqueueinfo
-130 common rt_sigsuspend sys_rt_sigsuspend
-131 64 sigaltstack sys_sigaltstack
-132 common utime sys_utime
-133 common mknod sys_mknod
+0 common read __sys_x86_read
+1 common write __sys_x86_write
+2 common open __sys_x86_open
+3 common close __sys_x86_close
+4 common stat __sys_x86_newstat
+5 common fstat __sys_x86_newfstat
+6 common lstat __sys_x86_newlstat
+7 common poll __sys_x86_poll
+8 common lseek __sys_x86_lseek
+9 common mmap __sys_x86_mmap
+10 common mprotect __sys_x86_mprotect
+11 common munmap __sys_x86_munmap
+12 common brk __sys_x86_brk
+13 64 rt_sigaction __sys_x86_rt_sigaction
+14 common rt_sigprocmask __sys_x86_rt_sigprocmask
+15 64 rt_sigreturn __sys_x86_rt_sigreturn/ptregs
+16 64 ioctl __sys_x86_ioctl
+17 common pread64 __sys_x86_pread64
+18 common pwrite64 __sys_x86_pwrite64
+19 64 readv __sys_x86_readv
+20 64 writev __sys_x86_writev
+21 common access __sys_x86_access
+22 common pipe __sys_x86_pipe
+23 common select __sys_x86_select
+24 common sched_yield __sys_x86_sched_yield
+25 common mremap __sys_x86_mremap
+26 common msync __sys_x86_msync
+27 common mincore __sys_x86_mincore
+28 common madvise __sys_x86_madvise
+29 common shmget __sys_x86_shmget
+30 common shmat __sys_x86_shmat
+31 common shmctl __sys_x86_shmctl
+32 common dup __sys_x86_dup
+33 common dup2 __sys_x86_dup2
+34 common pause __sys_x86_pause
+35 common nanosleep __sys_x86_nanosleep
+36 common getitimer __sys_x86_getitimer
+37 common alarm __sys_x86_alarm
+38 common setitimer __sys_x86_setitimer
+39 common getpid __sys_x86_getpid
+40 common sendfile __sys_x86_sendfile64
+41 common socket __sys_x86_socket
+42 common connect __sys_x86_connect
+43 common accept __sys_x86_accept
+44 common sendto __sys_x86_sendto
+45 64 recvfrom __sys_x86_recvfrom
+46 64 sendmsg __sys_x86_sendmsg
+47 64 recvmsg __sys_x86_recvmsg
+48 common shutdown __sys_x86_shutdown
+49 common bind __sys_x86_bind
+50 common listen __sys_x86_listen
+51 common getsockname __sys_x86_getsockname
+52 common getpeername __sys_x86_getpeername
+53 common socketpair __sys_x86_socketpair
+54 64 setsockopt __sys_x86_setsockopt
+55 64 getsockopt __sys_x86_getsockopt
+56 common clone __sys_x86_clone/ptregs
+57 common fork __sys_x86_fork/ptregs
+58 common vfork __sys_x86_vfork/ptregs
+59 64 execve __sys_x86_execve/ptregs
+60 common exit __sys_x86_exit
+61 common wait4 __sys_x86_wait4
+62 common kill __sys_x86_kill
+63 common uname __sys_x86_newuname
+64 common semget __sys_x86_semget
+65 common semop __sys_x86_semop
+66 common semctl __sys_x86_semctl
+67 common shmdt __sys_x86_shmdt
+68 common msgget __sys_x86_msgget
+69 common msgsnd __sys_x86_msgsnd
+70 common msgrcv __sys_x86_msgrcv
+71 common msgctl __sys_x86_msgctl
+72 common fcntl __sys_x86_fcntl
+73 common flock __sys_x86_flock
+74 common fsync __sys_x86_fsync
+75 common fdatasync __sys_x86_fdatasync
+76 common truncate __sys_x86_truncate
+77 common ftruncate __sys_x86_ftruncate
+78 common getdents __sys_x86_getdents
+79 common getcwd __sys_x86_getcwd
+80 common chdir __sys_x86_chdir
+81 common fchdir __sys_x86_fchdir
+82 common rename __sys_x86_rename
+83 common mkdir __sys_x86_mkdir
+84 common rmdir __sys_x86_rmdir
+85 common creat __sys_x86_creat
+86 common link __sys_x86_link
+87 common unlink __sys_x86_unlink
+88 common symlink __sys_x86_symlink
+89 common readlink __sys_x86_readlink
+90 common chmod __sys_x86_chmod
+91 common fchmod __sys_x86_fchmod
+92 common chown __sys_x86_chown
+93 common fchown __sys_x86_fchown
+94 common lchown __sys_x86_lchown
+95 common umask __sys_x86_umask
+96 common gettimeofday __sys_x86_gettimeofday
+97 common getrlimit __sys_x86_getrlimit
+98 common getrusage __sys_x86_getrusage
+99 common sysinfo __sys_x86_sysinfo
+100 common times __sys_x86_times
+101 64 ptrace __sys_x86_ptrace
+102 common getuid __sys_x86_getuid
+103 common syslog __sys_x86_syslog
+104 common getgid __sys_x86_getgid
+105 common setuid __sys_x86_setuid
+106 common setgid __sys_x86_setgid
+107 common geteuid __sys_x86_geteuid
+108 common getegid __sys_x86_getegid
+109 common setpgid __sys_x86_setpgid
+110 common getppid __sys_x86_getppid
+111 common getpgrp __sys_x86_getpgrp
+112 common setsid __sys_x86_setsid
+113 common setreuid __sys_x86_setreuid
+114 common setregid __sys_x86_setregid
+115 common getgroups __sys_x86_getgroups
+116 common setgroups __sys_x86_setgroups
+117 common setresuid __sys_x86_setresuid
+118 common getresuid __sys_x86_getresuid
+119 common setresgid __sys_x86_setresgid
+120 common getresgid __sys_x86_getresgid
+121 common getpgid __sys_x86_getpgid
+122 common setfsuid __sys_x86_setfsuid
+123 common setfsgid __sys_x86_setfsgid
+124 common getsid __sys_x86_getsid
+125 common capget __sys_x86_capget
+126 common capset __sys_x86_capset
+127 64 rt_sigpending __sys_x86_rt_sigpending
+128 64 rt_sigtimedwait __sys_x86_rt_sigtimedwait
+129 64 rt_sigqueueinfo __sys_x86_rt_sigqueueinfo
+130 common rt_sigsuspend __sys_x86_rt_sigsuspend
+131 64 sigaltstack __sys_x86_sigaltstack
+132 common utime __sys_x86_utime
+133 common mknod __sys_x86_mknod
134 64 uselib
-135 common personality sys_personality
-136 common ustat sys_ustat
-137 common statfs sys_statfs
-138 common fstatfs sys_fstatfs
-139 common sysfs sys_sysfs
-140 common getpriority sys_getpriority
-141 common setpriority sys_setpriority
-142 common sched_setparam sys_sched_setparam
-143 common sched_getparam sys_sched_getparam
-144 common sched_setscheduler sys_sched_setscheduler
-145 common sched_getscheduler sys_sched_getscheduler
-146 common sched_get_priority_max sys_sched_get_priority_max
-147 common sched_get_priority_min sys_sched_get_priority_min
-148 common sched_rr_get_interval sys_sched_rr_get_interval
-149 common mlock sys_mlock
-150 common munlock sys_munlock
-151 common mlockall sys_mlockall
-152 common munlockall sys_munlockall
-153 common vhangup sys_vhangup
-154 common modify_ldt sys_modify_ldt
-155 common pivot_root sys_pivot_root
-156 64 _sysctl sys_sysctl
-157 common prctl sys_prctl
-158 common arch_prctl sys_arch_prctl
-159 common adjtimex sys_adjtimex
-160 common setrlimit sys_setrlimit
-161 common chroot sys_chroot
-162 common sync sys_sync
-163 common acct sys_acct
-164 common settimeofday sys_settimeofday
-165 common mount sys_mount
-166 common umount2 sys_umount
-167 common swapon sys_swapon
-168 common swapoff sys_swapoff
-169 common reboot sys_reboot
-170 common sethostname sys_sethostname
-171 common setdomainname sys_setdomainname
-172 common iopl sys_iopl/ptregs
-173 common ioperm sys_ioperm
+135 common personality __sys_x86_personality
+136 common ustat __sys_x86_ustat
+137 common statfs __sys_x86_statfs
+138 common fstatfs __sys_x86_fstatfs
+139 common sysfs __sys_x86_sysfs
+140 common getpriority __sys_x86_getpriority
+141 common setpriority __sys_x86_setpriority
+142 common sched_setparam __sys_x86_sched_setparam
+143 common sched_getparam __sys_x86_sched_getparam
+144 common sched_setscheduler __sys_x86_sched_setscheduler
+145 common sched_getscheduler __sys_x86_sched_getscheduler
+146 common sched_get_priority_max __sys_x86_sched_get_priority_max
+147 common sched_get_priority_min __sys_x86_sched_get_priority_min
+148 common sched_rr_get_interval __sys_x86_sched_rr_get_interval
+149 common mlock __sys_x86_mlock
+150 common munlock __sys_x86_munlock
+151 common mlockall __sys_x86_mlockall
+152 common munlockall __sys_x86_munlockall
+153 common vhangup __sys_x86_vhangup
+154 common modify_ldt __sys_x86_modify_ldt
+155 common pivot_root __sys_x86_pivot_root
+156 64 _sysctl __sys_x86_sysctl
+157 common prctl __sys_x86_prctl
+158 common arch_prctl __sys_x86_arch_prctl
+159 common adjtimex __sys_x86_adjtimex
+160 common setrlimit __sys_x86_setrlimit
+161 common chroot __sys_x86_chroot
+162 common sync __sys_x86_sync
+163 common acct __sys_x86_acct
+164 common settimeofday __sys_x86_settimeofday
+165 common mount __sys_x86_mount
+166 common umount2 __sys_x86_umount
+167 common swapon __sys_x86_swapon
+168 common swapoff __sys_x86_swapoff
+169 common reboot __sys_x86_reboot
+170 common sethostname __sys_x86_sethostname
+171 common setdomainname __sys_x86_setdomainname
+172 common iopl __sys_x86_iopl/ptregs
+173 common ioperm __sys_x86_ioperm
174 64 create_module
-175 common init_module sys_init_module
-176 common delete_module sys_delete_module
+175 common init_module __sys_x86_init_module
+176 common delete_module __sys_x86_delete_module
177 64 get_kernel_syms
178 64 query_module
-179 common quotactl sys_quotactl
+179 common quotactl __sys_x86_quotactl
180 64 nfsservctl
181 common getpmsg
182 common putpmsg
183 common afs_syscall
184 common tuxcall
185 common security
-186 common gettid sys_gettid
-187 common readahead sys_readahead
-188 common setxattr sys_setxattr
-189 common lsetxattr sys_lsetxattr
-190 common fsetxattr sys_fsetxattr
-191 common getxattr sys_getxattr
-192 common lgetxattr sys_lgetxattr
-193 common fgetxattr sys_fgetxattr
-194 common listxattr sys_listxattr
-195 common llistxattr sys_llistxattr
-196 common flistxattr sys_flistxattr
-197 common removexattr sys_removexattr
-198 common lremovexattr sys_lremovexattr
-199 common fremovexattr sys_fremovexattr
-200 common tkill sys_tkill
-201 common time sys_time
-202 common futex sys_futex
-203 common sched_setaffinity sys_sched_setaffinity
-204 common sched_getaffinity sys_sched_getaffinity
+186 common gettid __sys_x86_gettid
+187 common readahead __sys_x86_readahead
+188 common setxattr __sys_x86_setxattr
+189 common lsetxattr __sys_x86_lsetxattr
+190 common fsetxattr __sys_x86_fsetxattr
+191 common getxattr __sys_x86_getxattr
+192 common lgetxattr __sys_x86_lgetxattr
+193 common fgetxattr __sys_x86_fgetxattr
+194 common listxattr __sys_x86_listxattr
+195 common llistxattr __sys_x86_llistxattr
+196 common flistxattr __sys_x86_flistxattr
+197 common removexattr __sys_x86_removexattr
+198 common lremovexattr __sys_x86_lremovexattr
+199 common fremovexattr __sys_x86_fremovexattr
+200 common tkill __sys_x86_tkill
+201 common time __sys_x86_time
+202 common futex __sys_x86_futex
+203 common sched_setaffinity __sys_x86_sched_setaffinity
+204 common sched_getaffinity __sys_x86_sched_getaffinity
205 64 set_thread_area
-206 64 io_setup sys_io_setup
-207 common io_destroy sys_io_destroy
-208 common io_getevents sys_io_getevents
-209 64 io_submit sys_io_submit
-210 common io_cancel sys_io_cancel
+206 64 io_setup __sys_x86_io_setup
+207 common io_destroy __sys_x86_io_destroy
+208 common io_getevents __sys_x86_io_getevents
+209 64 io_submit __sys_x86_io_submit
+210 common io_cancel __sys_x86_io_cancel
211 64 get_thread_area
-212 common lookup_dcookie sys_lookup_dcookie
-213 common epoll_create sys_epoll_create
+212 common lookup_dcookie __sys_x86_lookup_dcookie
+213 common epoll_create __sys_x86_epoll_create
214 64 epoll_ctl_old
215 64 epoll_wait_old
-216 common remap_file_pages sys_remap_file_pages
-217 common getdents64 sys_getdents64
-218 common set_tid_address sys_set_tid_address
-219 common restart_syscall sys_restart_syscall
-220 common semtimedop sys_semtimedop
-221 common fadvise64 sys_fadvise64
-222 64 timer_create sys_timer_create
-223 common timer_settime sys_timer_settime
-224 common timer_gettime sys_timer_gettime
-225 common timer_getoverrun sys_timer_getoverrun
-226 common timer_delete sys_timer_delete
-227 common clock_settime sys_clock_settime
-228 common clock_gettime sys_clock_gettime
-229 common clock_getres sys_clock_getres
-230 common clock_nanosleep sys_clock_nanosleep
-231 common exit_group sys_exit_group
-232 common epoll_wait sys_epoll_wait
-233 common epoll_ctl sys_epoll_ctl
-234 common tgkill sys_tgkill
-235 common utimes sys_utimes
+216 common remap_file_pages __sys_x86_remap_file_pages
+217 common getdents64 __sys_x86_getdents64
+218 common set_tid_address __sys_x86_set_tid_address
+219 common restart_syscall __sys_x86_restart_syscall
+220 common semtimedop __sys_x86_semtimedop
+221 common fadvise64 __sys_x86_fadvise64
+222 64 timer_create __sys_x86_timer_create
+223 common timer_settime __sys_x86_timer_settime
+224 common timer_gettime __sys_x86_timer_gettime
+225 common timer_getoverrun __sys_x86_timer_getoverrun
+226 common timer_delete __sys_x86_timer_delete
+227 common clock_settime __sys_x86_clock_settime
+228 common clock_gettime __sys_x86_clock_gettime
+229 common clock_getres __sys_x86_clock_getres
+230 common clock_nanosleep __sys_x86_clock_nanosleep
+231 common exit_group __sys_x86_exit_group
+232 common epoll_wait __sys_x86_epoll_wait
+233 common epoll_ctl __sys_x86_epoll_ctl
+234 common tgkill __sys_x86_tgkill
+235 common utimes __sys_x86_utimes
236 64 vserver
-237 common mbind sys_mbind
-238 common set_mempolicy sys_set_mempolicy
-239 common get_mempolicy sys_get_mempolicy
-240 common mq_open sys_mq_open
-241 common mq_unlink sys_mq_unlink
-242 common mq_timedsend sys_mq_timedsend
-243 common mq_timedreceive sys_mq_timedreceive
-244 64 mq_notify sys_mq_notify
-245 common mq_getsetattr sys_mq_getsetattr
-246 64 kexec_load sys_kexec_load
-247 64 waitid sys_waitid
-248 common add_key sys_add_key
-249 common request_key sys_request_key
-250 common keyctl sys_keyctl
-251 common ioprio_set sys_ioprio_set
-252 common ioprio_get sys_ioprio_get
-253 common inotify_init sys_inotify_init
-254 common inotify_add_watch sys_inotify_add_watch
-255 common inotify_rm_watch sys_inotify_rm_watch
-256 common migrate_pages sys_migrate_pages
-257 common openat sys_openat
-258 common mkdirat sys_mkdirat
-259 common mknodat sys_mknodat
-260 common fchownat sys_fchownat
-261 common futimesat sys_futimesat
-262 common newfstatat sys_newfstatat
-263 common unlinkat sys_unlinkat
-264 common renameat sys_renameat
-265 common linkat sys_linkat
-266 common symlinkat sys_symlinkat
-267 common readlinkat sys_readlinkat
-268 common fchmodat sys_fchmodat
-269 common faccessat sys_faccessat
-270 common pselect6 sys_pselect6
-271 common ppoll sys_ppoll
-272 common unshare sys_unshare
-273 64 set_robust_list sys_set_robust_list
-274 64 get_robust_list sys_get_robust_list
-275 common splice sys_splice
-276 common tee sys_tee
-277 common sync_file_range sys_sync_file_range
-278 64 vmsplice sys_vmsplice
-279 64 move_pages sys_move_pages
-280 common utimensat sys_utimensat
-281 common epoll_pwait sys_epoll_pwait
-282 common signalfd sys_signalfd
-283 common timerfd_create sys_timerfd_create
-284 common eventfd sys_eventfd
-285 common fallocate sys_fallocate
-286 common timerfd_settime sys_timerfd_settime
-287 common timerfd_gettime sys_timerfd_gettime
-288 common accept4 sys_accept4
-289 common signalfd4 sys_signalfd4
-290 common eventfd2 sys_eventfd2
-291 common epoll_create1 sys_epoll_create1
-292 common dup3 sys_dup3
-293 common pipe2 sys_pipe2
-294 common inotify_init1 sys_inotify_init1
-295 64 preadv sys_preadv
-296 64 pwritev sys_pwritev
-297 64 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo
-298 common perf_event_open sys_perf_event_open
-299 64 recvmmsg sys_recvmmsg
-300 common fanotify_init sys_fanotify_init
-301 common fanotify_mark sys_fanotify_mark
-302 common prlimit64 sys_prlimit64
-303 common name_to_handle_at sys_name_to_handle_at
-304 common open_by_handle_at sys_open_by_handle_at
-305 common clock_adjtime sys_clock_adjtime
-306 common syncfs sys_syncfs
-307 64 sendmmsg sys_sendmmsg
-308 common setns sys_setns
-309 common getcpu sys_getcpu
-310 64 process_vm_readv sys_process_vm_readv
-311 64 process_vm_writev sys_process_vm_writev
-312 common kcmp sys_kcmp
-313 common finit_module sys_finit_module
-314 common sched_setattr sys_sched_setattr
-315 common sched_getattr sys_sched_getattr
-316 common renameat2 sys_renameat2
-317 common seccomp sys_seccomp
-318 common getrandom sys_getrandom
-319 common memfd_create sys_memfd_create
-320 common kexec_file_load sys_kexec_file_load
-321 common bpf sys_bpf
-322 64 execveat sys_execveat/ptregs
-323 common userfaultfd sys_userfaultfd
-324 common membarrier sys_membarrier
-325 common mlock2 sys_mlock2
-326 common copy_file_range sys_copy_file_range
-327 64 preadv2 sys_preadv2
-328 64 pwritev2 sys_pwritev2
-329 common pkey_mprotect sys_pkey_mprotect
-330 common pkey_alloc sys_pkey_alloc
-331 common pkey_free sys_pkey_free
-332 common statx sys_statx
+237 common mbind __sys_x86_mbind
+238 common set_mempolicy __sys_x86_set_mempolicy
+239 common get_mempolicy __sys_x86_get_mempolicy
+240 common mq_open __sys_x86_mq_open
+241 common mq_unlink __sys_x86_mq_unlink
+242 common mq_timedsend __sys_x86_mq_timedsend
+243 common mq_timedreceive __sys_x86_mq_timedreceive
+244 64 mq_notify __sys_x86_mq_notify
+245 common mq_getsetattr __sys_x86_mq_getsetattr
+246 64 kexec_load __sys_x86_kexec_load
+247 64 waitid __sys_x86_waitid
+248 common add_key __sys_x86_add_key
+249 common request_key __sys_x86_request_key
+250 common keyctl __sys_x86_keyctl
+251 common ioprio_set __sys_x86_ioprio_set
+252 common ioprio_get __sys_x86_ioprio_get
+253 common inotify_init __sys_x86_inotify_init
+254 common inotify_add_watch __sys_x86_inotify_add_watch
+255 common inotify_rm_watch __sys_x86_inotify_rm_watch
+256 common migrate_pages __sys_x86_migrate_pages
+257 common openat __sys_x86_openat
+258 common mkdirat __sys_x86_mkdirat
+259 common mknodat __sys_x86_mknodat
+260 common fchownat __sys_x86_fchownat
+261 common futimesat __sys_x86_futimesat
+262 common newfstatat __sys_x86_newfstatat
+263 common unlinkat __sys_x86_unlinkat
+264 common renameat __sys_x86_renameat
+265 common linkat __sys_x86_linkat
+266 common symlinkat __sys_x86_symlinkat
+267 common readlinkat __sys_x86_readlinkat
+268 common fchmodat __sys_x86_fchmodat
+269 common faccessat __sys_x86_faccessat
+270 common pselect6 __sys_x86_pselect6
+271 common ppoll __sys_x86_ppoll
+272 common unshare __sys_x86_unshare
+273 64 set_robust_list __sys_x86_set_robust_list
+274 64 get_robust_list __sys_x86_get_robust_list
+275 common splice __sys_x86_splice
+276 common tee __sys_x86_tee
+277 common sync_file_range __sys_x86_sync_file_range
+278 64 vmsplice __sys_x86_vmsplice
+279 64 move_pages __sys_x86_move_pages
+280 common utimensat __sys_x86_utimensat
+281 common epoll_pwait __sys_x86_epoll_pwait
+282 common signalfd __sys_x86_signalfd
+283 common timerfd_create __sys_x86_timerfd_create
+284 common eventfd __sys_x86_eventfd
+285 common fallocate __sys_x86_fallocate
+286 common timerfd_settime __sys_x86_timerfd_settime
+287 common timerfd_gettime __sys_x86_timerfd_gettime
+288 common accept4 __sys_x86_accept4
+289 common signalfd4 __sys_x86_signalfd4
+290 common eventfd2 __sys_x86_eventfd2
+291 common epoll_create1 __sys_x86_epoll_create1
+292 common dup3 __sys_x86_dup3
+293 common pipe2 __sys_x86_pipe2
+294 common inotify_init1 __sys_x86_inotify_init1
+295 64 preadv __sys_x86_preadv
+296 64 pwritev __sys_x86_pwritev
+297 64 rt_tgsigqueueinfo __sys_x86_rt_tgsigqueueinfo
+298 common perf_event_open __sys_x86_perf_event_open
+299 64 recvmmsg __sys_x86_recvmmsg
+300 common fanotify_init __sys_x86_fanotify_init
+301 common fanotify_mark __sys_x86_fanotify_mark
+302 common prlimit64 __sys_x86_prlimit64
+303 common name_to_handle_at __sys_x86_name_to_handle_at
+304 common open_by_handle_at __sys_x86_open_by_handle_at
+305 common clock_adjtime __sys_x86_clock_adjtime
+306 common syncfs __sys_x86_syncfs
+307 64 sendmmsg __sys_x86_sendmmsg
+308 common setns __sys_x86_setns
+309 common getcpu __sys_x86_getcpu
+310 64 process_vm_readv __sys_x86_process_vm_readv
+311 64 process_vm_writev __sys_x86_process_vm_writev
+312 common kcmp __sys_x86_kcmp
+313 common finit_module __sys_x86_finit_module
+314 common sched_setattr __sys_x86_sched_setattr
+315 common sched_getattr __sys_x86_sched_getattr
+316 common renameat2 __sys_x86_renameat2
+317 common seccomp __sys_x86_seccomp
+318 common getrandom __sys_x86_getrandom
+319 common memfd_create __sys_x86_memfd_create
+320 common kexec_file_load __sys_x86_kexec_file_load
+321 common bpf __sys_x86_bpf
+322 64 execveat __sys_x86_execveat/ptregs
+323 common userfaultfd __sys_x86_userfaultfd
+324 common membarrier __sys_x86_membarrier
+325 common mlock2 __sys_x86_mlock2
+326 common copy_file_range __sys_x86_copy_file_range
+327 64 preadv2 __sys_x86_preadv2
+328 64 pwritev2 __sys_x86_pwritev2
+329 common pkey_mprotect __sys_x86_pkey_mprotect
+330 common pkey_alloc __sys_x86_pkey_alloc
+331 common pkey_free __sys_x86_pkey_free
+332 common statx __sys_x86_statx
#
# x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 20b3d4a88ee4..6da86e0a8a9c 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -229,12 +229,12 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
switch (vsyscall_nr) {
case 0:
/* this decodes regs->di and regs->si on its own */
- ret = sys_gettimeofday(regs);
+ ret = __sys_x86_gettimeofday(regs);
break;
case 1:
/* this decodes regs->di on its own */
- ret = sys_time(regs);
+ ret = __sys_x86_time(regs);
break;
case 2:
@@ -242,7 +242,7 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
orig_dx = regs->dx;
regs->dx = 0;
/* this decodes regs->di, regs->si and regs->dx on its own */
- ret = sys_getcpu(regs);
+ ret = __sys_x86_getcpu(regs);
regs->dx = orig_dx;
break;
}
diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
index 49d7e4970110..5159c10314e4 100644
--- a/arch/x86/include/asm/syscall_wrapper.h
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -45,11 +45,11 @@
}
#define COND_SYSCALL(name) \
- cond_syscall(sys_##name); \
+ cond_syscall(__sys_x86_##name); \
cond_syscall(__sys_ia32_##name)
#define SYS_NI(name) \
- SYSCALL_ALIAS(sys_##name, sys_ni_posix_timers); \
+ SYSCALL_ALIAS(__sys_x86_##name, sys_ni_posix_timers); \
SYSCALL_ALIAS(__sys_ia32_##name, sys_ni_posix_timers)
#else /* CONFIG_IA32_EMULATION */
@@ -114,12 +114,12 @@
/*
* Instead of the generic __SYSCALL_DEFINEx() definition, this macro takes
* struct pt_regs *regs as the only argument of the syscall stub named
- * sys_*(). It decodes just the registers it needs and passes them on to
+ * __sys_x86_*(). It decodes just the registers it needs and passes them on to
* the SyS_*() wrapper and then to the SYSC_*() function doing the actual job.
* These wrappers and functions are inlined, meaning that the assembly looks
* as follows (slightly re-ordered):
*
- * <sys_recv>: <-- syscall with 4 parameters
+ * <__sys_x86_recv>: <-- syscall with 4 parameters
* callq <__fentry__>
*
* mov 0x70(%rdi),%rdi <-- decode regs->di
@@ -142,18 +142,13 @@
* If IA32_EMULATION is enabled, this macro generates an additional wrapper
* named __sys_ia32_*() which decodes the struct pt_regs *regs according
* to the i386 calling convention (bx, cx, dx, si, di, bp).
- *
- * As the generic SYSCALL_DEFINE0() macro does not decode any parameters for
- * obvious reasons, and passing struct pt_regs *regs to it in %rdi does not
- * hurt, there is no need to override it, or to define it differently for
- * IA32_EMULATION.
*/
#define __SYSCALL_DEFINEx(x, name, ...) \
- asmlinkage long sys##name(const struct pt_regs *regs); \
- ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
+ asmlinkage long __sys_x86##name(const struct pt_regs *regs); \
+ ALLOW_ERROR_INJECTION(__sys_x86##name, ERRNO); \
static long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
- asmlinkage long sys##name(const struct pt_regs *regs) \
+ asmlinkage long __sys_x86##name(const struct pt_regs *regs) \
{ \
return SyS##name(SC_X86_64_REGS_TO_ARGS(x,__VA_ARGS__));\
} \
@@ -167,13 +162,36 @@
} \
static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+
+/*
+ * As the generic SYSCALL_DEFINE0() macro does not decode any parameters for
+ * obvious reasons, and passing struct pt_regs *regs to it in %rdi does not
+ * hurt, we only need to re-define it here to keep the naming congruent to
+ * SYSCALL_DEFINEx() -- which is essential for the COND_SYSCALL() and SYS_NI()
+ * macros to work correctly
+ */
+#define SYSCALL_DEFINE0(sname) \
+ SYSCALL_METADATA(_##sname, 0); \
+ asmlinkage long __sys_x86_##sname(void); \
+ ALLOW_ERROR_INJECTION(__sys_x86_##sname, ERRNO); \
+ asmlinkage long __sys_x86_##sname(void)
+
+#ifndef COND_SYSCALL
+#define COND_SYSCALL(name) cond_syscall(__sys_x86_##name)
+#endif
+
+#ifndef SYS_NI
+#define SYS_NI(name) SYSCALL_ALIAS(__sys_x86_##name, sys_ni_posix_timers);
+#endif
+
+
/*
* For VSYSCALLS, we need to declare these three syscalls with the new
* pt_regs-based calling convention for in-kernel use.
*/
struct pt_regs;
-asmlinkage long sys_getcpu(const struct pt_regs *regs); /* di,si,dx */
-asmlinkage long sys_gettimeofday(const struct pt_regs *regs); /* di,si */
-asmlinkage long sys_time(const struct pt_regs *regs); /* di */
+asmlinkage long __sys_x86_getcpu(const struct pt_regs *regs);
+asmlinkage long __sys_x86_gettimeofday(const struct pt_regs *regs);
+asmlinkage long __sys_x86_time(const struct pt_regs *regs);
#endif /* _ASM_X86_SYSCALL_WRAPPER_H */
--
2.16.3
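
For readers following along, the register decoding described in the
syscall_wrapper.h comment in this patch can be sketched in plain C. This is
only a minimal userspace illustration, not the kernel macros themselves:
struct toy_pt_regs, toy_do_add3() and the two toy_sys_*_add3() stubs are
hypothetical names invented for the example. The i386 register order
(bx, cx, dx, si, di, bp) is taken from the comment above; the 64-bit stub
assumes the usual x86-64 syscall argument order di, si, dx.

```c
#include <assert.h>
#include <stdio.h>

/* Toy subset of struct pt_regs; the real structure has more fields. */
struct toy_pt_regs {
	unsigned long bx, cx, dx, si, di, bp;
};

/* Stand-in for the inlined SYSC_*() body that does the actual work. */
static long toy_do_add3(unsigned long a, unsigned long b, unsigned long c)
{
	return (long)(a + b + c);
}

/*
 * Analogue of a __sys_x86_*() stub: takes only the register block and
 * decodes just the three registers this three-argument call needs.
 */
long toy_sys_x86_add3(const struct toy_pt_regs *regs)
{
	return toy_do_add3(regs->di, regs->si, regs->dx);
}

/*
 * Analogue of a __sys_ia32_*() stub: same body, but the arguments are
 * decoded according to the i386 convention (bx, cx, dx, ...).
 */
long toy_sys_ia32_add3(const struct toy_pt_regs *regs)
{
	return toy_do_add3(regs->bx, regs->cx, regs->dx);
}

/* Small self-check showing that each stub reads its own registers. */
void toy_stub_demo(void)
{
	struct toy_pt_regs regs = { .bx = 10, .cx = 20, .dx = 3, .si = 2, .di = 1 };

	assert(toy_sys_x86_add3(&regs) == 6);   /* di + si + dx */
	assert(toy_sys_ia32_add3(&regs) == 33); /* bx + cx + dx */
	printf("x86-64 stub: %ld, ia32 stub: %ld\n",
	       toy_sys_x86_add3(&regs), toy_sys_ia32_add3(&regs));
}
```

Both stubs share one signature, so a syscall table can hold either kind of
entry point while the caller passes nothing but the regs pointer.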
* Re: [PATCH 8/8] syscalls/x86: rename struct pt_regs-based sys_*() to __sys_x86_*()
2018-04-05 9:53 ` [PATCH 8/8] syscalls/x86: rename struct pt_regs-based sys_*() to __sys_x86_*() Dominik Brodowski
@ 2018-04-05 18:35 ` kbuild test robot
0 siblings, 0 replies; 27+ messages in thread
From: kbuild test robot @ 2018-04-05 18:35 UTC (permalink / raw)
To: Dominik Brodowski
Cc: kbuild-all, linux-kernel, mingo, Linus Torvalds, Thomas Gleixner,
Ingo Molnar, H. Peter Anvin, Andi Kleen, x86
[-- Attachment #1: Type: text/plain, Size: 2677 bytes --]
Hi Dominik,
I love your patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on next-20180405]
[cannot apply to tip/x86/core v4.16]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Dominik-Brodowski/use-struct-pt_regs-based-syscall-calling-for-x86-64/20180406-003520
config: um-x86_64_defconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=um SUBARCH=x86_64
All errors (new ones prefixed by >>):
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x0): undefined reference to `__sys_x86_read'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x8): undefined reference to `__sys_x86_write'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x10): undefined reference to `__sys_x86_open'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x18): undefined reference to `__sys_x86_close'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x20): undefined reference to `__sys_x86_newstat'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x28): undefined reference to `__sys_x86_newfstat'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x30): undefined reference to `__sys_x86_newlstat'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x38): undefined reference to `__sys_x86_poll'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x40): undefined reference to `__sys_x86_lseek'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x48): undefined reference to `__sys_x86_mmap'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x50): undefined reference to `__sys_x86_mprotect'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x58): undefined reference to `__sys_x86_munmap'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x60): undefined reference to `__sys_x86_brk'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x68): undefined reference to `__sys_x86_rt_sigaction'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x70): undefined reference to `__sys_x86_rt_sigprocmask'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x78): undefined reference to `__sys_x86_rt_sigreturn'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x80): undefined reference to `__sys_x86_ioctl'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x88): undefined reference to `__sys_x86_pread64'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x90): undefined reference to `__sys_x86_pwrite64'
>> arch/x86/um/sys_call_table_64.o:(.rodata+0x98): undefined reference to `__sys_x86_readv'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 7357 bytes --]
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-05 9:52 [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Dominik Brodowski
` (7 preceding siblings ...)
2018-04-05 9:53 ` [PATCH 8/8] syscalls/x86: rename struct pt_regs-based sys_*() to __sys_x86_*() Dominik Brodowski
@ 2018-04-05 15:19 ` Ingo Molnar
2018-04-05 20:31 ` Dominik Brodowski
8 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2018-04-05 15:19 UTC (permalink / raw)
To: Dominik Brodowski
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86
* Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> Dominik Brodowski (7):
> syscalls: introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER
> syscalls/x86: use struct pt_regs based syscall calling for 64-bit
> syscalls
> syscalls: prepare ARCH_HAS_SYSCALL_WRAPPER for compat syscalls
> syscalls/x86: use struct pt_regs based syscall calling for
> IA32_EMULATION and x32
> syscalls/x86: unconditionally enable struct pt_regs based syscalls on
> x86_64
> x86/entry/64: extend register clearing on syscall entry to lower
> registers
> syscalls/x86: rename struct pt_regs-based sys_*() to __sys_x86_*()
>
> Linus Torvalds (1):
> x86: don't pointlessly reload the system call number
>
> arch/x86/Kconfig | 1 +
> arch/x86/entry/calling.h | 2 +
> arch/x86/entry/common.c | 20 +-
> arch/x86/entry/entry_64.S | 3 +-
> arch/x86/entry/entry_64_compat.S | 6 +
> arch/x86/entry/syscall_32.c | 15 +-
> arch/x86/entry/syscall_64.c | 6 +-
> arch/x86/entry/syscalls/syscall_32.tbl | 724 +++++++++++++++++----------------
> arch/x86/entry/syscalls/syscall_64.tbl | 712 ++++++++++++++++----------------
> arch/x86/entry/vsyscall/vsyscall_64.c | 18 +-
> arch/x86/include/asm/syscall.h | 4 +
> arch/x86/include/asm/syscall_wrapper.h | 197 +++++++++
> arch/x86/include/asm/syscalls.h | 17 +-
> include/linux/compat.h | 22 +
> include/linux/syscalls.h | 25 +-
> init/Kconfig | 10 +
> kernel/sys_ni.c | 10 +
> kernel/time/posix-stubs.c | 10 +
> 18 files changed, 1054 insertions(+), 748 deletions(-)
> create mode 100644 arch/x86/include/asm/syscall_wrapper.h
Ok, this series looks mostly good to me, but AFAICS this breaks the UML build:
make[2]: *** No rule to make target 'archheaders'. Stop.
arch/um/Makefile:119: recipe for target 'archheaders' failed
make[1]: *** [archheaders] Error 2
make[1]: *** Waiting for unfinished jobs....
Thanks,
Ingo
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-05 15:19 ` [PATCH 0/8] use struct pt_regs based syscall calling for x86-64 Ingo Molnar
@ 2018-04-05 20:31 ` Dominik Brodowski
2018-04-06 8:23 ` Ingo Molnar
0 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-05 20:31 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86
On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> Ok, this series looks mostly good to me, but AFAICS this breaks the UML build:
>
> make[2]: *** No rule to make target 'archheaders'. Stop.
> arch/um/Makefile:119: recipe for target 'archheaders' failed
> make[1]: *** [archheaders] Error 2
> make[1]: *** Waiting for unfinished jobs....
Ah, that's caused by patch 8/8 which I did and do not like all that much
anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
__sys_x86_pread64, but expects the generic syscall stub sys_pread64
referenced there. Fixup patch below; could be folded with patch 8/8. Or
patch 8/8 could simply be dropped from the series altogether...
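To illustrate the prefix stripping that the fixup relies on, here is a minimal
shell sketch. The helper name uml_entry is hypothetical and not part of
syscalltbl.sh; it only shows how POSIX parameter expansion could map an
x86-specific table entry back to the generic stub name:

```shell
# Hypothetical helper, not part of syscalltbl.sh: map an x86-specific
# syscall table entry to the generic stub name that UML expects.
uml_entry() {
    entry="$1"
    case "$entry" in
        # Strip the __sys_x86_ prefix and put the plain sys_ prefix back.
        __sys_x86_*) echo "sys_${entry#__sys_x86_}" ;;
        # Anything else (e.g. compat entries) is passed through unchanged.
        *)           echo "$entry" ;;
    esac
}

uml_entry __sys_x86_pread64
uml_entry __compat_sys_x32_waitid
```

Entries that do not carry the __sys_x86_ prefix are left alone in this sketch,
which is an assumption about the desired behaviour rather than a description
of the patch below.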
Thanks,
Dominik
--------------------------------------------------------------------------
From f5049ea4e1e5e7751e72a22cbc1b3a9389959a04 Mon Sep 17 00:00:00 2001
From: Dominik Brodowski <linux@dominikbrodowski.net>
Date: Thu, 5 Apr 2018 22:16:23 +0200
Subject: [PATCH] syscalls/x86: fix UML syscall table
To distinguish the pt_regs-based calling convention used on x86, the syscall
functions received a __sys_x86_ prefix (instead of sys_). This breaks
the build of UML on 64-bit x86, as it re-uses the same syscall table.
To fix this, replace the __sys_x86_ prefix with sys_ during the generation
of <asm/syscalls_64.h>.
Fixes: syscalls/x86: rename struct pt_regs-based sys_*() to __sys_x86_*()
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
diff --git a/arch/x86/entry/syscalls/syscalltbl.sh b/arch/x86/entry/syscalls/syscalltbl.sh
index d71ef4bd3615..d6376b87ecb2 100644
--- a/arch/x86/entry/syscalls/syscalltbl.sh
+++ b/arch/x86/entry/syscalls/syscalltbl.sh
@@ -20,17 +20,47 @@ syscall_macro() {
echo "__SYSCALL_${abi}($nr, $real_entry, $qualifier)"
}
-emit() {
+emit64() {
abi="$1"
nr="$2"
entry="$3"
compat="$4"
- if [ "$abi" = "64" -a -n "$compat" ]; then
+ if [ "$abi" = "I386" ]; then
+ echo "emit64() called for a 32-bit syscall" >&2
+ exit 1
+ fi
+
+ if [ -n "$compat" ]; then
echo "a compat entry for a 64-bit syscall makes no sense" >&2
exit 1
fi
+ if [ -n "$entry" ]; then
+ # For CONFIG_UML, we need to strip the __sys_x86 prefix
+ if [ "${entry}" = "${entry#__compat_sys}" ]; then
+ umlentry="sys${entry#__sys_x86}"
+ fi
+
+ echo "#ifdef CONFIG_X86"
+ syscall_macro "$abi" "$nr" "$entry"
+ echo "#else /* CONFIG_UML */"
+ syscall_macro "$abi" "$nr" "$umlentry"
+ echo "#endif"
+ fi
+}
+
+emit32() {
+ abi="$1"
+ nr="$2"
+ entry="$3"
+ compat="$4"
+
+ if [ "$abi" = "64" ]; then
+ echo "emit32() called for a 64-bit syscall" >&2
+ exit 1
+ fi
+
if [ -z "$compat" ]; then
if [ -n "$entry" ]; then
syscall_macro "$abi" "$nr" "$entry"
@@ -53,14 +83,14 @@ grep '^[0-9]' "$in" | sort -n | (
# COMMON is the same as 64, except that we don't expect X32
# programs to use it. Our expectation has nothing to do with
# any generated code, so treat them the same.
- emit 64 "$nr" "$entry" "$compat"
+ emit64 64 "$nr" "$entry" "$compat"
elif [ "$abi" = "X32" ]; then
# X32 is equivalent to 64 on an X32-compatible kernel.
echo "#ifdef CONFIG_X86_X32_ABI"
- emit 64 "$nr" "$entry" "$compat"
+ emit64 64 "$nr" "$entry" "$compat"
echo "#endif"
elif [ "$abi" = "I386" ]; then
- emit "$abi" "$nr" "$entry" "$compat"
+ emit32 "$abi" "$nr" "$entry" "$compat"
else
echo "Unknown abi $abi" >&2
exit 1
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-05 20:31 ` Dominik Brodowski
@ 2018-04-06 8:23 ` Ingo Molnar
2018-04-06 8:31 ` Dominik Brodowski
2018-04-06 8:34 ` Dominik Brodowski
0 siblings, 2 replies; 27+ messages in thread
From: Ingo Molnar @ 2018-04-06 8:23 UTC (permalink / raw)
To: Dominik Brodowski
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86
* Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > Ok, this series looks mostly good to me, but AFAICS this breaks the UML build:
> >
> > make[2]: *** No rule to make target 'archheaders'. Stop.
> > arch/um/Makefile:119: recipe for target 'archheaders' failed
> > make[1]: *** [archheaders] Error 2
> > make[1]: *** Waiting for unfinished jobs....
>
> Ah, that's caused by patch 8/8 which I did and do not like all that much
> anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> referenced there. Fixup patch below; could be folded with patch 8/8. Or
> patch 8/8 could simply be dropped from the series altogether...
I still like the 'truth in advertising' aspect. For example if I see this in the
syscall table:
10 common mprotect __sys_x86_mprotect
I can immediately find the _real_ syscall entry point:
ffffffff81180a10 <__sys_x86_mprotect>:
ffffffff81180a10: 48 8b 57 60 mov 0x60(%rdi),%rdx
ffffffff81180a14: 48 8b 77 68 mov 0x68(%rdi),%rsi
ffffffff81180a18: b9 ff ff ff ff mov $0xffffffff,%ecx
ffffffff81180a1d: 48 8b 7f 70 mov 0x70(%rdi),%rdi
ffffffff81180a21: e8 fa fc ff ff callq ffffffff81180720 <do_mprotect_pkey>
ffffffff81180a26: 48 98 cltq
ffffffff81180a28: c3 retq
ffffffff81180a29: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
If, on the other hand, I see this entry:
10 common mprotect sys_mprotect
Then, as a first step, no symbol anywhere matches with this:
triton:~/tip> grep sys_mprotect System.map
triton:~/tip>
"sys_mprotect" does not exist in any easily discoverable sense. You have to *know*
to replace the sys_ prefix with __sys_x86_ to find it.
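As a rough illustration of this discoverability point, the following sketch
greps a two-line stand-in for System.map. The sample lines reuse the addresses
from the disassembly above; the symbol type letters and the count_matches
helper are assumptions made for the example:

```shell
# Stand-in for System.map; the type letters (t/T) are assumed for illustration.
system_map='ffffffff81180720 t do_mprotect_pkey
ffffffff81180a10 T __sys_x86_mprotect'

# Hypothetical helper: count the lines of the sample map matching a pattern.
# grep -c exits non-zero when nothing matches, hence the "|| true".
count_matches() {
    printf '%s\n' "$system_map" | grep -c -- "$1" || true
}

count_matches sys_mprotect         # table entry name without the x86 prefix
count_matches __sys_x86_mprotect   # table entry name as used by patch 8/8
```

The first call should report no match and the second a single match, which is
the difference being argued about here.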
Now arguably we could use a __sys_ prefix instead of the grep-barrier __sys_x86
prefix - but that too would be somewhat confusing I think.
I mean, the fact that we are passing in a ptregs pointer is a complexity of the
x86 kernel that *exists*, why hide it and make it harder to discover what's
happening, for something as important as system calls?
In terms of UML breakage, UML arguably is tightly coupled to its host
architecture:
> Subject: [PATCH] syscalls/x86: fix UML syscall table
Even with your patch applied I still see build failures:
$ make ARCH=um defconfig
$ make ARCH=um linux
...
arch/um/os-Linux/signal.c: In function ‘hard_handler’:
arch/um/os-Linux/signal.c:163:22: error: dereferencing pointer to incomplete type
‘struct ucontext’
mcontext_t *mc = &uc->uc_mcontext;
^~
scripts/Makefile.build:324: recipe for target 'arch/um/os-Linux/signal.o' failed
make[1]: *** [arch/um/os-Linux/signal.o] Error 1
Thanks,
Ingo
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-06 8:23 ` Ingo Molnar
@ 2018-04-06 8:31 ` Dominik Brodowski
2018-04-06 8:34 ` Dominik Brodowski
1 sibling, 0 replies; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-06 8:31 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86
On Fri, Apr 06, 2018 at 10:23:22AM +0200, Ingo Molnar wrote:
>
> * Dominik Brodowski <linux@dominikbrodowski.net> wrote:
>
> > On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > > Ok, this series looks mostly good to me, but AFAICS this breaks the UML build:
> > >
> > > make[2]: *** No rule to make target 'archheaders'. Stop.
> > > arch/um/Makefile:119: recipe for target 'archheaders' failed
> > > make[1]: *** [archheaders] Error 2
> > > make[1]: *** Waiting for unfinished jobs....
> >
> > Ah, that's caused by patch 8/8 which I did and do not like all that much
> > anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> > __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> > referenced there. Fixup patch below; could be folded with patch 8/8. Or
> > patch 8/8 could simply be dropped from the series altogether...
>
> I still like the 'truth in advertising' aspect. For example if I see this in the
> syscall table:
>
> 10 common mprotect __sys_x86_mprotect
>
> I can immediately find the _real_ syscall entry point:
>
> ffffffff81180a10 <__sys_x86_mprotect>:
> ffffffff81180a10: 48 8b 57 60 mov 0x60(%rdi),%rdx
> ffffffff81180a14: 48 8b 77 68 mov 0x68(%rdi),%rsi
> ffffffff81180a18: b9 ff ff ff ff mov $0xffffffff,%ecx
> ffffffff81180a1d: 48 8b 7f 70 mov 0x70(%rdi),%rdi
> ffffffff81180a21: e8 fa fc ff ff callq ffffffff81180720 <do_mprotect_pkey>
> ffffffff81180a26: 48 98 cltq
> ffffffff81180a28: c3 retq
> ffffffff81180a29: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
>
> If, on the other hand, I see this entry:
>
> 10 common mprotect sys_mprotect
>
> Then, as a first step, no symbol anywhere matches with this:
>
> triton:~/tip> grep sys_mprotect System.map
> triton:~/tip>
>
> "sys_mprotect" does not exist in any easily discoverable sense. You have to *know*
> to replace the sys_ prefix with __sys_x86_ to find it.
>
> Now arguably we could use a __sys_ prefix instead of the grep-barrier __sys_x86
> prefix - but that too would be somewhat confusing I think.
>
> I mean, the fact that we are passing in a ptregs pointer is a complexity of the
> x86 kernel that *exists*, why hide it and make it harder to discover what's
> happening, for something as important as system calls?
>
> In terms of UML breakage, UML arguably is tightly coupled to its host
> architecture:
>
> > Subject: [PATCH] syscalls/x86: fix UML syscall table
>
> Even with your patch applied I still see build failures:
>
> $ make ARCH=um defconfig
> $ make ARCH=um linux
> ...
> arch/um/os-Linux/signal.c: In function ‘hard_handler’:
> arch/um/os-Linux/signal.c:163:22: error: dereferencing pointer to incomplete type
> ‘struct ucontext’
> mcontext_t *mc = &uc->uc_mcontext;
> ^~
> scripts/Makefile.build:324: recipe for target 'arch/um/os-Linux/signal.o' failed
> make[1]: *** [arch/um/os-Linux/signal.o] Error 1
That build failure is already present in mainline as of 38c23685b273
(when building on Arch / gcc-7.3.1; building on Debian oldstable / gcc-4.9
works fine). And -- just checked -- this build failure also exists for
plain v4.16.
Thanks,
Dominik
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-06 8:23 ` Ingo Molnar
2018-04-06 8:31 ` Dominik Brodowski
@ 2018-04-06 8:34 ` Dominik Brodowski
2018-04-06 9:20 ` Ingo Molnar
1 sibling, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-06 8:34 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86
On Fri, Apr 06, 2018 at 10:23:22AM +0200, Ingo Molnar wrote:
>
> * Dominik Brodowski <linux@dominikbrodowski.net> wrote:
>
> > On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > > Ok, this series looks mostly good to me, but AFAICS this breaks the UML build:
> > >
> > > make[2]: *** No rule to make target 'archheaders'. Stop.
> > > arch/um/Makefile:119: recipe for target 'archheaders' failed
> > > make[1]: *** [archheaders] Error 2
> > > make[1]: *** Waiting for unfinished jobs....
> >
> > Ah, that's caused by patch 8/8 which I did and do not like all that much
> > anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> > __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> > referenced there. Fixup patch below; could be folded with patch 8/8. Or
> > patch 8/8 could simply be dropped from the series altogether...
>
> I still like the 'truth in advertising' aspect. For example if I see this in the
> syscall table:
>
> 10 common mprotect __sys_x86_mprotect
>
> I can immediately find the _real_ syscall entry point:
>
> ffffffff81180a10 <__sys_x86_mprotect>:
> ffffffff81180a10: 48 8b 57 60 mov 0x60(%rdi),%rdx
> ffffffff81180a14: 48 8b 77 68 mov 0x68(%rdi),%rsi
> ffffffff81180a18: b9 ff ff ff ff mov $0xffffffff,%ecx
> ffffffff81180a1d: 48 8b 7f 70 mov 0x70(%rdi),%rdi
> ffffffff81180a21: e8 fa fc ff ff callq ffffffff81180720 <do_mprotect_pkey>
> ffffffff81180a26: 48 98 cltq
> ffffffff81180a28: c3 retq
> ffffffff81180a29: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
>
> If, on the other hand, I see this entry:
>
> 10 common mprotect sys_mprotect
>
> Then, as a first step, no symbol anywhere matches with this:
>
> triton:~/tip> grep sys_mprotect System.map
> triton:~/tip>
>
> "sys_mprotect" does not exist in any easily discoverable sense. You have to *know*
> to replace the sys_ prefix with __sys_x86_ to find it.
>
> Now arguably we could use a __sys_ prefix instead of the grep-barrier __sys_x86
> prefix - but that too would be somewhat confusing I think.
Well, if looking at the ARCH="um" kernel, you won't find the
__sys_x86_mprotect there in its System.map -- so we either have to
disentangle um and plain x86, or live with some cause for confusion.
__sys_mprotect as prefix won't work by the way, as the double-underscore
__sys_ variant is already used in net/* for internal syscall helpers.
Thanks,
Dominik
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-06 8:34 ` Dominik Brodowski
@ 2018-04-06 9:20 ` Ingo Molnar
2018-04-06 9:34 ` Dominik Brodowski
0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2018-04-06 9:20 UTC (permalink / raw)
To: Dominik Brodowski
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86
* Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> On Fri, Apr 06, 2018 at 10:23:22AM +0200, Ingo Molnar wrote:
> >
> > * Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> >
> > > On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > > > Ok, this series looks mostly good to me, but AFAICS this breaks the UML build:
> > > >
> > > > make[2]: *** No rule to make target 'archheaders'. Stop.
> > > > arch/um/Makefile:119: recipe for target 'archheaders' failed
> > > > make[1]: *** [archheaders] Error 2
> > > > make[1]: *** Waiting for unfinished jobs....
> > >
> > > Ah, that's caused by patch 8/8 which I did and do not like all that much
> > > anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> > > __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> > > referenced there. Fixup patch below; could be folded with patch 8/8. Or
> > > patch 8/8 could simply be dropped from the series altogether...
> >
> > I still like the 'truth in advertising' aspect. For example if I see this in the
> > syscall table:
> >
> > 10 common mprotect __sys_x86_mprotect
> >
> > I can immediately find the _real_ syscall entry point:
> >
> > ffffffff81180a10 <__sys_x86_mprotect>:
> > ffffffff81180a10: 48 8b 57 60 mov 0x60(%rdi),%rdx
> > ffffffff81180a14: 48 8b 77 68 mov 0x68(%rdi),%rsi
> > ffffffff81180a18: b9 ff ff ff ff mov $0xffffffff,%ecx
> > ffffffff81180a1d: 48 8b 7f 70 mov 0x70(%rdi),%rdi
> > ffffffff81180a21: e8 fa fc ff ff callq ffffffff81180720 <do_mprotect_pkey>
> > ffffffff81180a26: 48 98 cltq
> > ffffffff81180a28: c3 retq
> > ffffffff81180a29: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
> >
> > If, on the other hand, I see this entry:
> >
> > 10 common mprotect sys_mprotect
> >
> > Then, as a first step, no symbol anywhere matches with this:
> >
> > triton:~/tip> grep sys_mprotect System.map
> > triton:~/tip>
> >
> > "sys_mprotect" does not exist in any easily discoverable sense. You have to *know*
> > to replace the sys_ prefix with __sys_x86_ to find it.
> >
> > Now arguably we could use a __sys_ prefix instead of the grep-barrier __sys_x86
> > prefix - but that too would be somewhat confusing I think.
>
> Well, if looking at the ARCH="um" kernel, you won't find the __sys_x86_mprotect
> there in its System.map -- so we either have to disentangle um and plain x86, or
> live with some cause for confusion.
I'm primarily concerned about everything making sense on x86 - UML is an entirely
separate architecture with heavy tradeoffs and kludges.
> __sys_mprotect as prefix won't work by the way, as the double-underscore __sys_
> variant is already used in net/* for internal syscall helpers.
Ok - then triple underscore - but overall I think it's more confusing.
Btw., what was the problem with calling the x86 ptregs wrapper sys_mprotect?
The only reason I suggested the __sys_x86_ prefix was because you originally
suggested that there's symbol name overlap, but I don't think that's the case
within the same kernel build, as the regular non-ptregs prototype:
asmlinkage long sys_mprotect(unsigned long start, size_t len, unsigned long prot);
... will only exist on !CONFIG_ARCH_HAS_SYSCALL_WRAPPER kernels.
So maybe that's the simplest and least confusing solution.
Thanks,
Ingo
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-06 9:20 ` Ingo Molnar
@ 2018-04-06 9:34 ` Dominik Brodowski
2018-04-06 12:34 ` Ingo Molnar
0 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-06 9:34 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86
On Fri, Apr 06, 2018 at 11:20:46AM +0200, Ingo Molnar wrote:
>
> * Dominik Brodowski <linux@dominikbrodowski.net> wrote:
>
> > On Fri, Apr 06, 2018 at 10:23:22AM +0200, Ingo Molnar wrote:
> > >
> > > * Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> > >
> > > > On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > > > > Ok, this series looks mostly good to me, but AFAICS this breaks the UML build:
> > > > >
> > > > > make[2]: *** No rule to make target 'archheaders'. Stop.
> > > > > arch/um/Makefile:119: recipe for target 'archheaders' failed
> > > > > make[1]: *** [archheaders] Error 2
> > > > > make[1]: *** Waiting for unfinished jobs....
> > > >
> > > > Ah, that's caused by patch 8/8 which I did and do not like all that much
> > > > anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> > > > __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> > > > referenced there. Fixup patch below; could be folded with patch 8/8. Or
> > > > patch 8/8 could simply be dropped from the series altogether...
> > >
> > > I still like the 'truth in advertising' aspect. For example if I see this in the
> > > syscall table:
> > >
> > > 10 common mprotect __sys_x86_mprotect
> > >
> > > I can immediately find the _real_ syscall entry point:
> > >
> > > ffffffff81180a10 <__sys_x86_mprotect>:
> > > ffffffff81180a10: 48 8b 57 60 mov 0x60(%rdi),%rdx
> > > ffffffff81180a14: 48 8b 77 68 mov 0x68(%rdi),%rsi
> > > ffffffff81180a18: b9 ff ff ff ff mov $0xffffffff,%ecx
> > > ffffffff81180a1d: 48 8b 7f 70 mov 0x70(%rdi),%rdi
> > > ffffffff81180a21: e8 fa fc ff ff callq ffffffff81180720 <do_mprotect_pkey>
> > > ffffffff81180a26: 48 98 cltq
> > > ffffffff81180a28: c3 retq
> > > ffffffff81180a29: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
> > >
> > > If, on the other hand, I see this entry:
> > >
> > > 10 common mprotect sys_mprotect
> > >
> > > Then, as a first step, no symbol anywhere matches with this:
> > >
> > > triton:~/tip> grep sys_mprotect System.map
> > > triton:~/tip>
> > >
> > > "sys_mprotect" does not exist in any easily discoverable sense. You have to *know*
> > > to replace the sys_ prefix with __sys_x86_ to find it.
> > >
> > > Now arguably we could use a __sys_ prefix instead of the grep-barrier __sys_x86
> > > prefix - but that too would be somewhat confusing I think.
> >
> > Well, if looking at the ARCH="um" kernel, you won't find the __sys_x86_mprotect
> > there in its System.map -- so we either have to disentangle um and plain x86, or
> > live with some cause for confusion.
>
> I'm primarily concerned about everything making sense on x86 - UML is an entirely
> separate architecture with heavy tradeoffs and kludges.
Agreed.
> > __sys_mprotect as prefix won't work by the way, as the double-underscore __sys_
> > variant is already used in net/* for internal syscall helpers.
>
> Ok - then triple underscore - but overall I think it's more confusing.
>
> Btw., what was the problem with calling the x86 ptregs wrapper sys_mprotect?
>
> The only reason I suggested the __sys_x86_ prefix was because you originally
> suggested that there's symbol name overlap, but I don't think that's the case
> within the same kernel build, as the regular non-ptregs prototype:
Indeed, there's no symbol name overlap within the same kernel build, but
technically different stubs named the same. If that's fine, just drop patch
8/8 (including the UML fixup) and things should be fine, with the stub and
the entry in the syscall table both named sys_mprotect.
For IA32_EMULATION, we have __sys_ia32_mprotect as stub for the same
syscall, including this name as entry in syscall_32.tbl.
More problematic is the naming for the compat stubs for IA32_EMULATION and
X32, where we have
__compat_sys_ia32_waitid
__compat_sys_x32_waitid
for example. We *could* rename one of those to compat_sys_waitid() and leave
the other as-is, but actually I prefer it the way it is now.
Thanks,
Dominik
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-06 9:34 ` Dominik Brodowski
@ 2018-04-06 12:34 ` Ingo Molnar
2018-04-06 13:07 ` Dominik Brodowski
0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2018-04-06 12:34 UTC (permalink / raw)
To: Dominik Brodowski
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86, Peter Zijlstra
* Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> > > __sys_mprotect as prefix won't work by the way, as the double-underscore __sys_
> > > variant is already used in net/* for internal syscall helpers.
> >
> > Ok - then triple underscore - but overall I think it's more confusing.
> >
> > Btw., what was the problem with calling the x86 ptregs wrapper sys_mprotect?
> >
> > The only reason I suggested the __sys_x86_ prefix was because you originally
> > suggested that there's symbol name overlap, but I don't think that's the case
> > within the same kernel build, as the regular non-ptregs prototype:
>
> Indeed, there's no symbol name overlap within the same kernel build, but
> technically different stubs named the same. If that's fine, just drop patch
> 8/8 (including the UML fixup) and things should be fine, with the stub and
> the entry in the syscall table both named sys_mprotect.
Ok, I've dropped patch #8.
> For IA32_EMULATION, we have __sys_ia32_mprotect as stub for the same
> syscall, including this name as entry in syscall_32.tbl.
>
> > More problematic is the naming for the compat stubs for IA32_EMULATION and
> X32, where we have
>
> __compat_sys_ia32_waitid
> __compat_sys_x32_waitid
>
> > for example. We *could* rename one of those to compat_sys_waitid() and leave
> > the other as-is, but actually I prefer it the way it is now.
yeah, this is more symmetric I think.
So right now we have these symbols:
triton:~/tip> grep waitid System.map
ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
ffffffff8105f2b0 t SYSC_waitid # 64-bit uaddr args C function 352 bytes
ffffffff8105f410 T sys_waitid # 64-bit-ptregs -> C stub, 32 bytes
ffffffff8105f430 T __sys_ia32_waitid # 32-bit-ptregs -> C stub, 32 bytes
ffffffff8105f450 t C_SYSC_waitid # 32-bit uaddr args C function, 400 bytes
ffffffff8105f5e0 T __compat_sys_ia32_waitid # 32-bit-ptregs -> C stub, 32 bytes
ffffffff8105f600 T __compat_sys_x32_waitid # 64-bit-ptregs -> C stub, 32 bytes
BTW., what is the role of generating __sys_ia32_waitid()? It seems unused when a
syscall has a compat variant otherwise - like here.
Naming wise the odd thumb out is sys_waitid :-/
I'd argue that we should at minimum name it __sys_waitid:
ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
ffffffff8105f2b0 t SYSC_waitid # 64-bit uaddr args C function
ffffffff8105f410 T __sys_waitid # 64-bit-ptregs -> C stub
ffffffff8105f430 T __sys_ia32_waitid # 32-bit-ptregs -> C stub
ffffffff8105f450 t C_SYSC_waitid # 32-bit uaddr args C function
ffffffff8105f5e0 T __compat_sys_ia32_waitid # 32-bit-ptregs -> C stub
ffffffff8105f600 T __compat_sys_x32_waitid # 64-bit-ptregs -> C stub
because that makes it all organized based on the same principle:
{__compat|_}_sys{_ia32|_x32|}_waitid
But arguably there are a whole lot more naming weirdnesses we could fix:
- I find it somewhat confusing that the 'C' in C_SYSC stands not for a C calling
convention, but for 'COMPAT' - i.e. COMPAT_SYSC would be better.
- Another detail: why is it called 'SYSC' if the other functions use the
'sys' prefix? Wouldn't 'SYS' be more consistent?
- If we introduced a prefix for the regular 64-bit system call format as well,
we could have: x64, x32 and ia32.
- And finally, I think the argument format modifiers should be consistently
prefixes - right now they are a mixture of pre- and post-fixes.
I.e. I'd generate the names like this:
__{x64,x32,ia32}[_compat]_sys_waitid()
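A small shell sketch of how such names could be composed, purely for
illustration. The stub_name helper and its argument order are hypothetical and
do not exist in the kernel tree:

```shell
# Hypothetical helper: compose a stub name as __<abi>[_compat]_sys_<name>.
#   $1 = ABI tag (x64, x32 or ia32)
#   $2 = "compat" for a compat stub, empty otherwise
#   $3 = syscall name
stub_name() {
    abi="$1"
    compat="$2"
    name="$3"
    if [ "$compat" = "compat" ]; then
        echo "__${abi}_compat_sys_${name}"
    else
        echo "__${abi}_sys_${name}"
    fi
}

stub_name x64 "" waitid
stub_name ia32 "" waitid
stub_name ia32 compat waitid
stub_name x32 compat waitid
```

With every modifier kept in front of sys_, the syscall name always ends the
symbol in the same way.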
The fully consistent nomenclature would be something like this:
ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
ffffffff8105f2b0 t SYS_waitid # 64-bit uaddr args C function
ffffffff8105f410 T __x64_sys_waitid # 64-bit-ptregs -> C stub
ffffffff8105f430 T __ia32_sys_waitid # 32-bit-ptregs -> C stub
ffffffff8105f450 t COMPAT_SYS_waitid # 32-bit uaddr args C function
ffffffff8105f5e0 T __ia32_compat_sys_waitid # 32-bit-ptregs -> C stub
ffffffff8105f600 T __x32_compat_sys_waitid # 64-bit-ptregs -> C stub
Looks a lot tidier and a lot more logical, doesn't it?
Makes grepping easier as well, because (case-insensitive) patterns like
'sys_waitid' would identify all the variants.
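A quick way to check that claim, sketched in shell. The two lists are the
symbol names from this mail (current vs. proposed scheme); the count_ci helper
is hypothetical:

```shell
# Symbol names as listed in this mail: current scheme vs. proposed scheme.
current='kernel_waitid
SYSC_waitid
sys_waitid
__sys_ia32_waitid
C_SYSC_waitid
__compat_sys_ia32_waitid
__compat_sys_x32_waitid'

proposed='kernel_waitid
SYS_waitid
__x64_sys_waitid
__ia32_sys_waitid
COMPAT_SYS_waitid
__ia32_compat_sys_waitid
__x32_compat_sys_waitid'

# Hypothetical helper: count case-insensitive matches of a pattern in a list.
# grep -c exits non-zero when nothing matches, hence the "|| true".
count_ci() {
    printf '%s\n' "$1" | grep -ci -- "$2" || true
}

count_ci "$current" sys_waitid
count_ci "$proposed" sys_waitid
```

In the current scheme only sys_waitid itself matches; in the proposed scheme
every variant except the common kernel_waitid helper does.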
Personally I'd also do a s/ia32/i32 rename:
ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
ffffffff8105f2b0 t SYS_waitid # 64-bit uaddr args C function
ffffffff8105f410 T __x64_sys_waitid # 64-bit-ptregs -> C stub
ffffffff8105f430 T __i32_sys_waitid # 32-bit-ptregs -> C stub
ffffffff8105f450 t COMPAT_SYS_waitid # 32-bit uaddr args C function
ffffffff8105f5e0 T __i32_compat_sys_waitid # 32-bit-ptregs -> C stub
ffffffff8105f600 T __x32_compat_sys_waitid # 64-bit-ptregs -> C stub
... but maybe that's too much ;-)
Thanks,
Ingo
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-06 12:34 ` Ingo Molnar
@ 2018-04-06 13:07 ` Dominik Brodowski
2018-04-06 17:03 ` Ingo Molnar
0 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2018-04-06 13:07 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86, Peter Zijlstra
On Fri, Apr 06, 2018 at 02:34:20PM +0200, Ingo Molnar wrote:
>
> * Dominik Brodowski <linux@dominikbrodowski.net> wrote:
>
> > > > __sys_mprotect as prefix won't work by the way, as the double-underscore __sys_
> > > > variant is already used in net/* for internal syscall helpers.
> > >
> > > Ok - then triple underscore - but overall I think it's more confusing.
> > >
> > > Btw., what was the problem with calling the x86 ptregs wrapper sys_mprotect?
> > >
> > > The only reason I suggested the __sys_x86_ prefix was because you originally
> > > suggested that there's symbol name overlap, but I don't think that's the case
> > > within the same kernel build, as the regular non-ptregs prototype:
> >
> > Indeed, there's no symbol name overlap within the same kernel build, but
> > technically different stubs named the same. If that's fine, just drop patch
> > 8/8 (including the UML fixup) and things should be fine, with the stub and
> > the entry in the syscall table both named sys_mprotect.
>
> Ok, I've dropped patch #8.
Thanks!
> > For IA32_EMULATION, we have __sys_ia32_mprotect as stub for the same
> > syscall, including this name as entry in syscall_32.tbl.
> >
> > More problematic is the naming for the compat stubs for IA32_EMULATION and
> > X32, where we have
> >
> > __compat_sys_ia32_waitid
> > __compat_sys_x32_waitid
> >
> > for example. We *could* rename one of those to compat_sys_waitid() and leave
> > the other as-is, but actually I prefer it now how it is.
>
> yeah, this is more symmetric I think.
>
> So right now we have these symbols:
>
> triton:~/tip> grep waitid System.map
>
> ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
> ffffffff8105f2b0 t SYSC_waitid # 64-bit uaddr args C function 352 bytes
> ffffffff8105f410 T sys_waitid # 64-bit-ptregs -> C stub, 32 bytes
> ffffffff8105f430 T __sys_ia32_waitid # 32-bit-ptregs -> C stub, 32 bytes
> ffffffff8105f450 t C_SYSC_waitid # 32-bit uaddr args C function, 400 bytes
> ffffffff8105f5e0 T __compat_sys_ia32_waitid # 32-bit-ptregs -> C stub, 32 bytes
> ffffffff8105f600 T __compat_sys_x32_waitid # 64-bit-ptregs -> C stub, 32 bytes
>
> BTW., what is the role of generating __sys_ia32_waitid()? It seems unused when a
> syscall also has a compat variant - like here.
Indeed it is unused -- but when compiling
SYSCALL_DEFINE5(waitid, ...)
we don't know yet that somewhere else in the kernel there is
COMPAT_SYSCALL_DEFINE5(waitid, ...)
So if we want to compile the __sys_ia32_ stub only if it is used, we'd
need to have different macros for syscalls which have compat variants and
for those which do not. Or do you have an idea how we could trick the
compiler to do the right thing?
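
To illustrate the constraint described above, here is a minimal userspace sketch. It is not the kernel's SYSCALL_DEFINEx machinery: `struct sketch_regs`, `SKETCH_SYSCALL_DEFINE2` and every `sketch_*` name are hypothetical and exist only for this example. The point it tries to show is that the define macro is expanded once, in one translation unit, so it has no way to see whether a compat definition exists elsewhere and therefore emits both stubs unconditionally.

```c
#include <assert.h>

/* Simplified stand-in for struct pt_regs; only the registers used below. */
struct sketch_regs {
	unsigned long di, si;	/* first two arguments, 64-bit syscall ABI */
	unsigned long bx, cx;	/* first two arguments, ia32 syscall ABI */
};

/*
 * Hypothetical two-argument define macro (an assumption for illustration,
 * not the kernel's SYSCALL_DEFINEx). At expansion time it cannot know
 * whether a compat definition of the same syscall exists in another
 * translation unit, so it always emits both the native and the ia32 stub.
 */
#define SKETCH_SYSCALL_DEFINE2(name, t1, a1, t2, a2)			\
	static long sketch_do_##name(t1 a1, t2 a2);			\
	long sketch_sys_##name(const struct sketch_regs *regs)		\
	{								\
		return sketch_do_##name((t1)regs->di, (t2)regs->si);	\
	}								\
	long sketch_sys_ia32_##name(const struct sketch_regs *regs)	\
	{								\
		return sketch_do_##name((t1)(unsigned int)regs->bx,	\
					(t2)(unsigned int)regs->cx);	\
	}								\
	static long sketch_do_##name(t1 a1, t2 a2)

/* One definition, two generated entry points. */
SKETCH_SYSCALL_DEFINE2(add, long, a, long, b)
{
	return a + b;
}

/* Small self-check: both stubs reach the same C body. */
void sketch_selfcheck(void)
{
	struct sketch_regs regs = { .di = 2, .si = 3, .bx = 40, .cx = 2 };

	assert(sketch_sys_add(&regs) == 5);
	assert(sketch_sys_ia32_add(&regs) == 42);
}
```

In this sketch the ia32 stub is generated even if nothing ever references it. Dropping it selectively would need either a second macro for syscalls that have a compat variant or some link-time mechanism, which is the trade-off raised in the question above.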
> Naming wise the odd thumb out is sys_waitid :-/
>
> I'd argue that we should at minimum name it __sys_waitid:
Can't do, as that namespace is already taken.
> ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
> ffffffff8105f2b0 t SYSC_waitid # 64-bit uaddr args C function
> ffffffff8105f410 T __sys_waitid # 64-bit-ptregs -> C stub
> ffffffff8105f430 T __sys_ia32_waitid # 32-bit-ptregs -> C stub
> ffffffff8105f450 t C_SYSC_waitid # 32-bit uaddr args C function
> ffffffff8105f5e0 T __compat_sys_ia32_waitid # 32-bit-ptregs -> C stub
> ffffffff8105f600 T __compat_sys_x32_waitid # 64-bit-ptregs -> C stub
>
> because that makes it all organized based on the same principle:
>
> {__compat|_}_sys{_ia32|_x32|}_waitid
>
> But arguably there are a whole lot more naming weirdnesses we could fix:
>
> - I find it somewhat confusing that the 'C' in C_SYSC stands not for a C calling
> convention, but for 'COMPAT' - i.e. COMPAT_SYSC would be better.
>
> - Another detail: why is it called 'SYSC' if the other functions use the
> 'sys' prefix? Wouldn't 'SYS' be more consistent?
It sounds totally reasonable to re-name those, but we should do it not only
for the x86 case, but in include/linux/syscalls.h as well.
> - If we introduced a prefix for the regular 64-bit system call format as well,
> we could have: x64, x32 and ia32.
>
> - And finally, I think the argument format modifiers should be consistently
> prefixes - right now they are a mixture of pre- and post-fixes.
>
> I.e. I'd generate the names like this:
>
> __{x64,x32,ia32}[_compat]_sys_waitid()
>
> The fully consistent nomenclature would be something like this:
>
> ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
> ffffffff8105f2b0 t SYS_waitid # 64-bit uaddr args C function
> ffffffff8105f410 T __x64_sys_waitid # 64-bit-ptregs -> C stub
> ffffffff8105f430 T __ia32_sys_waitid # 32-bit-ptregs -> C stub
> ffffffff8105f450 t COMPAT_SYS_waitid # 32-bit uaddr args C function
> ffffffff8105f5e0 T __ia32_compat_sys_waitid # 32-bit-ptregs -> C stub
> ffffffff8105f600 T __x32_compat_sys_waitid # 64-bit-ptregs -> C stub
>
> Looks a lot tidier and a lot more logical, doesn't it?
Indeed. Want me to prepare a new patch 8/8 on top which does the renaming
(for x86 and for the generic case), or will you do the re-naming while
merging my patches yourself?
> Personally I'd also do a s/ia32/i32 rename:
Uh, no, please. It makes things align more beautifully, yes, but the
32-bit sub-architecture commonly is referred to as either i386 or ia32...
Thanks,
Dominik
* Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64
2018-04-06 13:07 ` Dominik Brodowski
@ 2018-04-06 17:03 ` Ingo Molnar
0 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2018-04-06 17:03 UTC (permalink / raw)
To: Dominik Brodowski
Cc: linux-kernel, Al Viro, Andi Kleen, Andrew Morton,
Andy Lutomirski, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
Ingo Molnar, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
x86, Peter Zijlstra
* Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> > I.e. I'd generate the names like this:
> >
> > __{x64,x32,ia32}[_compat]_sys_waitid()
> >
> > The fully consistent nomenclature would be something like this:
> >
> > ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
> > ffffffff8105f2b0 t SYS_waitid # 64-bit uaddr args C function
> > ffffffff8105f410 T __x64_sys_waitid # 64-bit-ptregs -> C stub
> > ffffffff8105f430 T __ia32_sys_waitid # 32-bit-ptregs -> C stub
> > ffffffff8105f450 t COMPAT_SYS_waitid # 32-bit uaddr args C function
> > ffffffff8105f5e0 T __ia32_compat_sys_waitid # 32-bit-ptregs -> C stub
> > ffffffff8105f600 T __x32_compat_sys_waitid # 64-bit-ptregs -> C stub
> >
> > Looks a lot tidier and a lot more logical, doesn't it?
>
> Indeed. Want me to prepare a new patch 8/8 on top which does the renaming
> (for x86 and for the generic case), or will you do the re-naming while
> merging my patches yourself?
Please do an 8/8 patch that does the rename - I'll push out the first 7 patches so
they get more testing.
Note, I have not checked the above names for namespace collisions - but
unless we are unlucky it should be fine.
BTW., is there any deep reason why some of these names are capitalized?
I.e. could we use:
ffffffff8105f1e0 t kernel_waitid # common C function (64-bit kargs)
ffffffff8105f2b0 t sys_waitid # 64-bit uaddr args C function
ffffffff8105f410 T __x64_sys_waitid # 64-bit-ptregs -> C stub
ffffffff8105f430 T __ia32_sys_waitid # 32-bit-ptregs -> C stub
ffffffff8105f450 t compat_sys_waitid # 32-bit uaddr args C function
ffffffff8105f5e0 T __ia32_compat_sys_waitid # 32-bit-ptregs -> C stub
ffffffff8105f600 T __x32_compat_sys_waitid # 64-bit-ptregs -> C stub
?
Note how this reduces naming complexity and increases the self-consistency even more.
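
As a rough illustration of the prefix-only scheme listed above, the stub names could be produced by plain preprocessor token pasting. This is a sketch under assumptions: `SKETCH_STUB`, `SKETCH_COMPAT_STUB`, `SKETCH_STR` and the other `sketch_*` names are made up for this example and are not existing kernel macros. Only the resulting symbol names come from the list above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name generators: ABI prefix first, then optional _compat. */
#define SKETCH_STUB(abi, name)		__##abi##_sys_##name
#define SKETCH_COMPAT_STUB(abi, name)	__##abi##_compat_sys_##name

/* Two-level stringification so the pasted name is expanded before '#'. */
#define SKETCH_STR_(x)	#x
#define SKETCH_STR(x)	SKETCH_STR_(x)

/* The four ptregs stub names from the listing, generated rather than typed. */
const char *const sketch_waitid_stubs[] = {
	SKETCH_STR(SKETCH_STUB(x64, waitid)),
	SKETCH_STR(SKETCH_STUB(ia32, waitid)),
	SKETCH_STR(SKETCH_COMPAT_STUB(ia32, waitid)),
	SKETCH_STR(SKETCH_COMPAT_STUB(x32, waitid)),
};

/* Self-check that the generated names match the proposed symbols. */
void sketch_check_names(void)
{
	assert(strcmp(sketch_waitid_stubs[0], "__x64_sys_waitid") == 0);
	assert(strcmp(sketch_waitid_stubs[1], "__ia32_sys_waitid") == 0);
	assert(strcmp(sketch_waitid_stubs[2], "__ia32_compat_sys_waitid") == 0);
	assert(strcmp(sketch_waitid_stubs[3], "__x32_compat_sys_waitid") == 0);
}
```

With every modifier expressed as a prefix, one pattern covers all four stubs, and the unprefixed sys_waitid and compat_sys_waitid remain free for the plain C functions.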
Thanks,
Ingo