All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] x86: Fix pt_regs passed by value
@ 2009-02-10 14:51 Brian Gerst
  2009-02-10 14:51 ` [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available() Brian Gerst
                   ` (3 more replies)
  0 siblings, 4 replies; 36+ messages in thread
From: Brian Gerst @ 2009-02-10 14:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

Several syscalls and one trap handler currently rely on the pt_regs
struct being passed in by value.  This causes problems with stack
protector, and can also cause problems with corruption due to tail-call
optimization.  Change them to use a pointer to the pt_regs struct
instead.

--
Brian Gerst


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available()
  2009-02-10 14:51 [PATCH 0/3] x86: Fix pt_regs passed by value Brian Gerst
@ 2009-02-10 14:51 ` Brian Gerst
  2009-02-11  7:43   ` Tejun Heo
  2009-02-10 14:51 ` [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it Brian Gerst
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: Brian Gerst @ 2009-02-10 14:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

The generic exception handler (error_code) passes in the pt_regs
pointer and the error code (unused in this case).  The commit
"x86: fix math_emu register frame access" changed this to pass by
value, which doesn't work correctly with stack protector enabled.
Change it back to use the pt_regs pointer.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
 arch/x86/include/asm/traps.h |    2 +-
 arch/x86/kernel/traps.c      |    9 +++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index cf3bb05..0d53425 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -41,7 +41,7 @@ dotraplinkage void do_int3(struct pt_regs *, long);
 dotraplinkage void do_overflow(struct pt_regs *, long);
 dotraplinkage void do_bounds(struct pt_regs *, long);
 dotraplinkage void do_invalid_op(struct pt_regs *, long);
-dotraplinkage void do_device_not_available(struct pt_regs);
+dotraplinkage void do_device_not_available(struct pt_regs *, long);
 dotraplinkage void do_coprocessor_segment_overrun(struct pt_regs *, long);
 dotraplinkage void do_invalid_TSS(struct pt_regs *, long);
 dotraplinkage void do_segment_not_present(struct pt_regs *, long);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 3b7b2e1..71a8f87 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -905,19 +905,20 @@ void math_emulate(struct math_emu_info *info)
 }
 #endif /* CONFIG_MATH_EMULATION */
 
-dotraplinkage void __kprobes do_device_not_available(struct pt_regs regs)
+dotraplinkage void __kprobes
+do_device_not_available(struct pt_regs *regs, long error_code)
 {
 #ifdef CONFIG_X86_32
 	if (read_cr0() & X86_CR0_EM) {
 		struct math_emu_info info = { };
 
-		conditional_sti(&regs);
+		conditional_sti(regs);
 
-		info.regs = &regs;
+		info.regs = regs;
 		math_emulate(&info);
 	} else {
 		math_state_restore(); /* interrupts still off */
-		conditional_sti(&regs);
+		conditional_sti(regs);
 	}
 #else
 	math_state_restore();
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-10 14:51 [PATCH 0/3] x86: Fix pt_regs passed by value Brian Gerst
  2009-02-10 14:51 ` [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available() Brian Gerst
@ 2009-02-10 14:51 ` Brian Gerst
  2009-02-11  7:41   ` Tejun Heo
                     ` (2 more replies)
  2009-02-10 14:51 ` [PATCH 3/3] x86: Drop -fno-stack-protector after pt_regs fixes Brian Gerst
  2009-02-11 11:42 ` [PATCH 0/3] x86: Fix pt_regs passed by value Ingo Molnar
  3 siblings, 3 replies; 36+ messages in thread
From: Brian Gerst @ 2009-02-10 14:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

Some syscalls need to access the pt_regs structure, either to copy
user register state or to modifiy it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to regparm(1) to receive the pt_regs pointer as the
first argument.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
 arch/x86/include/asm/linkage.h     |    7 +++++++
 arch/x86/include/asm/syscalls.h    |   25 +++++++++++++++----------
 arch/x86/kernel/entry_32.S         |   20 ++++++++++++++++++++
 arch/x86/kernel/ioport.c           |    4 +---
 arch/x86/kernel/process_32.c       |   35 ++++++++++++++---------------------
 arch/x86/kernel/signal.c           |   35 +++++++----------------------------
 arch/x86/kernel/syscall_table_32.S |   20 ++++++++++----------
 arch/x86/kernel/vm86_32.c          |   15 +++++++--------
 8 files changed, 81 insertions(+), 80 deletions(-)

diff --git a/arch/x86/include/asm/linkage.h b/arch/x86/include/asm/linkage.h
index 5d98d0b..2fd5926 100644
--- a/arch/x86/include/asm/linkage.h
+++ b/arch/x86/include/asm/linkage.h
@@ -18,6 +18,13 @@
 #define asmregparm __attribute__((regparm(3)))
 
 /*
+ * For syscalls that need a pointer to the pt_regs struct (ie. fork).
+ * The regs pointer is passed in %eax as the first argument.  The
+ * remaining function arguments remain on the stack.
+ */
+#define ptregscall __attribute__((regparm(1)))
+
+/*
  * Make sure the compiler doesn't do anything stupid with the
  * arguments on the stack - they are owned by the *caller*, not
  * the callee. This just fools gcc into not spilling into them,
diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index c0b0bda..6172952 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -29,21 +29,26 @@ asmlinkage int sys_get_thread_area(struct user_desc __user *);
 /* X86_32 only */
 #ifdef CONFIG_X86_32
 /* kernel/process_32.c */
-asmlinkage int sys_fork(struct pt_regs);
-asmlinkage int sys_clone(struct pt_regs);
-asmlinkage int sys_vfork(struct pt_regs);
-asmlinkage int sys_execve(struct pt_regs);
+ptregscall int sys_fork(struct pt_regs *);
+ptregscall int sys_clone(struct pt_regs *, unsigned long,
+			 unsigned long, int __user *,
+			 unsigned long, int __user *);
+ptregscall int sys_vfork(struct pt_regs *);
+ptregscall int sys_execve(struct pt_regs *, char __user *,
+			  char __user * __user *,
+			  char __user * __user *);
 
 /* kernel/signal_32.c */
 asmlinkage int sys_sigsuspend(int, int, old_sigset_t);
 asmlinkage int sys_sigaction(int, const struct old_sigaction __user *,
 			     struct old_sigaction __user *);
-asmlinkage int sys_sigaltstack(unsigned long);
-asmlinkage unsigned long sys_sigreturn(unsigned long);
-asmlinkage int sys_rt_sigreturn(unsigned long);
+ptregscall int sys_sigaltstack(struct pt_regs *, const stack_t __user *,
+			       stack_t __user *);
+ptregscall unsigned long sys_sigreturn(struct pt_regs *);
+ptregscall int sys_rt_sigreturn(struct pt_regs *);
 
 /* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned long);
+ptregscall long sys_iopl(struct pt_regs *, unsigned int);
 
 /* kernel/sys_i386_32.c */
 asmlinkage long sys_mmap2(unsigned long, unsigned long, unsigned long,
@@ -59,8 +64,8 @@ struct oldold_utsname;
 asmlinkage int sys_olduname(struct oldold_utsname __user *);
 
 /* kernel/vm86_32.c */
-asmlinkage int sys_vm86old(struct pt_regs);
-asmlinkage int sys_vm86(struct pt_regs);
+ptregscall int sys_vm86old(struct pt_regs *, struct vm86_struct __user *);
+ptregscall int sys_vm86(struct pt_regs *, unsigned long, unsigned long);
 
 #else /* CONFIG_X86_32 */
 
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 5f5bd22..3de7b57 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -697,6 +697,26 @@ syscall_badsys:
 END(syscall_badsys)
 	CFI_ENDPROC
 
+/*
+ * System calls that need a pt_regs pointer.
+ */
+#define PTREGSCALL(name) \
+	ALIGN; \
+ptregs_##name: \
+	leal 4(%esp),%eax; \
+	jmp sys_##name;
+
+PTREGSCALL(iopl)
+PTREGSCALL(fork)
+PTREGSCALL(clone)
+PTREGSCALL(vfork)
+PTREGSCALL(execve)
+PTREGSCALL(sigaltstack)
+PTREGSCALL(sigreturn)
+PTREGSCALL(rt_sigreturn)
+PTREGSCALL(vm86)
+PTREGSCALL(vm86old)
+
 .macro FIXUP_ESPFIX_STACK
 	/* since we are on a wrong stack, we cant make it a C code :( */
 	PER_CPU(gdt_page, %ebx)
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index b12208f..7ec1486 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -131,10 +131,8 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)
 }
 
 #ifdef CONFIG_X86_32
-asmlinkage long sys_iopl(unsigned long regsp)
+ptregscall long sys_iopl(struct pt_regs *regs, unsigned int level)
 {
-	struct pt_regs *regs = (struct pt_regs *)&regsp;
-	unsigned int level = regs->bx;
 	struct thread_struct *t = &current->thread;
 	int rc;
 
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 9a62383..922e3d2 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -593,24 +593,18 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	return prev_p;
 }
 
-asmlinkage int sys_fork(struct pt_regs regs)
+ptregscall int sys_fork(struct pt_regs *regs)
 {
-	return do_fork(SIGCHLD, regs.sp, &regs, 0, NULL, NULL);
+	return do_fork(SIGCHLD, regs->sp, regs, 0, NULL, NULL);
 }
 
-asmlinkage int sys_clone(struct pt_regs regs)
+ptregscall int sys_clone(struct pt_regs *regs, unsigned long clone_flags,
+			 unsigned long newsp, int __user *parent_tidptr,
+			 unsigned long unused, int __user *child_tidptr)
 {
-	unsigned long clone_flags;
-	unsigned long newsp;
-	int __user *parent_tidptr, *child_tidptr;
-
-	clone_flags = regs.bx;
-	newsp = regs.cx;
-	parent_tidptr = (int __user *)regs.dx;
-	child_tidptr = (int __user *)regs.di;
 	if (!newsp)
-		newsp = regs.sp;
-	return do_fork(clone_flags, newsp, &regs, 0, parent_tidptr, child_tidptr);
+		newsp = regs->sp;
+	return do_fork(clone_flags, newsp, regs, 0, parent_tidptr, child_tidptr);
 }
 
 /*
@@ -623,27 +617,26 @@ asmlinkage int sys_clone(struct pt_regs regs)
  * do not have enough call-clobbered registers to hold all
  * the information you need.
  */
-asmlinkage int sys_vfork(struct pt_regs regs)
+ptregscall int sys_vfork(struct pt_regs *regs)
 {
-	return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.sp, &regs, 0, NULL, NULL);
+	return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs->sp, regs, 0, NULL, NULL);
 }
 
 /*
  * sys_execve() executes a new program.
  */
-asmlinkage int sys_execve(struct pt_regs regs)
+ptregscall int sys_execve(struct pt_regs *regs, char __user *u_filename,
+			  char __user * __user *argv,
+			  char __user * __user *envp)
 {
 	int error;
 	char *filename;
 
-	filename = getname((char __user *) regs.bx);
+	filename = getname(u_filename);
 	error = PTR_ERR(filename);
 	if (IS_ERR(filename))
 		goto out;
-	error = do_execve(filename,
-			(char __user * __user *) regs.cx,
-			(char __user * __user *) regs.dx,
-			&regs);
+	error = do_execve(filename, argv, envp, regs);
 	if (error == 0) {
 		/* Make sure we don't return using sysenter.. */
 		set_thread_flag(TIF_IRET);
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 8562387..d7a1583 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -549,39 +549,28 @@ sys_sigaction(int sig, const struct old_sigaction __user *act,
 #endif /* CONFIG_X86_32 */
 
 #ifdef CONFIG_X86_32
-asmlinkage int sys_sigaltstack(unsigned long bx)
-{
-	/*
-	 * This is needed to make gcc realize it doesn't own the
-	 * "struct pt_regs"
-	 */
-	struct pt_regs *regs = (struct pt_regs *)&bx;
-	const stack_t __user *uss = (const stack_t __user *)bx;
-	stack_t __user *uoss = (stack_t __user *)regs->cx;
-
-	return do_sigaltstack(uss, uoss, regs->sp);
-}
+ptregscall int
+sys_sigaltstack(struct pt_regs *regs, const stack_t __user *uss,
+		stack_t __user *uoss)
 #else /* !CONFIG_X86_32 */
 asmlinkage long
 sys_sigaltstack(const stack_t __user *uss, stack_t __user *uoss,
 		struct pt_regs *regs)
+#endif /* CONFIG_X86_32 */
 {
 	return do_sigaltstack(uss, uoss, regs->sp);
 }
-#endif /* CONFIG_X86_32 */
 
 /*
  * Do a signal return; undo the signal stack.
  */
 #ifdef CONFIG_X86_32
-asmlinkage unsigned long sys_sigreturn(unsigned long __unused)
+ptregscall unsigned long sys_sigreturn(struct pt_regs *regs)
 {
 	struct sigframe __user *frame;
-	struct pt_regs *regs;
 	unsigned long ax;
 	sigset_t set;
 
-	regs = (struct pt_regs *) &__unused;
 	frame = (struct sigframe __user *)(regs->sp - 8);
 
 	if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
@@ -640,23 +629,13 @@ badframe:
 }
 
 #ifdef CONFIG_X86_32
-/*
- * Note: do not pass in pt_regs directly as with tail-call optimization
- * GCC will incorrectly stomp on the caller's frame and corrupt user-space
- * register state:
- */
-asmlinkage int sys_rt_sigreturn(unsigned long __unused)
-{
-	struct pt_regs *regs = (struct pt_regs *)&__unused;
-
-	return do_rt_sigreturn(regs);
-}
+ptregscall int sys_rt_sigreturn(struct pt_regs *regs)
 #else /* !CONFIG_X86_32 */
 asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
+#endif /* CONFIG_X86_32 */
 {
 	return do_rt_sigreturn(regs);
 }
-#endif /* CONFIG_X86_32 */
 
 /*
  * OK, we're invoking a handler:
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index e2e86a0..3bdb648 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -1,7 +1,7 @@
 ENTRY(sys_call_table)
 	.long sys_restart_syscall	/* 0 - old "setup()" system call, used for restarting */
 	.long sys_exit
-	.long sys_fork
+	.long ptregs_fork
 	.long sys_read
 	.long sys_write
 	.long sys_open		/* 5 */
@@ -10,7 +10,7 @@ ENTRY(sys_call_table)
 	.long sys_creat
 	.long sys_link
 	.long sys_unlink	/* 10 */
-	.long sys_execve
+	.long ptregs_execve
 	.long sys_chdir
 	.long sys_time
 	.long sys_mknod
@@ -109,17 +109,17 @@ ENTRY(sys_call_table)
 	.long sys_newlstat
 	.long sys_newfstat
 	.long sys_uname
-	.long sys_iopl		/* 110 */
+	.long ptregs_iopl	/* 110 */
 	.long sys_vhangup
 	.long sys_ni_syscall	/* old "idle" system call */
-	.long sys_vm86old
+	.long ptregs_vm86old
 	.long sys_wait4
 	.long sys_swapoff	/* 115 */
 	.long sys_sysinfo
 	.long sys_ipc
 	.long sys_fsync
-	.long sys_sigreturn
-	.long sys_clone		/* 120 */
+	.long ptregs_sigreturn
+	.long ptregs_clone	/* 120 */
 	.long sys_setdomainname
 	.long sys_newuname
 	.long sys_modify_ldt
@@ -165,14 +165,14 @@ ENTRY(sys_call_table)
 	.long sys_mremap
 	.long sys_setresuid16
 	.long sys_getresuid16	/* 165 */
-	.long sys_vm86
+	.long ptregs_vm86
 	.long sys_ni_syscall	/* Old sys_query_module */
 	.long sys_poll
 	.long sys_nfsservctl
 	.long sys_setresgid16	/* 170 */
 	.long sys_getresgid16
 	.long sys_prctl
-	.long sys_rt_sigreturn
+	.long ptregs_rt_sigreturn
 	.long sys_rt_sigaction
 	.long sys_rt_sigprocmask	/* 175 */
 	.long sys_rt_sigpending
@@ -185,11 +185,11 @@ ENTRY(sys_call_table)
 	.long sys_getcwd
 	.long sys_capget
 	.long sys_capset	/* 185 */
-	.long sys_sigaltstack
+	.long ptregs_sigaltstack
 	.long sys_sendfile
 	.long sys_ni_syscall	/* reserved for streams1 */
 	.long sys_ni_syscall	/* reserved for streams2 */
-	.long sys_vfork		/* 190 */
+	.long ptregs_vfork	/* 190 */
 	.long sys_getrlimit
 	.long sys_mmap2
 	.long sys_truncate64
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 55ea30d..8fa6ba7 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -197,9 +197,8 @@ out:
 static int do_vm86_irq_handling(int subfunction, int irqnumber);
 static void do_sys_vm86(struct kernel_vm86_struct *info, struct task_struct *tsk);
 
-asmlinkage int sys_vm86old(struct pt_regs regs)
+ptregscall int sys_vm86old(struct pt_regs *regs, struct vm86_struct __user *v86)
 {
-	struct vm86_struct __user *v86 = (struct vm86_struct __user *)regs.bx;
 	struct kernel_vm86_struct info; /* declare this _on top_,
 					 * this avoids wasting of stack space.
 					 * This remains on the stack until we
@@ -218,7 +217,7 @@ asmlinkage int sys_vm86old(struct pt_regs regs)
 	if (tmp)
 		goto out;
 	memset(&info.vm86plus, 0, (int)&info.regs32 - (int)&info.vm86plus);
-	info.regs32 = &regs;
+	info.regs32 = regs;
 	tsk->thread.vm86_info = v86;
 	do_sys_vm86(&info, tsk);
 	ret = 0;	/* we never return here */
@@ -227,7 +226,7 @@ out:
 }
 
 
-asmlinkage int sys_vm86(struct pt_regs regs)
+ptregscall int sys_vm86(struct pt_regs *regs, unsigned long cmd, unsigned long arg)
 {
 	struct kernel_vm86_struct info; /* declare this _on top_,
 					 * this avoids wasting of stack space.
@@ -239,12 +238,12 @@ asmlinkage int sys_vm86(struct pt_regs regs)
 	struct vm86plus_struct __user *v86;
 
 	tsk = current;
-	switch (regs.bx) {
+	switch (cmd) {
 	case VM86_REQUEST_IRQ:
 	case VM86_FREE_IRQ:
 	case VM86_GET_IRQ_BITS:
 	case VM86_GET_AND_RESET_IRQ:
-		ret = do_vm86_irq_handling(regs.bx, (int)regs.cx);
+		ret = do_vm86_irq_handling(cmd, (int)arg);
 		goto out;
 	case VM86_PLUS_INSTALL_CHECK:
 		/*
@@ -261,14 +260,14 @@ asmlinkage int sys_vm86(struct pt_regs regs)
 	ret = -EPERM;
 	if (tsk->thread.saved_sp0)
 		goto out;
-	v86 = (struct vm86plus_struct __user *)regs.cx;
+	v86 = (struct vm86plus_struct __user *)arg;
 	tmp = copy_vm86_regs_from_user(&info.regs, &v86->regs,
 				       offsetof(struct kernel_vm86_struct, regs32) -
 				       sizeof(info.regs));
 	ret = -EFAULT;
 	if (tmp)
 		goto out;
-	info.regs32 = &regs;
+	info.regs32 = regs;
 	info.vm86plus.is_vm86pus = 1;
 	tsk->thread.vm86_info = (struct vm86_struct __user *)v86;
 	do_sys_vm86(&info, tsk);
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 3/3] x86: Drop -fno-stack-protector after pt_regs fixes
  2009-02-10 14:51 [PATCH 0/3] x86: Fix pt_regs passed by value Brian Gerst
  2009-02-10 14:51 ` [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available() Brian Gerst
  2009-02-10 14:51 ` [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it Brian Gerst
@ 2009-02-10 14:51 ` Brian Gerst
  2009-02-11 11:42 ` [PATCH 0/3] x86: Fix pt_regs passed by value Ingo Molnar
  3 siblings, 0 replies; 36+ messages in thread
From: Brian Gerst @ 2009-02-10 14:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

Now that no functions rely on struct pt_regs being passed by value,
stack protector can be enabled.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
 arch/x86/kernel/Makefile |   18 ------------------
 1 files changed, 0 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b1f8be3..37fa30b 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -24,24 +24,6 @@ CFLAGS_vsyscall_64.o	:= $(PROFILING) -g0 $(nostackp)
 CFLAGS_hpet.o		:= $(nostackp)
 CFLAGS_tsc.o		:= $(nostackp)
 CFLAGS_paravirt.o	:= $(nostackp)
-#
-# On x86_32, register frame is passed verbatim on stack as struct
-# pt_regs.  gcc considers the parameter to belong to the callee and
-# with -fstack-protector it copies pt_regs to the callee's stack frame
-# to put the structure after the stack canary causing changes made by
-# the exception handlers to be lost.  Turn off stack protector for all
-# files containing functions which take struct pt_regs from register
-# frame.
-#
-# The proper way to fix this is to teach gcc that the argument belongs
-# to the caller for these functions, oh well...
-#
-ifdef CONFIG_X86_32
-CFLAGS_process_32.o	:= $(nostackp)
-CFLAGS_vm86_32.o	:= $(nostackp)
-CFLAGS_signal.o		:= $(nostackp)
-CFLAGS_traps.o		:= $(nostackp)
-endif
 
 obj-y			:= process_$(BITS).o signal.o entry_$(BITS).o
 obj-y			+= traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-10 14:51 ` [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it Brian Gerst
@ 2009-02-11  7:41   ` Tejun Heo
  2009-02-11 10:18     ` Ingo Molnar
  2009-02-11 14:31     ` Brian Gerst
  2009-02-11 17:52   ` H. Peter Anvin
  2009-02-11 21:43   ` [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2) Brian Gerst
  2 siblings, 2 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-11  7:41 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Hello, Brian.

Brian Gerst wrote:
> Some syscalls need to access the pt_regs structure, either to copy
> user register state or to modifiy it.  This patch adds stubs to load
> the address of the pt_regs struct into the %eax register, and changes
> the syscalls to regparm(1) to receive the pt_regs pointer as the
> first argument.

Heh... neat.  Just one question.

> -asmlinkage long sys_iopl(unsigned long regsp)
> +ptregscall long sys_iopl(struct pt_regs *regs, unsigned int level)
>  {
> -	struct pt_regs *regs = (struct pt_regs *)&regsp;
> -	unsigned int level = regs->bx;

Here and at other places where the function takes more than one
arguments, wouldn't it be better to just take *regs and use other
parameters from regs?  That way we won't have to worry about gcc
corrupting register frame at all and I think it's cleaner that way.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available()
  2009-02-10 14:51 ` [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available() Brian Gerst
@ 2009-02-11  7:43   ` Tejun Heo
  2009-02-11 10:13     ` Ingo Molnar
  2009-02-11 14:34     ` Brian Gerst
  0 siblings, 2 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-11  7:43 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Hello, Brian.

Brian Gerst wrote:
> The generic exception handler (error_code) passes in the pt_regs
> pointer and the error code (unused in this case).  The commit
> "x86: fix math_emu register frame access" changed this to pass by
> value, which doesn't work correctly with stack protector enabled.
> Change it back to use the pt_regs pointer.
> 
> Signed-off-by: Brian Gerst <brgerst@gmail.com>
> ---
>  arch/x86/include/asm/traps.h |    2 +-
>  arch/x86/kernel/traps.c      |    9 +++++----
>  2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
> index cf3bb05..0d53425 100644
> --- a/arch/x86/include/asm/traps.h
> +++ b/arch/x86/include/asm/traps.h
> @@ -41,7 +41,7 @@ dotraplinkage void do_int3(struct pt_regs *, long);
>  dotraplinkage void do_overflow(struct pt_regs *, long);
>  dotraplinkage void do_bounds(struct pt_regs *, long);
>  dotraplinkage void do_invalid_op(struct pt_regs *, long);
> -dotraplinkage void do_device_not_available(struct pt_regs);
> +dotraplinkage void do_device_not_available(struct pt_regs *, long);
>  dotraplinkage void do_coprocessor_segment_overrun(struct pt_regs *, long);
>  dotraplinkage void do_invalid_TSS(struct pt_regs *, long);
>  dotraplinkage void do_segment_not_present(struct pt_regs *, long);
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 3b7b2e1..71a8f87 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -905,19 +905,20 @@ void math_emulate(struct math_emu_info *info)
>  }
>  #endif /* CONFIG_MATH_EMULATION */
>  
> -dotraplinkage void __kprobes do_device_not_available(struct pt_regs regs)
> +dotraplinkage void __kprobes
> +do_device_not_available(struct pt_regs *regs, long error_code)

What do you think about just taking pt_regs and accessing
regs->orig_eax instead of having separate call convention for pt_regs
ones and trap ones?  Too much work without enough benefit?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available()
  2009-02-11  7:43   ` Tejun Heo
@ 2009-02-11 10:13     ` Ingo Molnar
  2009-02-11 14:34     ` Brian Gerst
  1 sibling, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2009-02-11 10:13 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Brian Gerst, linux-kernel, H. Peter Anvin


* Tejun Heo <tj@kernel.org> wrote:

> Hello, Brian.
> 
> Brian Gerst wrote:
> > The generic exception handler (error_code) passes in the pt_regs
> > pointer and the error code (unused in this case).  The commit
> > "x86: fix math_emu register frame access" changed this to pass by
> > value, which doesn't work correctly with stack protector enabled.
> > Change it back to use the pt_regs pointer.
> > 
> > Signed-off-by: Brian Gerst <brgerst@gmail.com>
> > ---
> >  arch/x86/include/asm/traps.h |    2 +-
> >  arch/x86/kernel/traps.c      |    9 +++++----
> >  2 files changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
> > index cf3bb05..0d53425 100644
> > --- a/arch/x86/include/asm/traps.h
> > +++ b/arch/x86/include/asm/traps.h
> > @@ -41,7 +41,7 @@ dotraplinkage void do_int3(struct pt_regs *, long);
> >  dotraplinkage void do_overflow(struct pt_regs *, long);
> >  dotraplinkage void do_bounds(struct pt_regs *, long);
> >  dotraplinkage void do_invalid_op(struct pt_regs *, long);
> > -dotraplinkage void do_device_not_available(struct pt_regs);
> > +dotraplinkage void do_device_not_available(struct pt_regs *, long);
> >  dotraplinkage void do_coprocessor_segment_overrun(struct pt_regs *, long);
> >  dotraplinkage void do_invalid_TSS(struct pt_regs *, long);
> >  dotraplinkage void do_segment_not_present(struct pt_regs *, long);
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index 3b7b2e1..71a8f87 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -905,19 +905,20 @@ void math_emulate(struct math_emu_info *info)
> >  }
> >  #endif /* CONFIG_MATH_EMULATION */
> >  
> > -dotraplinkage void __kprobes do_device_not_available(struct pt_regs regs)
> > +dotraplinkage void __kprobes
> > +do_device_not_available(struct pt_regs *regs, long error_code)
> 
> What do you think about just taking pt_regs and accessing
> regs->orig_eax instead of having separate call convention for pt_regs
> ones and trap ones?  Too much work without enough benefit?

Looks worthwile to me. [ Cleanups rarely have clear benefits of their own,
it's the sheer mass of them that makes a difference in the end. Like the
many snowflakes that can bend a tree ;-) ]

There's one small namespace complication here: it's pt_regs->orig_eax on
32-bit and pt_regs->orig_rax on 64-bit.

So i'd suggest the introduction of an anonymous union "error_code" field
which overlays orig_eax and that can be used just fine from unified code too.
That would also be more readable.

Or we could rename orig_eax/orig_rax to error_code. (it might be a bit
confusing in the syscall entry context though.)

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11  7:41   ` Tejun Heo
@ 2009-02-11 10:18     ` Ingo Molnar
  2009-02-11 14:14       ` Tejun Heo
  2009-02-11 14:31     ` Brian Gerst
  1 sibling, 1 reply; 36+ messages in thread
From: Ingo Molnar @ 2009-02-11 10:18 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Brian Gerst, linux-kernel


* Tejun Heo <tj@kernel.org> wrote:

> Hello, Brian.
> 
> Brian Gerst wrote:
> > Some syscalls need to access the pt_regs structure, either to copy
> > user register state or to modifiy it.  This patch adds stubs to load
> > the address of the pt_regs struct into the %eax register, and changes
> > the syscalls to regparm(1) to receive the pt_regs pointer as the
> > first argument.
> 
> Heh... neat.  Just one question.
> 
> > -asmlinkage long sys_iopl(unsigned long regsp)
> > +ptregscall long sys_iopl(struct pt_regs *regs, unsigned int level)
> >  {
> > -	struct pt_regs *regs = (struct pt_regs *)&regsp;
> > -	unsigned int level = regs->bx;
> 
> Here and at other places where the function takes more than one
> arguments, wouldn't it be better to just take *regs and use other
> parameters from regs?  That way we won't have to worry about gcc
> corrupting register frame at all and I think it's cleaner that way.

Hm, gcc cannot corrupt register arguments only on-stack arguments - but
your suggestion nevertheless makes sense as an optimization. I'd suggest
this to be done as a separate patch though, both for regression analysis
reasons (easier to bisect - the patch is large enough already) and from
a size/performance analysis POV. (so we can see the benefits in isolation)

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/3] x86: Fix pt_regs passed by value
  2009-02-10 14:51 [PATCH 0/3] x86: Fix pt_regs passed by value Brian Gerst
                   ` (2 preceding siblings ...)
  2009-02-10 14:51 ` [PATCH 3/3] x86: Drop -fno-stack-protector after pt_regs fixes Brian Gerst
@ 2009-02-11 11:42 ` Ingo Molnar
  2009-02-11 14:15   ` Tejun Heo
  3 siblings, 1 reply; 36+ messages in thread
From: Ingo Molnar @ 2009-02-11 11:42 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Tejun Heo, linux-kernel, H. Peter Anvin, Thomas Gleixner


* Brian Gerst <brgerst@gmail.com> wrote:

> Several syscalls and one trap handler currently rely on the pt_regs
> struct being passed in by value.  This causes problems with stack
> protector, and can also cause problems with corruption due to tail-call
> optimization.  Change them to use a pointer to the pt_regs struct
> instead.

looks good! Applied them to tip:core/percpu, thanks Brian.

Tejun, i have added your Acked-by lines. (let me know if that's
incorrect). The review feedback you gave is correct and can/should
be addressed in separate patches.

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 10:18     ` Ingo Molnar
@ 2009-02-11 14:14       ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 14:14 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Brian Gerst, linux-kernel

Ingo Molnar wrote:
> * Tejun Heo <tj@kernel.org> wrote:
> 
>> Hello, Brian.
>>
>> Brian Gerst wrote:
>>> Some syscalls need to access the pt_regs structure, either to copy
>>> user register state or to modifiy it.  This patch adds stubs to load
>>> the address of the pt_regs struct into the %eax register, and changes
>>> the syscalls to regparm(1) to receive the pt_regs pointer as the
>>> first argument.
>> Heh... neat.  Just one question.
>>
>>> -asmlinkage long sys_iopl(unsigned long regsp)
>>> +ptregscall long sys_iopl(struct pt_regs *regs, unsigned int level)
>>>  {
>>> -	struct pt_regs *regs = (struct pt_regs *)&regsp;
>>> -	unsigned int level = regs->bx;
>> Here and at other places where the function takes more than one
>> arguments, wouldn't it be better to just take *regs and use other
>> parameters from regs?  That way we won't have to worry about gcc
>> corrupting register frame at all and I think it's cleaner that way.
> 
> Hm, gcc cannot corrupt register arguments only on-stack arguments - but
> your suggestion nevertheless makes sense as an optimization. I'd suggest
> this to be done as a separate patch though, both for regression analysis
> reasons (easier to bisect - the patch is large enough already) and from
> a size/performance analysis POV. (so we can see the benefits in isolation)

ptregscall is regparm(1) so arguments are on-stack from the second
one, so the callee can corrupt them.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/3] x86: Fix pt_regs passed by value
  2009-02-11 11:42 ` [PATCH 0/3] x86: Fix pt_regs passed by value Ingo Molnar
@ 2009-02-11 14:15   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 14:15 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Brian Gerst, linux-kernel, H. Peter Anvin, Thomas Gleixner

Ingo Molnar wrote:
> * Brian Gerst <brgerst@gmail.com> wrote:
> 
>> Several syscalls and one trap handler currently rely on the pt_regs
>> struct being passed in by value.  This causes problems with stack
>> protector, and can also cause problems with corruption due to tail-call
>> optimization.  Change them to use a pointer to the pt_regs struct
>> instead.
> 
> looks good! Applied them to tip:core/percpu, thanks Brian.
> 
> Tejun, i have added your Acked-by lines. (let me know if that's
> incorrect). The review feedback you gave is correct and can/should
> be addressed in separate patches.

Yeah, sure.  I like the direction.  Brian, are you interested in
pursuing further changes?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need  it
  2009-02-11  7:41   ` Tejun Heo
  2009-02-11 10:18     ` Ingo Molnar
@ 2009-02-11 14:31     ` Brian Gerst
  2009-02-11 14:41       ` Tejun Heo
  2009-02-11 15:01       ` Ingo Molnar
  1 sibling, 2 replies; 36+ messages in thread
From: Brian Gerst @ 2009-02-11 14:31 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

On Wed, Feb 11, 2009 at 2:41 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Brian.
>
> Brian Gerst wrote:
>> Some syscalls need to access the pt_regs structure, either to copy
>> user register state or to modifiy it.  This patch adds stubs to load
>> the address of the pt_regs struct into the %eax register, and changes
>> the syscalls to regparm(1) to receive the pt_regs pointer as the
>> first argument.
>
> Heh... neat.  Just one question.
>
>> -asmlinkage long sys_iopl(unsigned long regsp)
>> +ptregscall long sys_iopl(struct pt_regs *regs, unsigned int level)
>>  {
>> -     struct pt_regs *regs = (struct pt_regs *)&regsp;
>> -     unsigned int level = regs->bx;
>
> Here and at other places where the function takes more than one
> arguments, wouldn't it be better to just take *regs and use other
> parameters from regs?  That way we won't have to worry about gcc
> corrupting register frame at all and I think it's cleaner that way.

Expanding the parameters is good documentation.  If there is a risk of
tail-call optimization causing the register corruption, then
asmlinkage_protect() should be used.  The problem isn't limited to
just the syscalls that take pt_regs.  It's just getting the args out
of the pt_regs struct was an easy hack to get around it.  I checked
the disassembly of these functions and didn't see this happen on gcc
4.3.0.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available()
  2009-02-11  7:43   ` Tejun Heo
  2009-02-11 10:13     ` Ingo Molnar
@ 2009-02-11 14:34     ` Brian Gerst
  2009-02-11 14:42       ` Tejun Heo
  1 sibling, 1 reply; 36+ messages in thread
From: Brian Gerst @ 2009-02-11 14:34 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

On Wed, Feb 11, 2009 at 2:43 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Brian.
>
> Brian Gerst wrote:
>> The generic exception handler (error_code) passes in the pt_regs
>> pointer and the error code (unused in this case).  The commit
>> "x86: fix math_emu register frame access" changed this to pass by
>> value, which doesn't work correctly with stack protector enabled.
>> Change it back to use the pt_regs pointer.
>>
>> Signed-off-by: Brian Gerst <brgerst@gmail.com>
>> ---
>>  arch/x86/include/asm/traps.h |    2 +-
>>  arch/x86/kernel/traps.c      |    9 +++++----
>>  2 files changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
>> index cf3bb05..0d53425 100644
>> --- a/arch/x86/include/asm/traps.h
>> +++ b/arch/x86/include/asm/traps.h
>> @@ -41,7 +41,7 @@ dotraplinkage void do_int3(struct pt_regs *, long);
>>  dotraplinkage void do_overflow(struct pt_regs *, long);
>>  dotraplinkage void do_bounds(struct pt_regs *, long);
>>  dotraplinkage void do_invalid_op(struct pt_regs *, long);
>> -dotraplinkage void do_device_not_available(struct pt_regs);
>> +dotraplinkage void do_device_not_available(struct pt_regs *, long);
>>  dotraplinkage void do_coprocessor_segment_overrun(struct pt_regs *, long);
>>  dotraplinkage void do_invalid_TSS(struct pt_regs *, long);
>>  dotraplinkage void do_segment_not_present(struct pt_regs *, long);
>> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
>> index 3b7b2e1..71a8f87 100644
>> --- a/arch/x86/kernel/traps.c
>> +++ b/arch/x86/kernel/traps.c
>> @@ -905,19 +905,20 @@ void math_emulate(struct math_emu_info *info)
>>  }
>>  #endif /* CONFIG_MATH_EMULATION */
>>
>> -dotraplinkage void __kprobes do_device_not_available(struct pt_regs regs)
>> +dotraplinkage void __kprobes
>> +do_device_not_available(struct pt_regs *regs, long error_code)
>
> What do you think about just taking pt_regs and accessing
> regs->orig_eax instead of having separate call convention for pt_regs
> ones and trap ones?  Too much work without enough benefit?

I don't quite follow what you are trying to say here.  Are you saying
use the same calling convention for the exception handlers (anything
called from error_code) and system calls?

--
Brian Gerst

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 14:31     ` Brian Gerst
@ 2009-02-11 14:41       ` Tejun Heo
  2009-02-11 14:43         ` Tejun Heo
  2009-02-11 14:48         ` Tejun Heo
  2009-02-11 15:01       ` Ingo Molnar
  1 sibling, 2 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 14:41 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Hello, Brian.

Brian Gerst wrote:
>> Here and at other places where the function takes more than one
>> arguments, wouldn't it be better to just take *regs and use other
>> parameters from regs?  That way we won't have to worry about gcc
>> corrupting register frame at all and I think it's cleaner that way.
> 
> Expanding the parameters is good documentation.

Copying from ptregs to appropriately named local variable would
provide at least similar level of documentation but I don't think this
is a big deal one way or the other.

> If there is a risk of tail-call optimization causing the register
> corruption, then asmlinkage_protect() should be used.  The problem
> isn't limited to just the syscalls that take pt_regs.  It's just
> getting the args out of the pt_regs struct was an easy hack to get
> around it.

If pt_regs is being passed with regparm(1) and no other parameter is
specified, it's a proper solution as we can guarantee that callee
can't corrupt (or discard changes to) the register frame no matter
what gcc does.

> I checked the disassembly of these functions and didn't see this
> happen on gcc 4.3.0.

Well, tracking down why run_init_process() is returning 0 with
-fstack-protector wasn't much of fun.  These breakages are very subtle
and if we're gonna pass in pointer to pt_regs anyway and thus can
guarantee such breakage can't happen at no additional cost, I think we
should do that even if it means slightly more argument fetching in a
few places.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available()
  2009-02-11 14:34     ` Brian Gerst
@ 2009-02-11 14:42       ` Tejun Heo
  2009-02-11 14:46         ` Brian Gerst
  0 siblings, 1 reply; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 14:42 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Brian Gerst wrote:
> I don't quite follow what you are trying to say here.  Are you saying
> use the same calling convention for the exception handlers (anything
> called from error_code) and system calls?

Yeap.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 14:41       ` Tejun Heo
@ 2009-02-11 14:43         ` Tejun Heo
  2009-02-11 14:48         ` Tejun Heo
  1 sibling, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 14:43 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Eh... I missed an adjective.

Tejun Heo wrote:
> Well, tracking down why run_init_process() is returning 0 with
> -fstack-protector wasn't much of fun.  These breakages are very subtle
> and if we're gonna pass in pointer to pt_regs anyway and thus can
> guarantee such breakage can't happen at no additional cost, I think we
> should do that even if it means slightly more argument fetching in a
                                               ^
                                             verbose
> few places.


-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available()
  2009-02-11 14:42       ` Tejun Heo
@ 2009-02-11 14:46         ` Brian Gerst
  2009-02-11 14:53           ` Tejun Heo
  0 siblings, 1 reply; 36+ messages in thread
From: Brian Gerst @ 2009-02-11 14:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

On Wed, Feb 11, 2009 at 9:42 AM, Tejun Heo <tj@kernel.org> wrote:
> Brian Gerst wrote:
>> I don't quite follow what you are trying to say here.  Are you saying
>> use the same calling convention for the exception handlers (anything
>> called from error_code) and system calls?
>
> Yeap.

Doesn't make sense to me, two very fundamentally different things.  I
supposed you could eliminate the error_code parameter, but that's alot
of work to remove just one instruction.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 14:41       ` Tejun Heo
  2009-02-11 14:43         ` Tejun Heo
@ 2009-02-11 14:48         ` Tejun Heo
  2009-02-11 14:58           ` Ingo Molnar
  2009-02-11 14:59           ` Brian Gerst
  1 sibling, 2 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 14:48 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Tejun Heo wrote:
>> I checked the disassembly of these functions and didn't see this
>> happen on gcc 4.3.0.
> 
> Well, tracking down why run_init_process() is returning 0 with
> -fstack-protector wasn't much of fun.  These breakages are very subtle
> and if we're gonna pass in pointer to pt_regs anyway and thus can
> guarantee such breakage can't happen at no additional cost, I think we
> should do that even if it means slightly more argument fetching in a
> few places.

In addition, if we do that, we can remove the horrible
asmlinkage_protect() thing altogether.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available()
  2009-02-11 14:46         ` Brian Gerst
@ 2009-02-11 14:53           ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 14:53 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Brian Gerst wrote:
> On Wed, Feb 11, 2009 at 9:42 AM, Tejun Heo <tj@kernel.org> wrote:
>> Brian Gerst wrote:
>>> I don't quite follow what you are trying to say here.  Are you saying
>>> use the same calling convention for the exception handlers (anything
>>> called from error_code) and system calls?
>> Yeap.
> 
> Doesn't make sense to me, two very fundamentally different things.  I
> supposed you could eliminate the error_code parameter, but that's alot
> of work to remove just one instruction.

I don't think they're fundamentally different.  Well, I guess it all
depends on how you look at it.  Those are just functions called after
kernel entry with register stack frame as argument.  Some might have
more fields others might not.

I don't think having two conventions is too bad, so whether to do this
or not is basically a cleanup decision.  Eh... if you feel like doing
it, please go ahead.  If not, well, maybe somebody will do it someday
(or not).  :-)

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 14:48         ` Tejun Heo
@ 2009-02-11 14:58           ` Ingo Molnar
  2009-02-11 14:59           ` Brian Gerst
  1 sibling, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2009-02-11 14:58 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Brian Gerst, linux-kernel


* Tejun Heo <htejun@gmail.com> wrote:

> Tejun Heo wrote:
> >> I checked the disassembly of these functions and didn't see this
> >> happen on gcc 4.3.0.
> > 
> > Well, tracking down why run_init_process() is returning 0 with
> > -fstack-protector wasn't much of fun.  These breakages are very subtle
> > and if we're gonna pass in pointer to pt_regs anyway and thus can
> > guarantee such breakage can't happen at no additional cost, I think we
> > should do that even if it means slightly more argument fetching in a
> > few places.
> 
> In addition, if we do that, we can remove the horrible
> asmlinkage_protect() thing altogether.

Agreed.

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need  it
  2009-02-11 14:48         ` Tejun Heo
  2009-02-11 14:58           ` Ingo Molnar
@ 2009-02-11 14:59           ` Brian Gerst
  2009-02-11 15:05             ` Tejun Heo
  1 sibling, 1 reply; 36+ messages in thread
From: Brian Gerst @ 2009-02-11 14:59 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

On Wed, Feb 11, 2009 at 9:48 AM, Tejun Heo <htejun@gmail.com> wrote:
> Tejun Heo wrote:
>>> I checked the disassembly of these functions and didn't see this
>>> happen on gcc 4.3.0.
>>
>> Well, tracking down why run_init_process() is returning 0 with
>> -fstack-protector wasn't much of fun.  These breakages are very subtle
>> and if we're gonna pass in pointer to pt_regs anyway and thus can
>> guarantee such breakage can't happen at no additional cost, I think we
>> should do that even if it means slightly more argument fetching in a
>> few places.
>
> In addition, if we do that, we can remove the horrible
> asmlinkage_protect() thing altogether.

Like I said before, the tail-call optimization problem isn't limited
to just this set of syscalls.  There are only two real ways to fix it.
 1) Set up a real stack frame for the syscalls instead of overalying
pt_regs, or 2) patch gcc to tell it not to touch the args area of the
stack.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 14:31     ` Brian Gerst
  2009-02-11 14:41       ` Tejun Heo
@ 2009-02-11 15:01       ` Ingo Molnar
  1 sibling, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2009-02-11 15:01 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Tejun Heo, linux-kernel


* Brian Gerst <brgerst@gmail.com> wrote:

> On Wed, Feb 11, 2009 at 2:41 AM, Tejun Heo <tj@kernel.org> wrote:
> > Hello, Brian.
> >
> > Brian Gerst wrote:
> >> Some syscalls need to access the pt_regs structure, either to copy
> >> user register state or to modifiy it.  This patch adds stubs to load
> >> the address of the pt_regs struct into the %eax register, and changes
> >> the syscalls to regparm(1) to receive the pt_regs pointer as the
> >> first argument.
> >
> > Heh... neat.  Just one question.
> >
> >> -asmlinkage long sys_iopl(unsigned long regsp)
> >> +ptregscall long sys_iopl(struct pt_regs *regs, unsigned int level)
> >>  {
> >> -     struct pt_regs *regs = (struct pt_regs *)&regsp;
> >> -     unsigned int level = regs->bx;
> >
> > Here and at other places where the function takes more than one
> > arguments, wouldn't it be better to just take *regs and use other
> > parameters from regs?  That way we won't have to worry about gcc
> > corrupting register frame at all and I think it's cleaner that way.
> 
> Expanding the parameters is good documentation.  [...]

Well, that way we shuffle the parameter expansion into assembly code,
instead of creating it as a local variable in the C function.

The latter sure looks better documented, and less error-prone as
well, right? The compiler might also be able to optimize it some.
(and we save one instruction in any case)

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 14:59           ` Brian Gerst
@ 2009-02-11 15:05             ` Tejun Heo
  2009-02-11 15:10               ` Brian Gerst
  0 siblings, 1 reply; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 15:05 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Brian Gerst wrote:
> On Wed, Feb 11, 2009 at 9:48 AM, Tejun Heo <htejun@gmail.com> wrote:
>> Tejun Heo wrote:
>>>> I checked the disassembly of these functions and didn't see this
>>>> happen on gcc 4.3.0.
>>> Well, tracking down why run_init_process() is returning 0 with
>>> -fstack-protector wasn't much of fun.  These breakages are very subtle
>>> and if we're gonna pass in pointer to pt_regs anyway and thus can
>>> guarantee such breakage can't happen at no additional cost, I think we
>>> should do that even if it means slightly more argument fetching in a
>>> few places.
>> In addition, if we do that, we can remove the horrible
>> asmlinkage_protect() thing altogether.
> 
> Like I said before, the tail-call optimization problem isn't limited
> to just this set of syscalls.  There are only two real ways to fix it.
> 1) Set up a real stack frame for the syscalls instead of overalying
> pt_regs, or 2) patch gcc to tell it not to touch the args area of the
> stack.

Right, I forgot about the generic ones.  We can pass pointer to
pt_regs to all of them like x86_64 does but yeah we're likely to lose
more than we gain by doing that.  :-(

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need  it
  2009-02-11 15:05             ` Tejun Heo
@ 2009-02-11 15:10               ` Brian Gerst
  2009-02-11 15:14                 ` Tejun Heo
  0 siblings, 1 reply; 36+ messages in thread
From: Brian Gerst @ 2009-02-11 15:10 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel

On Wed, Feb 11, 2009 at 10:05 AM, Tejun Heo <htejun@gmail.com> wrote:
> Brian Gerst wrote:
>> On Wed, Feb 11, 2009 at 9:48 AM, Tejun Heo <htejun@gmail.com> wrote:
>>> Tejun Heo wrote:
>>>>> I checked the disassembly of these functions and didn't see this
>>>>> happen on gcc 4.3.0.
>>>> Well, tracking down why run_init_process() is returning 0 with
>>>> -fstack-protector wasn't much of fun.  These breakages are very subtle
>>>> and if we're gonna pass in pointer to pt_regs anyway and thus can
>>>> guarantee such breakage can't happen at no additional cost, I think we
>>>> should do that even if it means slightly more argument fetching in a
>>>> few places.
>>> In addition, if we do that, we can remove the horrible
>>> asmlinkage_protect() thing altogether.
>>
>> Like I said before, the tail-call optimization problem isn't limited
>> to just this set of syscalls.  There are only two real ways to fix it.
>> 1) Set up a real stack frame for the syscalls instead of overalying
>> pt_regs, or 2) patch gcc to tell it not to touch the args area of the
>> stack.
>
> Right, I forgot about the generic ones.  We can pass pointer to
> pt_regs to all of them like x86_64 does but yeah we're likely to lose
> more than we gain by doing that.  :-(

x86-64 doesn't have the tail-call problem because it doesn't use the
pt_regs on stack trick for syscall args.  All the args are passed in
registers.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 15:10               ` Brian Gerst
@ 2009-02-11 15:14                 ` Tejun Heo
  2009-02-11 15:59                   ` Ingo Molnar
  0 siblings, 1 reply; 36+ messages in thread
From: Tejun Heo @ 2009-02-11 15:14 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Ingo Molnar, linux-kernel

Brian Gerst wrote:
> x86-64 doesn't have the tail-call problem because it doesn't use the
> pt_regs on stack trick for syscall args.  All the args are passed in
> registers.

Yeah, I was saying that we can do about the same thing on x86_32 by
passing in pointer to pt_regs and defining proper syscall wrappers.
It will cost a bit of performance by increasing register pressure tho.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 15:14                 ` Tejun Heo
@ 2009-02-11 15:59                   ` Ingo Molnar
  2009-02-12  1:12                     ` Tejun Heo
  0 siblings, 1 reply; 36+ messages in thread
From: Ingo Molnar @ 2009-02-11 15:59 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Brian Gerst, linux-kernel


* Tejun Heo <htejun@gmail.com> wrote:

> Brian Gerst wrote:
> > x86-64 doesn't have the tail-call problem because it doesn't use the
> > pt_regs on stack trick for syscall args.  All the args are passed in
> > registers.
> 
> Yeah, I was saying that we can do about the same thing on x86_32 by
> passing in pointer to pt_regs and defining proper syscall wrappers.
> It will cost a bit of performance by increasing register pressure tho.

Do you mean converting:

ptregscall int sys_execve(struct pt_regs *regs, char __user *u_filename,
                          char __user * __user *argv,
                          char __user * __user *envp)

to:

ptregscall int sys_execve(struct pt_regs *regs)
{
	char __user *u_filename		= syscall_arg1(regs);
	char __user * __user *argv	= syscall_arg2(regs);
	char __user * __user *envp	= syscall_arg3(regs);

etc.?

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-10 14:51 ` [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it Brian Gerst
  2009-02-11  7:41   ` Tejun Heo
@ 2009-02-11 17:52   ` H. Peter Anvin
  2009-02-11 18:27     ` Brian Gerst
  2009-02-11 21:43   ` [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2) Brian Gerst
  2 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2009-02-11 17:52 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Tejun Heo, Ingo Molnar, linux-kernel

Brian Gerst wrote:
>  /*
> + * For syscalls that need a pointer to the pt_regs struct (ie. fork).
> + * The regs pointer is passed in %eax as the first argument.  The
> + * remaining function arguments remain on the stack.
> + */
> +#define ptregscall __attribute__((regparm(1)))

I was looking a few weeks ago (still a work in progress, but I'm pretty
close to having something working) at getting rid of asmlinkage and try
to get everything onto regparm(3).  Adding yet another calling
convention seems to be a step in the wrong direction -- especially since
regparm(1) and (2) are unlikely to have been well exercised and
therefore are likely to attract gcc bugs.

That does *not* mean in any way that I disapprove of the concept of
accessing pt_regs via a pointer... quite on the contrary, I think it's
the only sane thing to do.  I would like to see it done without adding
calling conventions, and preferrably killing some off.

	-hpa


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need  it
  2009-02-11 17:52   ` H. Peter Anvin
@ 2009-02-11 18:27     ` Brian Gerst
  2009-02-11 19:50       ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Brian Gerst @ 2009-02-11 18:27 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Tejun Heo, Ingo Molnar, linux-kernel

On Wed, Feb 11, 2009 at 12:52 PM, H. Peter Anvin <hpa@kernel.org> wrote:
> Brian Gerst wrote:
>>  /*
>> + * For syscalls that need a pointer to the pt_regs struct (ie. fork).
>> + * The regs pointer is passed in %eax as the first argument.  The
>> + * remaining function arguments remain on the stack.
>> + */
>> +#define ptregscall __attribute__((regparm(1)))
>
> I was looking a few weeks ago (still a work in progress, but I'm pretty
> close to having something working) at getting rid of asmlinkage and try
> to get everything onto regparm(3).  Adding yet another calling
> convention seems to be a step in the wrong direction -- especially since
> regparm(1) and (2) are unlikely to have been well exercised and
> therefore are likely to attract gcc bugs.
>
> That does *not* mean in any way that I disapprove of the concept of
> accessing pt_regs via a pointer... quite on the contrary, I think it's
> the only sane thing to do.  I would like to see it done without adding
> calling conventions, and preferrably killing some off.
>
>        -hpa
>
>

I guess I could go back to extracting the args from the pt_regs struct
given just the pointer.  How do you intend to handle system calls in
your changes (normal ones, not needing pt_regs)?

--
Brian Gerst

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 18:27     ` Brian Gerst
@ 2009-02-11 19:50       ` H. Peter Anvin
  2009-02-11 19:57         ` Brian Gerst
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2009-02-11 19:50 UTC (permalink / raw)
  To: Brian Gerst; +Cc: H. Peter Anvin, Tejun Heo, Ingo Molnar, linux-kernel

Brian Gerst wrote:
> 
> I guess I could go back to extracting the args from the pt_regs struct
> given just the pointer.  How do you intend to handle system calls in
> your changes (normal ones, not needing pt_regs)?
> 

My plan was to by default load up the three first arguments in (%eax, 
%edx, %ecx) followed by the remaining arguments on the stack... I 
currently have it as a reorganized struct pt_regs, but I'm still trying 
to figure out if it would make more sense from a correctness and 
performance perspective to instead have duplicates of these entries.

For the pt_regs-using registers, they would need a tiny trampoline, 
looking like:

	leal 16(%esp),%eax
	jmp <real function>

	-hpa


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need  it
  2009-02-11 19:50       ` H. Peter Anvin
@ 2009-02-11 19:57         ` Brian Gerst
  2009-02-11 20:00           ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Brian Gerst @ 2009-02-11 19:57 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: H. Peter Anvin, Tejun Heo, Ingo Molnar, linux-kernel

On Wed, Feb 11, 2009 at 2:50 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> Brian Gerst wrote:
>>
>> I guess I could go back to extracting the args from the pt_regs struct
>> given just the pointer.  How do you intend to handle system calls in
>> your changes (normal ones, not needing pt_regs)?
>>
>
> My plan was to by default load up the three first arguments in (%eax, %edx,
> %ecx) followed by the remaining arguments on the stack... I currently have
> it as a reorganized struct pt_regs, but I'm still trying to figure out if it
> would make more sense from a correctness and performance perspective to
> instead have duplicates of these entries.
>
> For the pt_regs-using registers, they would need a tiny trampoline, looking
> like:
>
>        leal 16(%esp),%eax
>        jmp <real function>
>
>        -hpa
>
>

IMHO, copying the 4th-6th args to a new stack frame is the only way to
guarantee that gcc won't trash any part of pt_regs.  The question is
whether to do it unconditionally, or try to be clever and only copy
them for the syscalls that actually need them.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 19:57         ` Brian Gerst
@ 2009-02-11 20:00           ` H. Peter Anvin
  0 siblings, 0 replies; 36+ messages in thread
From: H. Peter Anvin @ 2009-02-11 20:00 UTC (permalink / raw)
  To: Brian Gerst; +Cc: H. Peter Anvin, Tejun Heo, Ingo Molnar, linux-kernel

Brian Gerst wrote:
> 
> IMHO, copying the 4th-6th args to a new stack frame is the only way to
> guarantee that gcc won't trash any part of pt_regs.  The question is
> whether to do it unconditionally, or try to be clever and only copy
> them for the syscalls that actually need them.
> 

My guess is that the conditionalization would actually cost more than 
doing it unconditionally.  We're talking a small fraction of a cache 
line, and a set of stores to RAM, which can be buffered.

It's in many ways easier than reorganizing the struct pt_regs.  I'm just 
hypersensitive to adding system call overhead in any way.

	-hpa


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2)
  2009-02-10 14:51 ` [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it Brian Gerst
  2009-02-11  7:41   ` Tejun Heo
  2009-02-11 17:52   ` H. Peter Anvin
@ 2009-02-11 21:43   ` Brian Gerst
  2009-02-11 21:50     ` H. Peter Anvin
  2009-02-11 22:06     ` H. Peter Anvin
  2 siblings, 2 replies; 36+ messages in thread
From: Brian Gerst @ 2009-02-11 21:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel, H. Peter Anvin

Some syscalls need to access the pt_regs structure, either to copy
user register state or to modifiy it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to take the pointer as an argument instead of relying on
the assumption that the pt_regs structure overlaps the function
arguments.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
---

v2: drop the regparm(1) usage due to concerns over gcc bugs.

 arch/x86/include/asm/syscalls.h    |   20 ++++++++++----------
 arch/x86/kernel/entry_32.S         |   20 ++++++++++++++++++++
 arch/x86/kernel/ioport.c           |    3 +--
 arch/x86/kernel/process_32.c       |   32 ++++++++++++++++----------------
 arch/x86/kernel/signal.c           |   22 ++++------------------
 arch/x86/kernel/syscall_table_32.S |   20 ++++++++++----------
 arch/x86/kernel/vm86_32.c          |   16 ++++++++--------
 7 files changed, 69 insertions(+), 64 deletions(-)

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index c0b0bda..77bb31a 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -29,21 +29,21 @@ asmlinkage int sys_get_thread_area(struct user_desc __user *);
 /* X86_32 only */
 #ifdef CONFIG_X86_32
 /* kernel/process_32.c */
-asmlinkage int sys_fork(struct pt_regs);
-asmlinkage int sys_clone(struct pt_regs);
-asmlinkage int sys_vfork(struct pt_regs);
-asmlinkage int sys_execve(struct pt_regs);
+int sys_fork(struct pt_regs *);
+int sys_clone(struct pt_regs *);
+int sys_vfork(struct pt_regs *);
+int sys_execve(struct pt_regs *);
 
 /* kernel/signal_32.c */
 asmlinkage int sys_sigsuspend(int, int, old_sigset_t);
 asmlinkage int sys_sigaction(int, const struct old_sigaction __user *,
 			     struct old_sigaction __user *);
-asmlinkage int sys_sigaltstack(unsigned long);
-asmlinkage unsigned long sys_sigreturn(unsigned long);
-asmlinkage int sys_rt_sigreturn(unsigned long);
+int sys_sigaltstack(struct pt_regs *);
+unsigned long sys_sigreturn(struct pt_regs *);
+int sys_rt_sigreturn(struct pt_regs *);
 
 /* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned long);
+long sys_iopl(struct pt_regs *);
 
 /* kernel/sys_i386_32.c */
 asmlinkage long sys_mmap2(unsigned long, unsigned long, unsigned long,
@@ -59,8 +59,8 @@ struct oldold_utsname;
 asmlinkage int sys_olduname(struct oldold_utsname __user *);
 
 /* kernel/vm86_32.c */
-asmlinkage int sys_vm86old(struct pt_regs);
-asmlinkage int sys_vm86(struct pt_regs);
+int sys_vm86old(struct pt_regs *);
+int sys_vm86(struct pt_regs *);
 
 #else /* CONFIG_X86_32 */
 
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 5f5bd22..3de7b57 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -697,6 +697,26 @@ syscall_badsys:
 END(syscall_badsys)
 	CFI_ENDPROC
 
+/*
+ * System calls that need a pt_regs pointer.
+ */
+#define PTREGSCALL(name) \
+	ALIGN; \
+ptregs_##name: \
+	leal 4(%esp),%eax; \
+	jmp sys_##name;
+
+PTREGSCALL(iopl)
+PTREGSCALL(fork)
+PTREGSCALL(clone)
+PTREGSCALL(vfork)
+PTREGSCALL(execve)
+PTREGSCALL(sigaltstack)
+PTREGSCALL(sigreturn)
+PTREGSCALL(rt_sigreturn)
+PTREGSCALL(vm86)
+PTREGSCALL(vm86old)
+
 .macro FIXUP_ESPFIX_STACK
 	/* since we are on a wrong stack, we cant make it a C code :( */
 	PER_CPU(gdt_page, %ebx)
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index b12208f..e41980a 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -131,9 +131,8 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)
 }
 
 #ifdef CONFIG_X86_32
-asmlinkage long sys_iopl(unsigned long regsp)
+long sys_iopl(struct pt_regs *regs)
 {
-	struct pt_regs *regs = (struct pt_regs *)&regsp;
 	unsigned int level = regs->bx;
 	struct thread_struct *t = &current->thread;
 	int rc;
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index b50604b..fec79ad 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -603,24 +603,24 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	return prev_p;
 }
 
-asmlinkage int sys_fork(struct pt_regs regs)
+int sys_fork(struct pt_regs *regs)
 {
-	return do_fork(SIGCHLD, regs.sp, &regs, 0, NULL, NULL);
+	return do_fork(SIGCHLD, regs->sp, regs, 0, NULL, NULL);
 }
 
-asmlinkage int sys_clone(struct pt_regs regs)
+int sys_clone(struct pt_regs *regs)
 {
 	unsigned long clone_flags;
 	unsigned long newsp;
 	int __user *parent_tidptr, *child_tidptr;
 
-	clone_flags = regs.bx;
-	newsp = regs.cx;
-	parent_tidptr = (int __user *)regs.dx;
-	child_tidptr = (int __user *)regs.di;
+	clone_flags = regs->bx;
+	newsp = regs->cx;
+	parent_tidptr = (int __user *)regs->dx;
+	child_tidptr = (int __user *)regs->di;
 	if (!newsp)
-		newsp = regs.sp;
-	return do_fork(clone_flags, newsp, &regs, 0, parent_tidptr, child_tidptr);
+		newsp = regs->sp;
+	return do_fork(clone_flags, newsp, regs, 0, parent_tidptr, child_tidptr);
 }
 
 /*
@@ -633,27 +633,27 @@ asmlinkage int sys_clone(struct pt_regs regs)
  * do not have enough call-clobbered registers to hold all
  * the information you need.
  */
-asmlinkage int sys_vfork(struct pt_regs regs)
+int sys_vfork(struct pt_regs *regs)
 {
-	return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.sp, &regs, 0, NULL, NULL);
+	return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs->sp, regs, 0, NULL, NULL);
 }
 
 /*
  * sys_execve() executes a new program.
  */
-asmlinkage int sys_execve(struct pt_regs regs)
+int sys_execve(struct pt_regs *regs)
 {
 	int error;
 	char *filename;
 
-	filename = getname((char __user *) regs.bx);
+	filename = getname((char __user *) regs->bx);
 	error = PTR_ERR(filename);
 	if (IS_ERR(filename))
 		goto out;
 	error = do_execve(filename,
-			(char __user * __user *) regs.cx,
-			(char __user * __user *) regs.dx,
-			&regs);
+			(char __user * __user *) regs->cx,
+			(char __user * __user *) regs->dx,
+			regs);
 	if (error == 0) {
 		/* Make sure we don't return using sysenter.. */
 		set_thread_flag(TIF_IRET);
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 8562387..ccfb274 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -549,14 +549,9 @@ sys_sigaction(int sig, const struct old_sigaction __user *act,
 #endif /* CONFIG_X86_32 */
 
 #ifdef CONFIG_X86_32
-asmlinkage int sys_sigaltstack(unsigned long bx)
+int sys_sigaltstack(struct pt_regs *regs)
 {
-	/*
-	 * This is needed to make gcc realize it doesn't own the
-	 * "struct pt_regs"
-	 */
-	struct pt_regs *regs = (struct pt_regs *)&bx;
-	const stack_t __user *uss = (const stack_t __user *)bx;
+	const stack_t __user *uss = (const stack_t __user *)regs->bx;
 	stack_t __user *uoss = (stack_t __user *)regs->cx;
 
 	return do_sigaltstack(uss, uoss, regs->sp);
@@ -574,14 +569,12 @@ sys_sigaltstack(const stack_t __user *uss, stack_t __user *uoss,
  * Do a signal return; undo the signal stack.
  */
 #ifdef CONFIG_X86_32
-asmlinkage unsigned long sys_sigreturn(unsigned long __unused)
+unsigned long sys_sigreturn(struct pt_regs *regs)
 {
 	struct sigframe __user *frame;
-	struct pt_regs *regs;
 	unsigned long ax;
 	sigset_t set;
 
-	regs = (struct pt_regs *) &__unused;
 	frame = (struct sigframe __user *)(regs->sp - 8);
 
 	if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
@@ -640,15 +633,8 @@ badframe:
 }
 
 #ifdef CONFIG_X86_32
-/*
- * Note: do not pass in pt_regs directly as with tail-call optimization
- * GCC will incorrectly stomp on the caller's frame and corrupt user-space
- * register state:
- */
-asmlinkage int sys_rt_sigreturn(unsigned long __unused)
+int sys_rt_sigreturn(struct pt_regs *regs)
 {
-	struct pt_regs *regs = (struct pt_regs *)&__unused;
-
 	return do_rt_sigreturn(regs);
 }
 #else /* !CONFIG_X86_32 */
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index e2e86a0..3bdb648 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -1,7 +1,7 @@
 ENTRY(sys_call_table)
 	.long sys_restart_syscall	/* 0 - old "setup()" system call, used for restarting */
 	.long sys_exit
-	.long sys_fork
+	.long ptregs_fork
 	.long sys_read
 	.long sys_write
 	.long sys_open		/* 5 */
@@ -10,7 +10,7 @@ ENTRY(sys_call_table)
 	.long sys_creat
 	.long sys_link
 	.long sys_unlink	/* 10 */
-	.long sys_execve
+	.long ptregs_execve
 	.long sys_chdir
 	.long sys_time
 	.long sys_mknod
@@ -109,17 +109,17 @@ ENTRY(sys_call_table)
 	.long sys_newlstat
 	.long sys_newfstat
 	.long sys_uname
-	.long sys_iopl		/* 110 */
+	.long ptregs_iopl	/* 110 */
 	.long sys_vhangup
 	.long sys_ni_syscall	/* old "idle" system call */
-	.long sys_vm86old
+	.long ptregs_vm86old
 	.long sys_wait4
 	.long sys_swapoff	/* 115 */
 	.long sys_sysinfo
 	.long sys_ipc
 	.long sys_fsync
-	.long sys_sigreturn
-	.long sys_clone		/* 120 */
+	.long ptregs_sigreturn
+	.long ptregs_clone	/* 120 */
 	.long sys_setdomainname
 	.long sys_newuname
 	.long sys_modify_ldt
@@ -165,14 +165,14 @@ ENTRY(sys_call_table)
 	.long sys_mremap
 	.long sys_setresuid16
 	.long sys_getresuid16	/* 165 */
-	.long sys_vm86
+	.long ptregs_vm86
 	.long sys_ni_syscall	/* Old sys_query_module */
 	.long sys_poll
 	.long sys_nfsservctl
 	.long sys_setresgid16	/* 170 */
 	.long sys_getresgid16
 	.long sys_prctl
-	.long sys_rt_sigreturn
+	.long ptregs_rt_sigreturn
 	.long sys_rt_sigaction
 	.long sys_rt_sigprocmask	/* 175 */
 	.long sys_rt_sigpending
@@ -185,11 +185,11 @@ ENTRY(sys_call_table)
 	.long sys_getcwd
 	.long sys_capget
 	.long sys_capset	/* 185 */
-	.long sys_sigaltstack
+	.long ptregs_sigaltstack
 	.long sys_sendfile
 	.long sys_ni_syscall	/* reserved for streams1 */
 	.long sys_ni_syscall	/* reserved for streams2 */
-	.long sys_vfork		/* 190 */
+	.long ptregs_vfork	/* 190 */
 	.long sys_getrlimit
 	.long sys_mmap2
 	.long sys_truncate64
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 55ea30d..d7ac84e 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -197,9 +197,9 @@ out:
 static int do_vm86_irq_handling(int subfunction, int irqnumber);
 static void do_sys_vm86(struct kernel_vm86_struct *info, struct task_struct *tsk);
 
-asmlinkage int sys_vm86old(struct pt_regs regs)
+int sys_vm86old(struct pt_regs *regs)
 {
-	struct vm86_struct __user *v86 = (struct vm86_struct __user *)regs.bx;
+	struct vm86_struct __user *v86 = (struct vm86_struct __user *)regs->bx;
 	struct kernel_vm86_struct info; /* declare this _on top_,
 					 * this avoids wasting of stack space.
 					 * This remains on the stack until we
@@ -218,7 +218,7 @@ asmlinkage int sys_vm86old(struct pt_regs regs)
 	if (tmp)
 		goto out;
 	memset(&info.vm86plus, 0, (int)&info.regs32 - (int)&info.vm86plus);
-	info.regs32 = &regs;
+	info.regs32 = regs;
 	tsk->thread.vm86_info = v86;
 	do_sys_vm86(&info, tsk);
 	ret = 0;	/* we never return here */
@@ -227,7 +227,7 @@ out:
 }
 
 
-asmlinkage int sys_vm86(struct pt_regs regs)
+int sys_vm86(struct pt_regs *regs)
 {
 	struct kernel_vm86_struct info; /* declare this _on top_,
 					 * this avoids wasting of stack space.
@@ -239,12 +239,12 @@ asmlinkage int sys_vm86(struct pt_regs regs)
 	struct vm86plus_struct __user *v86;
 
 	tsk = current;
-	switch (regs.bx) {
+	switch (regs->bx) {
 	case VM86_REQUEST_IRQ:
 	case VM86_FREE_IRQ:
 	case VM86_GET_IRQ_BITS:
 	case VM86_GET_AND_RESET_IRQ:
-		ret = do_vm86_irq_handling(regs.bx, (int)regs.cx);
+		ret = do_vm86_irq_handling(regs->bx, (int)regs->cx);
 		goto out;
 	case VM86_PLUS_INSTALL_CHECK:
 		/*
@@ -261,14 +261,14 @@ asmlinkage int sys_vm86(struct pt_regs regs)
 	ret = -EPERM;
 	if (tsk->thread.saved_sp0)
 		goto out;
-	v86 = (struct vm86plus_struct __user *)regs.cx;
+	v86 = (struct vm86plus_struct __user *)regs->cx;
 	tmp = copy_vm86_regs_from_user(&info.regs, &v86->regs,
 				       offsetof(struct kernel_vm86_struct, regs32) -
 				       sizeof(info.regs));
 	ret = -EFAULT;
 	if (tmp)
 		goto out;
-	info.regs32 = &regs;
+	info.regs32 = regs;
 	info.vm86plus.is_vm86pus = 1;
 	tsk->thread.vm86_info = (struct vm86_struct __user *)v86;
 	do_sys_vm86(&info, tsk);
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2)
  2009-02-11 21:43   ` [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2) Brian Gerst
@ 2009-02-11 21:50     ` H. Peter Anvin
  2009-02-11 22:06     ` H. Peter Anvin
  1 sibling, 0 replies; 36+ messages in thread
From: H. Peter Anvin @ 2009-02-11 21:50 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Tejun Heo, Ingo Molnar, linux-kernel

Looks much better!

Brian Gerst wrote:
>  
>  #ifdef CONFIG_X86_32
> -/*
> - * Note: do not pass in pt_regs directly as with tail-call optimization
> - * GCC will incorrectly stomp on the caller's frame and corrupt user-space
> - * register state:
> - */
> -asmlinkage int sys_rt_sigreturn(unsigned long __unused)
> +int sys_rt_sigreturn(struct pt_regs *regs)
>  {
> -	struct pt_regs *regs = (struct pt_regs *)&__unused;
> -
>  	return do_rt_sigreturn(regs);
>  }
>  #else /* !CONFIG_X86_32 */

I observe with this sys_rt_sigreturn() is identical for both 32 and 64 
bits (since int == long for 32 bits.)  Furthermore, *both* are now no-op 
stubs around do_rt_sigreturn, so we can simply remove both stubs and 
change do_rt_sigreturn into sys_rt_sigreturn.

	-hpa

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2)
  2009-02-11 21:43   ` [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2) Brian Gerst
  2009-02-11 21:50     ` H. Peter Anvin
@ 2009-02-11 22:06     ` H. Peter Anvin
  2009-02-12 11:02       ` Ingo Molnar
  1 sibling, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2009-02-11 22:06 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Tejun Heo, Ingo Molnar, linux-kernel

Brian Gerst wrote:
> Some syscalls need to access the pt_regs structure, either to copy
> user register state or to modifiy it.  This patch adds stubs to load
> the address of the pt_regs struct into the %eax register, and changes
> the syscalls to take the pointer as an argument instead of relying on
> the assumption that the pt_regs structure overlaps the function
> arguments.
> 
> Signed-off-by: Brian Gerst <brgerst@gmail.com>

Testing this now, together with the do_rt_sigreturn -> sys_rt_sigreturn 
change.  If it works okay I'll commit it as an incremental diff to 
tip:core/percpu.

	-hpa

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it
  2009-02-11 15:59                   ` Ingo Molnar
@ 2009-02-12  1:12                     ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2009-02-12  1:12 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Brian Gerst, linux-kernel

Ingo Molnar wrote:
> * Tejun Heo <htejun@gmail.com> wrote:
> 
>> Brian Gerst wrote:
>>> x86-64 doesn't have the tail-call problem because it doesn't use the
>>> pt_regs on stack trick for syscall args.  All the args are passed in
>>> registers.
>> Yeah, I was saying that we can do about the same thing on x86_32 by
>> passing in pointer to pt_regs and defining proper syscall wrappers.
>> It will cost a bit of performance by increasing register pressure tho.
> 
> Do you mean converting:
> 
> ptregscall int sys_execve(struct pt_regs *regs, char __user *u_filename,
>                           char __user * __user *argv,
>                           char __user * __user *envp)
> 
> to:
> 
> ptregscall int sys_execve(struct pt_regs *regs)
> {
> 	char __user *u_filename		= syscall_arg1(regs);
> 	char __user * __user *argv	= syscall_arg2(regs);
> 	char __user * __user *envp	= syscall_arg3(regs);
> 
> etc.?

Not exactly.  include/linux/syscalls.h already has syscall wrapping
macros defined, with slight modification to allow archs to define its
own __SC_DECL and __SC_LONG (probably should use different name tho),
the outer function can be easily defined to take pt_regs pointer and
pass in the correct argument to the actual implementation function.
The only added overhead would be pt_regs pointer having to be loaded
into %edi and it having to stay somewhere in the callee till the last
parameter access.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2)
  2009-02-11 22:06     ` H. Peter Anvin
@ 2009-02-12 11:02       ` Ingo Molnar
  0 siblings, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2009-02-12 11:02 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Brian Gerst, Tejun Heo, linux-kernel


* H. Peter Anvin <hpa@zytor.com> wrote:

> Brian Gerst wrote:
>> Some syscalls need to access the pt_regs structure, either to copy
>> user register state or to modifiy it.  This patch adds stubs to load
>> the address of the pt_regs struct into the %eax register, and changes
>> the syscalls to take the pointer as an argument instead of relying on
>> the assumption that the pt_regs structure overlaps the function
>> arguments.
>>
>> Signed-off-by: Brian Gerst <brgerst@gmail.com>
>
> Testing this now, together with the do_rt_sigreturn -> sys_rt_sigreturn  
> change.  If it works okay I'll commit it as an incremental diff to  
> tip:core/percpu.

FYI, i merged the delta commit you applied to tip:master as well - it's all
looking good in testing so far.

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2009-02-12 11:03 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-10 14:51 [PATCH 0/3] x86: Fix pt_regs passed by value Brian Gerst
2009-02-10 14:51 ` [PATCH 1/3] x86: Use pt_regs pointer in do_device_not_available() Brian Gerst
2009-02-11  7:43   ` Tejun Heo
2009-02-11 10:13     ` Ingo Molnar
2009-02-11 14:34     ` Brian Gerst
2009-02-11 14:42       ` Tejun Heo
2009-02-11 14:46         ` Brian Gerst
2009-02-11 14:53           ` Tejun Heo
2009-02-10 14:51 ` [PATCH 2/3] x86: Pass in pt_regs pointer for syscalls that need it Brian Gerst
2009-02-11  7:41   ` Tejun Heo
2009-02-11 10:18     ` Ingo Molnar
2009-02-11 14:14       ` Tejun Heo
2009-02-11 14:31     ` Brian Gerst
2009-02-11 14:41       ` Tejun Heo
2009-02-11 14:43         ` Tejun Heo
2009-02-11 14:48         ` Tejun Heo
2009-02-11 14:58           ` Ingo Molnar
2009-02-11 14:59           ` Brian Gerst
2009-02-11 15:05             ` Tejun Heo
2009-02-11 15:10               ` Brian Gerst
2009-02-11 15:14                 ` Tejun Heo
2009-02-11 15:59                   ` Ingo Molnar
2009-02-12  1:12                     ` Tejun Heo
2009-02-11 15:01       ` Ingo Molnar
2009-02-11 17:52   ` H. Peter Anvin
2009-02-11 18:27     ` Brian Gerst
2009-02-11 19:50       ` H. Peter Anvin
2009-02-11 19:57         ` Brian Gerst
2009-02-11 20:00           ` H. Peter Anvin
2009-02-11 21:43   ` [PATCH] x86: pass in pt_regs pointer for syscalls that need it (take 2) Brian Gerst
2009-02-11 21:50     ` H. Peter Anvin
2009-02-11 22:06     ` H. Peter Anvin
2009-02-12 11:02       ` Ingo Molnar
2009-02-10 14:51 ` [PATCH 3/3] x86: Drop -fno-stack-protector after pt_regs fixes Brian Gerst
2009-02-11 11:42 ` [PATCH 0/3] x86: Fix pt_regs passed by value Ingo Molnar
2009-02-11 14:15   ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.