* [RFC] status of execve() work - per-architecture patches solicited @ 2012-09-07 18:20 Al Viro 2012-09-07 18:22 ` Al Viro ` (4 more replies) 0 siblings, 5 replies; 22+ messages in thread From: Al Viro @ 2012-09-07 18:20 UTC (permalink / raw) To: linux-arch; +Cc: linux-kernel To architecture maintainers: please, review the current situation in git.kernel.org/pub/scm/linux/kernel/git/viro/signal #execve2 and consider sending the corresponding patches for missing architectures. What's getting done is unification of sys_execve()/kernel_execve() into arch-independent code. x86, alpha, arm, s390, um and ppc are already converted in #execve2. The plan is: * provide a new primitive - ret_from_kernel_execve(); it takes two pointers to struct pt_regs, one being the normal location of pt_regs for a userland process, another - new pt_regs just filled by do_execve(). It should copy the latter to the former and bugger off to userland. Called from generic kernel_execve() implementation (see fs/exec.c in #execve2). It almost always has to be done in assembler - normally it does equivalent of something along the lines of memmove(normal, new, sizeof(struct pt_regs)) sp = normal, or whatever is needed to get a valid stack frame (e.g. on s390 there's ->back_chain that needs to be set to NULL) set other registers ret_from_sys_call expects to be set (e.g. i386 syscall entry has current_thread_info() value cached in %ebp and since it's a callee-saved register there, ret_from_sys_call expects to find that value still in %ebp, so we need to set it); basically, check what has to be set in ret_from_fork - it tends to jump to the same place. goto ret_from_sys_call, or whatever the equivalent is called on particular architecture. 
* define __ARCH_WANT_KERNEL_EXECVE in unistd.h, remove your old kernel_execve()
* pull whatever work you'd been doing *after* do_execve() call in your
sys_execve() (most of the architectures don't do anything after that anyway)
into start_thread(); that's the point of no return for execve(2) and if we
get there, we'll either succeed or get killed with SIGKILL. The same goes
for compat variant of execve(), with s/start_thread/compat_start_thread/.
* define __ARCH_WANT_SYS_EXECVE in unistd.h, kill your sys_execve() and
compat counterpart (if any).
* if there's a better way to calculate task_pt_regs(current), you can provide
it in your ptrace.h - macro should be called current_pt_regs(); it's optional.

Status: x86, arm, um, s390 - converted, tested, seem to work. alpha
and ppc - need testing. The rest - hadn't touched yet. unicore32 and
blackfin should be trivial to convert (they are doing kernel_execve() in
that manner already). Others may be more or less tricky - depends on how
gnarly their return from syscall path happens to be. I'll do what I can
and test what I can (some on emulators, some on real hardware), but for quite
a few architectures I've no way to test. Nor am I fond of sniffing dozens
of variants of assembler glue, to put it mildly.

Patches and/or help with testing setups would be very welcome.

^ permalink raw reply [flat|nested] 22+ messages in thread
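For readers who have not looked at the branch, the copy step described for ret_from_kernel_execve() can be modelled in plain userland C. This is only a sketch under stated assumptions: struct pt_regs_sketch, struct kstack_sketch, normal_pt_regs() and ret_from_kernel_execve_sketch() are hypothetical names invented for illustration, the register layout is made up, and the real primitive is per-architecture assembler that goes on to set sp and jump to the syscall return path.

```c
#include <string.h>

/* Hypothetical, tiny stand-in for an architecture's struct pt_regs. */
struct pt_regs_sketch {
    unsigned long pc;       /* user-mode program counter */
    unsigned long sp;       /* user-mode stack pointer */
    unsigned long regs[8];  /* made-up general registers */
};

/* Hypothetical kernel stack; size chosen arbitrarily for the model. */
#define KSTACK_WORDS 512
struct kstack_sketch {
    unsigned long words[KSTACK_WORDS];
};

/*
 * Assumed "normal" location of pt_regs for a userland process in this
 * model: the very top of the kernel stack.
 */
static struct pt_regs_sketch *normal_pt_regs(struct kstack_sketch *ks)
{
    unsigned char *top = (unsigned char *)(ks->words + KSTACK_WORDS);
    return (struct pt_regs_sketch *)(top - sizeof(struct pt_regs_sketch));
}

/*
 * Model of the first step of ret_from_kernel_execve(): copy the registers
 * just filled by do_execve() to the normal location. memmove() is used as
 * in the description above; it stays correct even if the two areas overlap.
 * The returned pointer stands for the value the real assembler would load
 * into sp before going to the return-from-syscall path.
 */
static struct pt_regs_sketch *
ret_from_kernel_execve_sketch(struct pt_regs_sketch *normal,
                              const struct pt_regs_sketch *new_regs)
{
    memmove(normal, new_regs, sizeof(*normal));
    return normal;
}
```

Everything after the copy (setting a valid stack frame, loading whatever registers ret_from_sys_call expects) is architecture-specific and deliberately left out of this model.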
* Re: [RFC] status of execve() work - per-architecture patches solicited 2012-09-07 18:20 [RFC] status of execve() work - per-architecture patches solicited Al Viro @ 2012-09-07 18:22 ` Al Viro 2012-09-10 13:40 ` Greg Ungerer ` (3 subsequent siblings) 4 siblings, 0 replies; 22+ messages in thread From: Al Viro @ 2012-09-07 18:22 UTC (permalink / raw) To: linux-arch; +Cc: linux-kernel On Fri, Sep 07, 2012 at 07:20:04PM +0100, Al Viro wrote: > To architecture maintainers: please, review the current > situation in git.kernel.org/pub/scm/linux/kernel/git/viro/signal #execve2 It should be commit 50343ee2889f5a8cff1aa30110f07b0e01563500 right now. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] status of execve() work - per-architecture patches solicited 2012-09-07 18:20 [RFC] status of execve() work - per-architecture patches solicited Al Viro 2012-09-07 18:22 ` Al Viro @ 2012-09-10 13:40 ` Greg Ungerer 2012-09-10 16:49 ` Al Viro 2012-09-10 22:20 ` Mark Salter ` (2 subsequent siblings) 4 siblings, 1 reply; 22+ messages in thread From: Greg Ungerer @ 2012-09-10 13:40 UTC (permalink / raw) To: Al Viro; +Cc: linux-arch, linux-kernel

Hi Al,

On 09/08/2012 04:20 AM, Al Viro wrote:
> To architecture maintainers: please, review the current
> situation in git.kernel.org/pub/scm/linux/kernel/git/viro/signal #execve2
> and consider sending the corresponding patches for missing architectures.

I can see you have some m68k patches in there as well.
They tested good on standard m68k (under emulator) and good on non-mmu
ColdFire. But it is getting an exception when I run on ColdFire with MMU
enabled:

...
Creating 1 MTD partitions on "RAM":
0x000000000000-0x0000001b8000 : "ROMfs"
TCP: cubic registered
NET: Registered protocol family 17
VFS: Mounted root (romfs filesystem) readonly on device 31:0.
*** FORMAT ERROR *** FORMAT=4
Current process id is 1
BAD KERNEL TRAP: 00000000
Modules linked in:
PC: [<0002562a>] 0x02562a
SR: 2704 SP: 0383dfc4 a2: 00000000
d0: 00000000 d1: 00000000 d2: 00000000 d3: 00000000
d4: 00000000 d5: 00000000 a0: 00000000 a1: 00000000
Process init (pid: 1, task=0383a000)
Frame format=4 eff addr=00000000 pc=6000169a
Stack from 0383e000:
Call Trace:
Code: 6610 4cd7 073e 4fef 0020 201f 588f dfdf <4e73> 2228 0004 46fc
2000 0801 0007 66ff ffff c2ea 598f 4fef ffe8 48d7 78c0 486f
Disabling lock debugging due to kernel taint

It is trapping at the return from exception (rte) in Lreturn.
Looks like it doesn't like the "format" field of the new stack frame
for some reason. If I get a few minutes tomorrow I'll dig into it.

Regards
Greg

> What's getting done is unification of sys_execve()/kernel_execve()
> into arch-independent code.
* Re: [RFC] status of execve() work - per-architecture patches solicited 2012-09-10 13:40 ` Greg Ungerer @ 2012-09-10 16:49 ` Al Viro 2012-09-11 3:39 ` Greg Ungerer 2012-09-13 13:27 ` Greg Ungerer 0 siblings, 2 replies; 22+ messages in thread From: Al Viro @ 2012-09-10 16:49 UTC (permalink / raw) To: Greg Ungerer; +Cc: linux-arch, linux-kernel On Mon, Sep 10, 2012 at 11:40:11PM +1000, Greg Ungerer wrote: > Hi Al, > > On 09/08/2012 04:20 AM, Al Viro wrote: > > To architecture maintainers: please, review the current > >situation in git.kernel.org/pub/scm/linux/kernel/git/viro/signal #execve2 > >and consider sending the corresponding patches for missing architectures. > > I can see you have some m68k patches in there as well. > They tested good on standard m68k (under emulator) and good on non-mmu > ColdFire. But it is geting an exception when I run on ColdFire with MMU > enabled: > > ... > Creating 1 MTD partitions on "RAM": > 0x000000000000-0x0000001b8000 : "ROMfs" > TCP: cubic registered > NET: Registered protocol family 17 > VFS: Mounted root (romfs filesystem) readonly on device 31:0. > *** FORMAT ERROR *** FORMAT=4 > Current process id is 1 > BAD KERNEL TRAP: 00000000 > Modules linked in: > PC: [<0002562a>] 0x02562a > SR: 2704 SP: 0383dfc4 a2: 00000000 > d0: 00000000 d1: 00000000 d2: 00000000 d3: 00000000 > d4: 00000000 d5: 00000000 a0: 00000000 a1: 00000000 > Process init (pid: 1, task=0383a000) > Frame format=4 eff addr=00000000 pc=6000169a > Stack from 0383e000: > Call Trace: > Code: 6610 4cd7 073e 4fef 0020 201f 588f dfdf <4e73> 2228 0004 46fc > 2000 0801 0007 66ff ffff c2ea 598f 4fef ffe8 48d7 78c0 486f > Disabling lock debugging due to kernel taint > > It is trapping at the return from exception (rte) in Lreturn. > Looks like it doesn't like the "format" field of the new stack frame > for some reason. If I get a few minutes tomorrow I'll dig into it. Interesting... What it should get is format 0, same as before the change. 
BTW, is there any convenient way to get an emulated coldfire-MMU system? For m68k I'm using aranym with sid/m68k from debian-ports.org and it seems to work fine these days, but that obviously won't do for coldfire - neither the emulator itself, nor the userland (AFAICS, gcc will happily generate instructions that use weird addressing modes unless told not to, so I would be extremely surprised if normal debian m68k binaries would run on coldfire, MMU or no MMU). BTW, the same question goes for many other embedded targets - I'm using qemu for arm and mips and hercules for s390; alpha, parisc, ppc32 and sparc64 - on actual hardware, amd64 and i386 - on kvm guests (all with debian userland); ia64 kinda-sorta works with ski, but it's very much imperfect... I think sh (at least sh4) should be usable with qemu as well, but I hadn't set that up yet. sparc32 is usable on qemu, but only with very old userland. Everything else... In theory, quite a few ought to be usable if one bootstraps uclinux userland with qemu, but I've no idea how well does that work in practice. And seeing that e.g. FRV eval boards go for several hundred dollars even on ebay, let alone from manufacturer, I'd rather not add the actual hardware to the pile here ;-/ What do people actually use? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] status of execve() work - per-architecture patches solicited 2012-09-10 16:49 ` Al Viro @ 2012-09-11 3:39 ` Greg Ungerer 2012-09-13 13:27 ` Greg Ungerer 1 sibling, 0 replies; 22+ messages in thread From: Greg Ungerer @ 2012-09-11 3:39 UTC (permalink / raw) To: Al Viro; +Cc: linux-arch, linux-kernel On 09/11/2012 02:49 AM, Al Viro wrote: > On Mon, Sep 10, 2012 at 11:40:11PM +1000, Greg Ungerer wrote: >> Hi Al, >> >> On 09/08/2012 04:20 AM, Al Viro wrote: >>> To architecture maintainers: please, review the current >>> situation in git.kernel.org/pub/scm/linux/kernel/git/viro/signal #execve2 >>> and consider sending the corresponding patches for missing architectures. >> >> I can see you have some m68k patches in there as well. >> They tested good on standard m68k (under emulator) and good on non-mmu >> ColdFire. But it is geting an exception when I run on ColdFire with MMU >> enabled: >> >> ... >> Creating 1 MTD partitions on "RAM": >> 0x000000000000-0x0000001b8000 : "ROMfs" >> TCP: cubic registered >> NET: Registered protocol family 17 >> VFS: Mounted root (romfs filesystem) readonly on device 31:0. >> *** FORMAT ERROR *** FORMAT=4 >> Current process id is 1 >> BAD KERNEL TRAP: 00000000 >> Modules linked in: >> PC: [<0002562a>] 0x02562a >> SR: 2704 SP: 0383dfc4 a2: 00000000 >> d0: 00000000 d1: 00000000 d2: 00000000 d3: 00000000 >> d4: 00000000 d5: 00000000 a0: 00000000 a1: 00000000 >> Process init (pid: 1, task=0383a000) >> Frame format=4 eff addr=00000000 pc=6000169a >> Stack from 0383e000: >> Call Trace: >> Code: 6610 4cd7 073e 4fef 0020 201f 588f dfdf <4e73> 2228 0004 46fc >> 2000 0801 0007 66ff ffff c2ea 598f 4fef ffe8 48d7 78c0 486f >> Disabling lock debugging due to kernel taint >> >> It is trapping at the return from exception (rte) in Lreturn. >> Looks like it doesn't like the "format" field of the new stack frame >> for some reason. If I get a few minutes tomorrow I'll dig into it. > > Interesting... 
What it should get is format 0, same as before the change.

That's a problem on ColdFire. The format field of the stack frame would
normally be 0x4 for a long-word aligned user stack pointer. The current
start_thread code doesn't set this in the ColdFire/MMU case, though we
do for the non-mmu case. The old code inherited this from the stack frame
of the process calling exec.

So I will rework the m68k start_thread() code so it sets it explicitly
for all ColdFire cases. With this fixed up, the new exec code works in all
cases I have tested with ColdFire/MMU.

> BTW, is there any convenient way to get an emulated coldfire-MMU system?

I only test it on real hardware. But it looks like qemu has coldfire
emulation. I haven't tried it, but as of version 1.0 it listed supported
ColdFire CPUs as:

m5206
m5208
cfv4e

The cfv4e core is capable of having an MMU, so maybe someone is working
on it. I must go and check 1.2.0; it might be better.

> For m68k I'm using aranym with sid/m68k from debian-ports.org and it seems
> to work fine these days, but that obviously won't do for coldfire - neither
> the emulator itself, nor the userland (AFAICS, gcc will happily generate
> instructions that use weird addressing modes unless told not to, so I would
> be extremely surprised if normal debian m68k binaries would run on coldfire,
> MMU or no MMU).

Yeah, no way it will work without the appropriate compiler switches on
when generating even userland binaries.

Regards
Greg

> BTW, the same question goes for many other embedded targets - I'm using
> qemu for arm and mips and hercules for s390; alpha, parisc, ppc32 and sparc64 -
> on actual hardware, amd64 and i386 - on kvm guests (all with debian userland);
> ia64 kinda-sorta works with ski, but it's very much imperfect... I think
> sh (at least sh4) should be usable with qemu as well, but I hadn't set that
> up yet. sparc32 is usable on qemu, but only with very old userland.
> Everything else...
In theory, quite a few ought to be usable if one > bootstraps uclinux userland with qemu, but I've no idea how well does that > work in practice. And seeing that e.g. FRV eval boards go for several > hundred dollars even on ebay, let alone from manufacturer, I'd rather not > add the actual hardware to the pile here ;-/ > > What do people actually use? > -- > To unsubscribe from this list: send the line "unsubscribe linux-arch" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 22+ messages in thread
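The ColdFire fix described in the message above (setting the frame format explicitly in start_thread() rather than inheriting it from the calling process) can be illustrated with a small userland model. This is a sketch, not the actual m68k code: struct frame_sketch, start_thread_sketch() and the is_coldfire flag are hypothetical names, and the user stack pointer is assumed to be long-word aligned, which is the case the message gives the 0x4 value for.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the fields start_thread() would touch. */
struct frame_sketch {
    unsigned long pc;      /* entry point of the new program */
    unsigned long usp;     /* new user stack pointer */
    unsigned char format;  /* exception stack frame format field */
};

/*
 * Model of a start_thread() that always sets the frame format itself.
 * Per the thread: classic m68k expects format 0, while ColdFire wants 0x4
 * for a long-word aligned user stack pointer (alignment is assumed here,
 * not checked).
 */
static void start_thread_sketch(struct frame_sketch *regs,
                                unsigned long pc, unsigned long usp,
                                bool is_coldfire)
{
    regs->pc = pc;
    regs->usp = usp;
    regs->format = is_coldfire ? 0x4 : 0x0;
}
```

Since the generic execve code builds a fresh pt_regs, nothing is inherited from the caller's frame any more, which is presumably why the field has to be written here for both the MMU and non-MMU ColdFire cases.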
* Re: [RFC] status of execve() work - per-architecture patches solicited 2012-09-10 16:49 ` Al Viro 2012-09-11 3:39 ` Greg Ungerer @ 2012-09-13 13:27 ` Greg Ungerer 1 sibling, 0 replies; 22+ messages in thread From: Greg Ungerer @ 2012-09-13 13:27 UTC (permalink / raw) To: Al Viro; +Cc: linux-arch, linux-kernel On 09/11/2012 02:49 AM, Al Viro wrote: > BTW, the same question goes for many other embedded targets - I'm using > qemu for arm and mips and hercules for s390; alpha, parisc, ppc32 and sparc64 - > on actual hardware, amd64 and i386 - on kvm guests (all with debian userland); > ia64 kinda-sorta works with ski, but it's very much imperfect... I think > sh (at least sh4) should be usable with qemu as well, but I hadn't set that > up yet. sparc32 is usable on qemu, but only with very old userland. > Everything else... In theory, quite a few ought to be usable if one > bootstraps uclinux userland with qemu, but I've no idea how well does that > work in practice. And seeing that e.g. FRV eval boards go for several > hundred dollars even on ebay, let alone from manufacturer, I'd rather not > add the actual hardware to the pile here ;-/ I have managed to get qemu to run modern non-MMU ColdFire kernels. Stock qemu-1.2 itself didn't work, but after a couple of fixes it is now running again. It doesn't support ColdFire with MMU yet though. Regards Greg ------------------------------------------------------------------------ Greg Ungerer -- Principal Engineer EMAIL: gerg@snapgear.com SnapGear Group, McAfee PHONE: +61 7 3435 2888 8 Gardner Close, FAX: +61 7 3891 3630 Milton, QLD, 4064, Australia WEB: http://www.SnapGear.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] status of execve() work - per-architecture patches solicited 2012-09-07 18:20 [RFC] status of execve() work - per-architecture patches solicited Al Viro 2012-09-07 18:22 ` Al Viro 2012-09-10 13:40 ` Greg Ungerer @ 2012-09-10 22:20 ` Mark Salter 2012-09-10 22:20 ` [PATCH 1/2] c6x: implement ret_from_kernel_execve() and switch to generic kernel_execve() Mark Salter ` (2 more replies) 2012-09-17 9:29 ` Michal Simek 2012-09-19 12:20 ` Vineet Gupta 4 siblings, 3 replies; 22+ messages in thread From: Mark Salter @ 2012-09-10 22:20 UTC (permalink / raw) To: Al Viro; +Cc: linux-kernel, linux-arch, Mark Salter C6X works fine with these patches to switch over to generic code. Mark Salter (2): c6x: implement ret_from_kernel_execve() and switch to generic kernel_execve() c6x: switch to generic sys_execve() arch/c6x/include/asm/syscalls.h | 5 --- arch/c6x/include/asm/unistd.h | 3 ++ arch/c6x/kernel/entry.S | 54 +++++++++++++++++--------------------- arch/c6x/kernel/process.c | 22 ---------------- 4 files changed, 27 insertions(+), 57 deletions(-) -- 1.7.9.1 ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/2] c6x: implement ret_from_kernel_execve() and switch to generic kernel_execve() 2012-09-10 22:20 ` Mark Salter @ 2012-09-10 22:20 ` Mark Salter 2012-09-10 22:20 ` [PATCH 2/2] c6x: switch to generic sys_execve() Mark Salter 2012-09-17 3:26 ` [RFC] status of execve() work - per-architecture patches solicited Al Viro 2 siblings, 0 replies; 22+ messages in thread From: Mark Salter @ 2012-09-10 22:20 UTC (permalink / raw) To: Al Viro; +Cc: linux-kernel, linux-arch, Mark Salter Signed-off-by: Mark Salter <msalter@redhat.com> --- arch/c6x/include/asm/unistd.h | 2 ++ arch/c6x/kernel/entry.S | 31 ++++++++++++++++++++++++------- 2 files changed, 26 insertions(+), 7 deletions(-) diff --git a/arch/c6x/include/asm/unistd.h b/arch/c6x/include/asm/unistd.h index 6d54ea4..1ce3a6f 100644 --- a/arch/c6x/include/asm/unistd.h +++ b/arch/c6x/include/asm/unistd.h @@ -16,6 +16,8 @@ #if !defined(_ASM_C6X_UNISTD_H) || defined(__SYSCALL) #define _ASM_C6X_UNISTD_H +#define __ARCH_WANT_KERNEL_EXECVE + /* Use the standard ABI for syscalls. */ #include <asm-generic/unistd.h> diff --git a/arch/c6x/kernel/entry.S b/arch/c6x/kernel/entry.S index 30b37e5..693002b 100644 --- a/arch/c6x/kernel/entry.S +++ b/arch/c6x/kernel/entry.S @@ -315,6 +315,30 @@ resume_userspace: [A0] BNOP .S1 work_pending,5 BNOP .S1 restore_all,5 + ;; extern void ret_from_kernel_execve(struct pt_regs *normal, + ;; struct pt_regs *new) + ;; + ;; Copy new regs to normal regs. + ;; Switch stack to normal regs and return to userspace. 
+ ;; +ENTRY(ret_from_kernel_execve) +#ifdef CONFIG_C6X_BIG_KERNEL + MVKL .S2 memmove,B0 + MVKH .S2 memmove,B0 + B .S2 B0 +#else + B .S2 memmove +#endif + ADDKPC .S2 0f,B3,2 + MVK .S1 REGS__END,A6 ; sizeof(struct pt_regs) + SUB .L2X A4,8,B10 ; save new SP in callee-saved reg +0: + BNOP .S2 resume_userspace,2 + MV .S2 B10,SP ; switch stack + MVK .L2 0,B1 + STW .D2T2 B1,*+SP(REGS__END+8) ; clear syscall flag +ENDPROC(ret_from_kernel_execve) + ;; ;; System call handling ;; B0 = syscall number (in sys_call_table) @@ -593,13 +617,6 @@ ENTRY(sys_sigaltstack) NOP 4 ENDPROC(sys_sigaltstack) - ;; kernel_execve -ENTRY(kernel_execve) - MVK .S2 __NR_execve,B0 - SWE - BNOP .S2 B3,5 -ENDPROC(kernel_execve) - ;; ;; Special system calls ;; return address is in B3 -- 1.7.9.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/2] c6x: switch to generic sys_execve() 2012-09-10 22:20 ` Mark Salter 2012-09-10 22:20 ` [PATCH 1/2] c6x: implement ret_from_kernel_execve() and switch to generic kernel_execve() Mark Salter @ 2012-09-10 22:20 ` Mark Salter 2012-09-17 3:26 ` [RFC] status of execve() work - per-architecture patches solicited Al Viro 2 siblings, 0 replies; 22+ messages in thread From: Mark Salter @ 2012-09-10 22:20 UTC (permalink / raw) To: Al Viro; +Cc: linux-kernel, linux-arch, Mark Salter Signed-off-by: Mark Salter <msalter@redhat.com> --- arch/c6x/include/asm/syscalls.h | 5 ----- arch/c6x/include/asm/unistd.h | 1 + arch/c6x/kernel/entry.S | 23 ----------------------- arch/c6x/kernel/process.c | 22 ---------------------- 4 files changed, 1 insertions(+), 50 deletions(-) diff --git a/arch/c6x/include/asm/syscalls.h b/arch/c6x/include/asm/syscalls.h index aed53da..e7b8991 100644 --- a/arch/c6x/include/asm/syscalls.h +++ b/arch/c6x/include/asm/syscalls.h @@ -44,11 +44,6 @@ extern int sys_cache_sync(unsigned long s, unsigned long e); struct pt_regs; extern asmlinkage long sys_c6x_clone(struct pt_regs *regs); -extern asmlinkage long sys_c6x_execve(const char __user *name, - const char __user *const __user *argv, - const char __user *const __user *envp, - struct pt_regs *regs); - #include <asm-generic/syscalls.h> diff --git a/arch/c6x/include/asm/unistd.h b/arch/c6x/include/asm/unistd.h index 1ce3a6f..3c131d5 100644 --- a/arch/c6x/include/asm/unistd.h +++ b/arch/c6x/include/asm/unistd.h @@ -17,6 +17,7 @@ #define _ASM_C6X_UNISTD_H #define __ARCH_WANT_KERNEL_EXECVE +#define __ARCH_WANT_SYS_EXECVE /* Use the standard ABI for syscalls. 
*/ #include <asm-generic/unistd.h> diff --git a/arch/c6x/kernel/entry.S b/arch/c6x/kernel/entry.S index 693002b..1c0e867 100644 --- a/arch/c6x/kernel/entry.S +++ b/arch/c6x/kernel/entry.S @@ -645,29 +645,6 @@ ENTRY(sys_rt_sigreturn) #endif ENDPROC(sys_rt_sigreturn) -ENTRY(sys_execve) - ADDAW .D2 SP,2,B6 ; put regs addr in 4th parameter - ; & adjust regs stack addr - LDW .D2T2 *+SP(REGS_B4+8),B4 - - ;; c6x_execve(char *name, char **argv, - ;; char **envp, struct pt_regs *regs) -#ifdef CONFIG_C6X_BIG_KERNEL - || MVKL .S1 sys_c6x_execve,A0 - MVKH .S1 sys_c6x_execve,A0 - B .S2X A0 -#else - || B .S2 sys_c6x_execve -#endif - STW .D2T2 B3,*SP--[2] - ADDKPC .S2 ret_from_c6x_execve,B3,3 - -ret_from_c6x_execve: - LDW .D2T2 *++SP[2],B3 - NOP 4 - BNOP .S2 B3,5 -ENDPROC(sys_execve) - ENTRY(sys_pread_c6x) MV .D2X A8,B7 #ifdef CONFIG_C6X_BIG_KERNEL diff --git a/arch/c6x/kernel/process.c b/arch/c6x/kernel/process.c index 45e924a..e83d872 100644 --- a/arch/c6x/kernel/process.c +++ b/arch/c6x/kernel/process.c @@ -221,28 +221,6 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, return 0; } -/* - * c6x_execve() executes a new program. - */ -SYSCALL_DEFINE4(c6x_execve, const char __user *, name, - const char __user *const __user *, argv, - const char __user *const __user *, envp, - struct pt_regs *, regs) -{ - int error; - char *filename; - - filename = getname(name); - error = PTR_ERR(filename); - if (IS_ERR(filename)) - goto out; - - error = do_execve(filename, argv, envp, regs); - putname(filename); -out: - return error; -} - unsigned long get_wchan(struct task_struct *p) { return p->thread.wchan; -- 1.7.9.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [RFC] status of execve() work - per-architecture patches solicited 2012-09-10 22:20 ` Mark Salter 2012-09-10 22:20 ` [PATCH 1/2] c6x: implement ret_from_kernel_execve() and switch to generic kernel_execve() Mark Salter 2012-09-10 22:20 ` [PATCH 2/2] c6x: switch to generic sys_execve() Mark Salter @ 2012-09-17 3:26 ` Al Viro 2012-09-21 16:26 ` Mark Salter 2 siblings, 1 reply; 22+ messages in thread From: Al Viro @ 2012-09-17 3:26 UTC (permalink / raw) To: Mark Salter; +Cc: linux-kernel, linux-arch, Linus Torvalds On Mon, Sep 10, 2012 at 06:20:01PM -0400, Mark Salter wrote: > C6X works fine with these patches to switch over to generic code. > > > Mark Salter (2): > c6x: implement ret_from_kernel_execve() and switch to generic > kernel_execve() > c6x: switch to generic sys_execve() > > arch/c6x/include/asm/syscalls.h | 5 --- > arch/c6x/include/asm/unistd.h | 3 ++ > arch/c6x/kernel/entry.S | 54 +++++++++++++++++--------------------- > arch/c6x/kernel/process.c | 22 ---------------- > 4 files changed, 27 insertions(+), 57 deletions(-) Applied. There's an alternative variant of that branch; see #experimental-kernel_thread in the same tree. I have *not* attempted to port those patches over there - I don't have anything to test on and architecture is too unfamiliar for me to even attempt it blindly. The main differences between those branches are: * ret_from_fork is usually split in two - ret_from_fork is used for normal processes and ret_from_kernel_thread is its analog for kernel threads; copy_thread() chooses one to use based on user_mode(regs). * ret_from_kernel_thread does *not* go through the normal return-from-syscall codepath; instead of doing that it simply does an equivalent of kernel_thread_helper() itself - i.e. calls the function we'd passed to kernel_thread(), followed by sys_exit(). * ret_from_kernel_execve does *not* bother with memmove(); it's done by generic kernel_execve() itself. 
Note that the first two changes guarantee that kernel threads will have pt_regs at the bottom of their stack, so we won't have any overlaps - not between the source and destination of copying pt_regs and not between the stack frame and that destination. I.e. that copying can safely be done by generic C implementation of kernel_execve(). I've ported (and tested) execve2 stuff to that model; it's done for alpha, arm, m68k, s390, powerpc, x86 and um. I think it's a better approach: * ret_from_kernel_execve() is simpler that way - one argument, no memmove() call to implement in there. * we get to kill the last remnants of "syscall instruction from the kernel mode" crap (c6x kernel_thread() is free from that already, but for many architectures it's not so) * syscall return codepath is only taken for return to userland now; succeeding kernel_thread() is not sharing it. Seeing that a bunch of things on that path should be avoided when returning to kernel mode, that allows for nice optimizations and simpler logics in the asm glue. * it removes more code. BTW, right now the contents of experimental-kernel_thread + for-next sans execve2 counterparts is probably getting close to Linus' "it removes 1KLoC, piss on all merge window rules and pull it now" threshold ;-) The price is that kernel threads are in the same boat as userland processes now wrt kernel stack consumption - they get pt_regs in the bottom of kernel stack, same as for normal syscall path. That makes for _much_ simpler life, but if there's a kernel thread with really borderline stack footprint, that might push it over the edge. Note, however, that syscalls are where the worst stack footprints tend to happen and for those we can't get rid of pt_regs on stack, no matter what we do. Just as with #execve2 it's not a flagday conversion; however, switching from one to another probably would be messy, so we'd better decide which one we'll be doing before the merge window. Comments? 
^ permalink raw reply [flat|nested] 22+ messages in thread
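The split between ret_from_fork and ret_from_kernel_thread described in the message above can be modelled in userland C. This is a sketch under assumptions: struct pt_regs_sketch, enum child_entry, choose_child_entry(), ret_from_kernel_thread_sketch() and demo_thread_fn() are all hypothetical names, the user_mode flag stands in for user_mode(regs), and the real code is per-architecture assembler plus copy_thread().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits of pt_regs this model needs. */
struct pt_regs_sketch {
    bool user_mode;        /* stands in for user_mode(regs) */
    int (*fn)(void *);     /* function passed to kernel_thread() */
    void *arg;             /* its argument */
};

/* The two entry points a newly created child can start from. */
enum child_entry {
    ENTRY_RET_FROM_FORK,
    ENTRY_RET_FROM_KERNEL_THREAD,
};

/* Model of the choice copy_thread() makes based on user_mode(regs). */
static enum child_entry choose_child_entry(const struct pt_regs_sketch *regs)
{
    return regs->user_mode ? ENTRY_RET_FROM_FORK
                           : ENTRY_RET_FROM_KERNEL_THREAD;
}

/*
 * Model of ret_from_kernel_thread: call fn(arg) directly instead of going
 * through the return-from-syscall path. The return value is what the real
 * code would hand to sys_exit().
 */
static int ret_from_kernel_thread_sketch(const struct pt_regs_sketch *regs)
{
    return regs->fn(regs->arg);
}

/* Example thread function, used only to exercise the model. */
static int demo_thread_fn(void *arg)
{
    return *(int *)arg + 1;
}
```

In this model the kernel thread never touches the syscall return path, which matches the point above that the path can then be specialized for returns to userland only.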
* Re: [RFC] status of execve() work - per-architecture patches solicited 2012-09-17 3:26 ` [RFC] status of execve() work - per-architecture patches solicited Al Viro @ 2012-09-21 16:26 ` Mark Salter 2012-09-21 16:26 ` [PATCH 1/3] c6x: add ret_from_kernel_thread(), simplify kernel_thread() Mark Salter ` (3 more replies) 0 siblings, 4 replies; 22+ messages in thread From: Mark Salter @ 2012-09-21 16:26 UTC (permalink / raw) To: Al Viro; +Cc: linux-kernel, linux-arch, Mark Salter Here are a set of c6x patches to work with your experimental-kernel_thread branch. Mark Salter (3): c6x: add ret_from_kernel_thread(), simplify kernel_thread() c6x: switch to generic kernel_execve c6x: switch to generic sys_execve arch/c6x/include/asm/syscalls.h | 5 --- arch/c6x/include/asm/unistd.h | 3 ++ arch/c6x/kernel/entry.S | 56 +++++++++++++++++------------------- arch/c6x/kernel/process.c | 60 ++++++++------------------------------- 4 files changed, 41 insertions(+), 83 deletions(-) -- 1.7.9.1 ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/3] c6x: add ret_from_kernel_thread(), simplify kernel_thread() 2012-09-21 16:26 ` Mark Salter @ 2012-09-21 16:26 ` Mark Salter 2012-09-21 16:26 ` [PATCH 2/3] c6x: switch to generic kernel_execve Mark Salter ` (2 subsequent siblings) 3 siblings, 0 replies; 22+ messages in thread From: Mark Salter @ 2012-09-21 16:26 UTC (permalink / raw) To: Al Viro; +Cc: linux-kernel, linux-arch, Mark Salter Signed-off-by: Mark Salter <msalter@redhat.com> --- arch/c6x/kernel/entry.S | 20 ++++++++++++++++++++ arch/c6x/kernel/process.c | 38 ++++++++++++-------------------------- 2 files changed, 32 insertions(+), 26 deletions(-) diff --git a/arch/c6x/kernel/entry.S b/arch/c6x/kernel/entry.S index 30b37e5..6e6bd9d 100644 --- a/arch/c6x/kernel/entry.S +++ b/arch/c6x/kernel/entry.S @@ -400,6 +400,26 @@ ret_from_fork_2: STW .D2T2 B0,*+SP(REGS_A4+8) ENDPROC(ret_from_fork) +ENTRY(ret_from_kernel_thread) +#ifdef CONFIG_C6X_BIG_KERNEL + MVKL .S1 schedule_tail,A0 + MVKH .S1 schedule_tail,A0 + B .S2X A0 +#else + B .S2 schedule_tail +#endif + LDW .D2T2 *+SP(REGS_A0+8),B10 /* get fn */ + ADDKPC .S2 0f,B3,3 +0: + B .S2 B10 /* call fn */ + LDW .D2T1 *+SP(REGS_A1+8),A4 /* get arg */ + MVKL .S2 sys_exit,B11 + MVKH .S2 sys_exit,B11 + ADDKPC .S2 0f,B3,1 +0: + BNOP .S2 B11,5 /* jump to sys_exit */ +ENDPROC(ret_from_kernel_thread) + ;; ;; These are the interrupt handlers, responsible for calling __do_IRQ() ;; int6 is used for syscalls (see _system_call entry) diff --git a/arch/c6x/kernel/process.c b/arch/c6x/kernel/process.c index 45e924a..d2ffc9b 100644 --- a/arch/c6x/kernel/process.c +++ b/arch/c6x/kernel/process.c @@ -25,6 +25,7 @@ void (*c6x_restart)(void); void (*c6x_halt)(void); extern asmlinkage void ret_from_fork(void); +extern asmlinkage void ret_from_kernel_thread(void); /* * power off function, if any @@ -103,36 +104,21 @@ void machine_power_off(void) halt_loop(); } -static void kernel_thread_helper(int dummy, void *arg, int (*fn)(void *)) -{ - do_exit(fn(arg)); -} - /* * Create a 
kernel thread */ int kernel_thread(int (*fn)(void *), void * arg, unsigned long flags) { - struct pt_regs regs; - - /* - * copy_thread sets a4 to zero (child return from fork) - * so we can't just set things up to directly return to - * fn. - */ - memset(&regs, 0, sizeof(regs)); - regs.b4 = (unsigned long) arg; - regs.a6 = (unsigned long) fn; - regs.pc = (unsigned long) kernel_thread_helper; - local_save_flags(regs.csr); - regs.csr |= 1; - regs.tsr = 5; /* Set GEE and GIE in TSR */ + struct pt_regs regs = { + .a0 = (unsigned long)fn, + .a1 = (unsigned long)arg, + .tsr = 0, /* kernel mode */ + }; /* Ok, create the new process.. */ return do_fork(flags | CLONE_VM | CLONE_UNTRACED, -1, &regs, 0, NULL, NULL); } -EXPORT_SYMBOL(kernel_thread); void flush_thread(void) { @@ -192,21 +178,21 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, childregs = task_pt_regs(p); *childregs = *regs; - childregs->a4 = 0; - if (usp == -1) + if (usp == -1) { /* case of __kernel_thread: we return to supervisor space */ childregs->sp = (unsigned long)(childregs + 1); - else + p->thread.pc = (unsigned long) ret_from_kernel_thread; + } else { /* Otherwise use the given stack */ childregs->sp = usp; + p->thread.pc = (unsigned long) ret_from_fork; + } /* Set usp/ksp */ p->thread.usp = childregs->sp; - /* switch_to uses stack to save/restore 14 callee-saved regs */ thread_saved_ksp(p) = (unsigned long)childregs - 8; - p->thread.pc = (unsigned int) ret_from_fork; - p->thread.wchan = (unsigned long) ret_from_fork; + p->thread.wchan = p->thread.pc; #ifdef __DSBT__ { unsigned long dp; -- 1.7.9.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/3] c6x: switch to generic kernel_execve
From: Mark Salter @ 2012-09-21 16:26 UTC
To: Al Viro; +Cc: linux-kernel, linux-arch, Mark Salter

Signed-off-by: Mark Salter <msalter@redhat.com>
---
 arch/c6x/include/asm/unistd.h |    2 ++
 arch/c6x/kernel/entry.S       |   13 ++++++-------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/c6x/include/asm/unistd.h b/arch/c6x/include/asm/unistd.h
index 6d54ea4..1ce3a6f 100644
--- a/arch/c6x/include/asm/unistd.h
+++ b/arch/c6x/include/asm/unistd.h
@@ -16,6 +16,8 @@
 #if !defined(_ASM_C6X_UNISTD_H) || defined(__SYSCALL)
 #define _ASM_C6X_UNISTD_H
 
+#define __ARCH_WANT_KERNEL_EXECVE
+
 /* Use the standard ABI for syscalls. */
 #include <asm-generic/unistd.h>
 
diff --git a/arch/c6x/kernel/entry.S b/arch/c6x/kernel/entry.S
index 6e6bd9d..32e3683 100644
--- a/arch/c6x/kernel/entry.S
+++ b/arch/c6x/kernel/entry.S
@@ -420,6 +420,12 @@ ENTRY(ret_from_kernel_thread)
 	BNOP	.S2	B11,5		   /* jump to sys_exit */
 ENDPROC(ret_from_kernel_thread)
 
+ENTRY(ret_from_kernel_execve)
+	GET_THREAD_INFO A12
+	BNOP	.S2	syscall_exit,4
+	ADD	.D2X	A4,-8,SP
+ENDPROC(ret_from_kernel_execve)
+
 	;;
 	;; These are the interrupt handlers, responsible for calling __do_IRQ()
 	;; int6 is used for syscalls (see _system_call entry)
@@ -613,13 +619,6 @@ ENTRY(sys_sigaltstack)
 	NOP	4
 ENDPROC(sys_sigaltstack)
 
-	;; kernel_execve
-ENTRY(kernel_execve)
-	MVK	.S2	__NR_execve,B0
-	SWE
-	BNOP	.S2	B3,5
-ENDPROC(kernel_execve)
-
 	;;
 	;; Special system calls
 	;; return address is in B3
-- 
1.7.9.1
* [PATCH 3/3] c6x: switch to generic sys_execve
From: Mark Salter @ 2012-09-21 16:26 UTC
To: Al Viro; +Cc: linux-kernel, linux-arch, Mark Salter

Signed-off-by: Mark Salter <msalter@redhat.com>
---
 arch/c6x/include/asm/syscalls.h |    5 -----
 arch/c6x/include/asm/unistd.h   |    1 +
 arch/c6x/kernel/entry.S         |   23 -----------------------
 arch/c6x/kernel/process.c       |   22 ----------------------
 4 files changed, 1 insertions(+), 50 deletions(-)

diff --git a/arch/c6x/include/asm/syscalls.h b/arch/c6x/include/asm/syscalls.h
index aed53da..e7b8991 100644
--- a/arch/c6x/include/asm/syscalls.h
+++ b/arch/c6x/include/asm/syscalls.h
@@ -44,11 +44,6 @@ extern int sys_cache_sync(unsigned long s, unsigned long e);
 struct pt_regs;
 
 extern asmlinkage long sys_c6x_clone(struct pt_regs *regs);
-extern asmlinkage long sys_c6x_execve(const char __user *name,
-				      const char __user *const __user *argv,
-				      const char __user *const __user *envp,
-				      struct pt_regs *regs);
-
 
 #include <asm-generic/syscalls.h>
 
diff --git a/arch/c6x/include/asm/unistd.h b/arch/c6x/include/asm/unistd.h
index 1ce3a6f..3c131d5 100644
--- a/arch/c6x/include/asm/unistd.h
+++ b/arch/c6x/include/asm/unistd.h
@@ -17,6 +17,7 @@
 #define _ASM_C6X_UNISTD_H
 
 #define __ARCH_WANT_KERNEL_EXECVE
+#define __ARCH_WANT_SYS_EXECVE
 
 /* Use the standard ABI for syscalls. */
 #include <asm-generic/unistd.h>
diff --git a/arch/c6x/kernel/entry.S b/arch/c6x/kernel/entry.S
index 32e3683..5449c36 100644
--- a/arch/c6x/kernel/entry.S
+++ b/arch/c6x/kernel/entry.S
@@ -647,29 +647,6 @@ ENTRY(sys_rt_sigreturn)
 #endif
 ENDPROC(sys_rt_sigreturn)
 
-ENTRY(sys_execve)
-	ADDAW	.D2	SP,2,B6		; put regs addr in 4th parameter
-					; & adjust regs stack addr
-	LDW	.D2T2	*+SP(REGS_B4+8),B4
-
-	;; c6x_execve(char *name, char **argv,
-	;;            char **envp, struct pt_regs *regs)
-#ifdef CONFIG_C6X_BIG_KERNEL
- ||	MVKL	.S1	sys_c6x_execve,A0
-	MVKH	.S1	sys_c6x_execve,A0
-	B	.S2X	A0
-#else
- ||	B	.S2	sys_c6x_execve
-#endif
-	STW	.D2T2	B3,*SP--[2]
-	ADDKPC	.S2	ret_from_c6x_execve,B3,3
-
-ret_from_c6x_execve:
-	LDW	.D2T2	*++SP[2],B3
-	NOP	4
-	BNOP	.S2	B3,5
-ENDPROC(sys_execve)
-
 ENTRY(sys_pread_c6x)
 	MV	.D2X	A8,B7
 #ifdef CONFIG_C6X_BIG_KERNEL
diff --git a/arch/c6x/kernel/process.c b/arch/c6x/kernel/process.c
index d2ffc9b..f98616d 100644
--- a/arch/c6x/kernel/process.c
+++ b/arch/c6x/kernel/process.c
@@ -207,28 +207,6 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
 	return 0;
 }
 
-/*
- * c6x_execve() executes a new program.
- */
-SYSCALL_DEFINE4(c6x_execve, const char __user *, name,
-		const char __user *const __user *, argv,
-		const char __user *const __user *, envp,
-		struct pt_regs *, regs)
-{
-	int error;
-	char *filename;
-
-	filename = getname(name);
-	error = PTR_ERR(filename);
-	if (IS_ERR(filename))
-		goto out;
-
-	error = do_execve(filename, argv, envp, regs);
-	putname(filename);
-out:
-	return error;
-}
-
 unsigned long get_wchan(struct task_struct *p)
 {
 	return p->thread.wchan;
-- 
1.7.9.1
* Re: [RFC] status of execve() work - per-architecture patches solicited
From: Al Viro @ 2012-09-21 18:39 UTC
To: Mark Salter; +Cc: linux-kernel, linux-arch

On Fri, Sep 21, 2012 at 12:26:36PM -0400, Mark Salter wrote:
> Here are a set of c6x patches to work with your experimental-kernel_thread
> branch.
>
> Mark Salter (3):
>   c6x: add ret_from_kernel_thread(), simplify kernel_thread()
>   c6x: switch to generic kernel_execve
>   c6x: switch to generic sys_execve

Applied and pushed...

FWIW, the current status:

alpha - done, tested on hardware
arm - done, tested on qemu
c6x - done by maintainer
frv - done, untested
m68k - done, tested on aranym; there's a known issue in copy_thread() in
case of coldfire-MMU, presumably to be handled in m68k tree (I can do it
in this one instead, if m68k folks would prefer it that way)
mn10300 - done, untested
powerpc - done, tested on qemu (32bit and 64bit)
s390 - done, tested on hercules (31bit and 64bit)
x86 - done, tested on kvm guests (32bit and 64bit)
um - done, tested on amd64 host (32bit and 64bit)

avr32 - no
blackfin - no, should be easy to write, NFI how to test
cris - no
h8300 - no
hexagon - no
ia64 - no
m32r - no
microblaze - no
mips - no, and if I understood Ralf correctly, he prefers to deal with his
asm glue surgery first.
openrisc - no
parisc - no, and there might be interesting issues writing that stuff. One
good thing is that I can test it on actual hw (32bit only, though)
score - no (and AFAICS that port is essentially abandonware)
sh - no
sparc - no, will get around to it. That I can test on actual hw...
tile - no
unicore32 - no, should be easy to copy arm solution
xtensa - no

The future plans for that series are
* kill daemonize() - only one caller, it's in drivers/staging and it's
trivial to eliminate.
* convert powerpc eeh_event_handler() to kthread_run()
* pull the calls of do_exit()/sys_exit() into the kernel_thread
callbacks themselves; there are very few such callbacks (kernel_thread() is
really a very low-level thing) and most of them never return - either loop
forever, or call exit() themselves, or do kernel_execve() and panic() on
failure of that (kernel_init()). It boils down to adding do_exit() on
failure in ____call_usermodehelper(), adding do_exit() in the end of
wait_for_helper() and adding do_exit() on failure in do_linuxrc(). That's
it. What we get out of that is removal of asm glue calling exit() after the
call of kernel_thread() payload - on each architecture.
* once that is done (and assuming we have all architectures converted),
we can do the following trick:
	void __init kernel_init_guts(void)
	{
		/* current kernel_init() sans the call of init_post() */
	}
	int __ref kernel_init(void *unused)
	{
		kernel_init_guts();
		/* stuff currently in init_post() */
	}
and we can drastically simplify kernel_execve(). Note that there are only
3 callers, all of them in kernel_thread() payloads. Moreover, at that point
we have the whole path to caller of the payload (i.e. ret_from_kernel_thread)
alive and well (that's what the trick above is for). So let's just replace
kernel_execve() with doing do_execve() *on* *default* *pt_regs*. And turn
ret_from_kernel_thread into
	call schedule_tail()
	find the payload function and its argument
	call the payload
	go to normal return from syscall path (i.e. what
ret_from_kernel_execve is doing, but without any need to do magic to stack
pointer, etc.)
Note that this is practically the same thing as ret_from_fork, except for
calling the damn payload. Which either does exit(), or returns after
successful do_execve(). At that point we can get rid of pt_regs argument
of do_execve(). And search_binary_handler(). And all kinds of
foo_load_binary(). When said foo_load_binary() wants pt_regs, it should
simply call current_pt_regs() and be done with that...
* I'm considering generic implementations of fork/vfork/clone - all
it takes is current_user_stack_pointer() (defaulting to
user_stack_pointer(current_pt_regs()); all architectures that don't have
said userland stack pointer stored in pt_regs happen to have such function
already, called rdusp() in all such cases). That helper is enough to make
practically all instances of fork/vfork/clone identical. Again, it's up to
the architecture whether it wants to use that or not, but it promises quite
a bit of boilerplate removal *AND* we are getting rid of wonders like
	asmlinkage int sys_fork(long r10, long r11, long r12, long r13,
				long mof, long srp, struct pt_regs *regs)
or
	asmlinkage int sys_fork(unsigned long r0, unsigned long r1,
				unsigned long r2, unsigned long r3,
				unsigned long r4, unsigned long r5,
				unsigned long r6, struct pt_regs regs)
and similar bits of black magic. And black magic it is - in the second case
(m32r) we are *badly* abusing C ABI. Took me a while to figure out WTF was
going on there - in reality, (void *)&regs will be equal to (void *)&r4.
Compiler has every right to be unhappy.
	* first 4 arguments go in registers (and are unused)
	* arguments 5, 6 and 7 are expected to be on top of stack
	* argument 8 is expected to be passed as a pointer to copy, also
on stack.
So compiler expects r0 to r3 in registers, with r4, r5, r6, &regs, regs on
top of stack. In reality, pt_regs there starts with 3 longs and pointer to
pt_regs. Initialized with the address of structure itself. So we get r4
aliased to regs.r4, r5 - to regs.r5, r6 - to regs.r6 and what would've been
a hidden pointer to regs - to regs.pt_regs. The worst part is, all that
trickery is absolutely pointless - the pointer we are looking for is
(sp & ~(THREAD_SIZE - 1)) + constant, so it actually costs *more* to do it
that way; we fetch the sucker from *(sp + constant_offset), which is going
to be slower, even leaving aside the price of storing it there back when
we'd been setting the pt_regs up on the way in.

Kernel isn't IOCCC, damnit... And it's not the worst example, actually ;-/
All that crap is brittle and ugly, for no reason whatsoever. Sigh...
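The "(sp & ~(THREAD_SIZE - 1)) + constant" remark can be made concrete with a few lines of userland C. This is only a sketch: the THREAD_SIZE value, the fake_pt_regs layout and the assumption that the saved frame sits at the very top of the kernel stack are all made up for illustration, and real architectures differ in the constant.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u	/* assumed stack size; must be a power of two */

/* Hypothetical register frame; only its size matters here. */
struct fake_pt_regs {
	unsigned long r[16];
};

/* Assumed constant: the frame occupies the topmost bytes of the stack. */
#define REGS_OFFSET (THREAD_SIZE - sizeof(struct fake_pt_regs))

/*
 * Any stack pointer inside the THREAD_SIZE-aligned kernel stack yields
 * the same frame address: mask off the low bits to get the stack base,
 * then add the constant. No memory access is needed.
 */
static uintptr_t regs_from_sp(uintptr_t sp)
{
	return (sp & ~(uintptr_t)(THREAD_SIZE - 1)) + REGS_OFFSET;
}
```

Compared with fetching a stored pointer from *(sp + constant_offset), this is one AND and one ADD on a value that is already in a register, which is the cost argument made above.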
* Re: [RFC] status of execve() work - per-architecture patches solicited
From: Greg Ungerer @ 2012-09-22 11:16 UTC
To: Al Viro; +Cc: Mark Salter, linux-kernel, linux-arch

On 09/22/2012 04:39 AM, Al Viro wrote:
> On Fri, Sep 21, 2012 at 12:26:36PM -0400, Mark Salter wrote:
>> Here are a set of c6x patches to work with your experimental-kernel_thread
>> branch.
>>
>> Mark Salter (3):
>>   c6x: add ret_from_kernel_thread(), simplify kernel_thread()
>>   c6x: switch to generic kernel_execve
>>   c6x: switch to generic sys_execve
>
> Applied and pushed...
>
> FWIW, the current status:
>
> alpha - done, tested on hardware
> arm - done, tested on qemu
> c6x - done by maintainer
> frv - done, untested
> m68k - done, tested on aranym; there's a known issue in copy_thread() in
> case of coldfire-MMU, presumably to be handled in m68k tree (I can do it
> in this one instead, if m68k folks would prefer it that way)

I sent the patch to the m68k-linux list. It's been acked by Geert.

http://marc.info/?l=linux-m68k&m=134742688015639&w=2

I was going to push it through the m68knommu git tree, but I don't mind
if you would rather take it with your changes.

Regards
Greg
* Re: [RFC] status of execve() work - per-architecture patches solicited
From: Al Viro @ 2012-09-23 0:46 UTC
To: Greg Ungerer; +Cc: Mark Salter, linux-kernel, linux-arch

On Sat, Sep 22, 2012 at 09:16:11PM +1000, Greg Ungerer wrote:
> I sent the patch to the m68k-linux list. It's been acked by Geert.
>
> http://marc.info/?l=linux-m68k&m=134742688015639&w=2
>
> I was going to push it through the m68knommu git tree, but I don't mind
> if you would rather take it with your changes.

Applied. Other changes since the last update:
* ppc breakage debugged and fixed
* kernel_thread() unified on all converted architectures.

An architecture can add select GENERIC_KERNEL_THREAD to its Kconfig if it's
ready to handle that in its copy_thread() - regs will be NULL, usp -
(unsigned long)fn, stack_size - (unsigned long)arg. It should set things up
for ret_from_kernel_thread, so that the sucker would call given function on
given argument. See what e.g. m68k does in #experimental-kernel_thread in
its copy_thread() and ret_from_kernel_thread; it's a fairly typical
situation if you have enough callee-saved registers to play with. If not,
put these values somewhere in childregs and pick them in
ret_from_kernel_thread - see i386 for example of that.

Eventually I hope to merge all kernel_thread() instances; then
CONFIG_GENERIC_KERNEL_THREAD will be gone.

Note, BTW, that having killed all in-kernel syscalls-via-trap on given
architecture we get a chance to optimize the syscall glue; for instance, on
ppc64 we could just go ahead and set stack pointer from %r13->kstack
unconditionally, rather than playing with "if we are coming from the kernel
mode, push stack pointer down by INT_FRAME_SIZE, otherwise pick it from
per-CPU data structure pointed to by r13" as we do now. And that's just the
most obvious bit in the very beginning of their system_call_common; there's
more. I haven't touched that stuff - this kind of work belongs in
architecture trees, not in this series.

FWIW, if we do that conversion for all kernel_thread(), we get another nice
thing pretty much for free - do_fork() won't need pt_regs passed to it
anymore. Note that after that we have two possible values passed there -
NULL (for kernel_thread()) and current_pt_regs() (from sys_fork() and
friends). I.e. it's 1 bit of information, *and* we already have that bit -
it's current->flags & PF_KTHREAD (it's actually a bit more convenient to
check its copy in p->flags). Only kernel threads call kernel_thread(); only
userland processes call sys_fork/sys_clone/sys_vfork(). IOW, once all
architectures are converted to generic kernel_thread() implementation, we
can
* stop passing pt_regs to do_fork()
* stop passing pt_regs to copy_process()
* stop passing pt_regs to copy_thread() - it can bloody well be
calculated there. And it's not used until that point.
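The GENERIC_KERNEL_THREAD calling convention described above (regs NULL, usp carrying fn, the stack size argument carrying arg) can be modelled in a few lines of userland C. This is a sketch only: fake_pt_regs, fake_thread and fake_copy_thread() are hypothetical names, and a real copy_thread() would also have to set up the switch frame so that ret_from_kernel_thread runs first in the child.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland register frame. */
struct fake_pt_regs {
	uintptr_t sp;
};

/* Hypothetical per-thread state filled in by the copy_thread() model. */
struct fake_thread {
	int is_kthread;
	uintptr_t fn;	/* payload function, kernel threads only */
	uintptr_t arg;	/* payload argument, kernel threads only */
	uintptr_t usp;	/* userland stack pointer, user processes only */
};

/*
 * regs == NULL marks a kernel thread; in that case usp and stack_size
 * are repurposed to carry the payload function and its argument.
 */
static void fake_copy_thread(struct fake_thread *p,
			     const struct fake_pt_regs *regs,
			     uintptr_t usp, uintptr_t stack_size)
{
	if (!regs) {
		p->is_kthread = 1;
		p->fn = usp;
		p->arg = stack_size;
		p->usp = 0;
	} else {
		p->is_kthread = 0;
		p->fn = 0;
		p->arg = 0;
		p->usp = usp;
	}
}
```

Since the only thing regs conveys here is "NULL or not", the same decision could be taken from a flag on the task, which is the one bit of information the message refers to.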
* Re: [RFC] status of execve() work - per-architecture patches solicited
From: Vineet Gupta @ 2012-09-24 10:59 UTC
To: Al Viro; +Cc: Greg Ungerer, Mark Salter, linux-kernel, linux-arch

On Sunday 23 September 2012 06:16 AM, Al Viro wrote:
> On Sat, Sep 22, 2012 at 09:16:11PM +1000, Greg Ungerer wrote:
>> I sent the patch to the m68k-linux list. It's been acked by Geert.
>>
>> http://marc.info/?l=linux-m68k&m=134742688015639&w=2
>>
>> I was going to push it through the m68knommu git tree, but I don't mind
>> if you would rather take it with your changes.
>
> Applied. Other changes since the last update:
> * ppc breakage debugged and fixed
> * kernel_thread() unified on all converted architectures.

commit cc615abcde (mn10300: convert to generic kernel_thread) in the
experimental-kernel_thread branch has a minor syntax error:

diff --git a/arch/mn10300/kernel/process.c b/arch/mn10300/kernel/process.c
index 26120a9..8ee09d8 100644
--- a/arch/mn10300/kernel/process.c
+++ b/arch/mn10300/kernel/process.c
@@ -225,7 +225,7 @@ int copy_thread(unsigned long clone_flags,
 	p->thread.usp = c_usp;
 
 	if (unlikely(!kregs)) {
-		memset(c_regs, 0, sizeof(struct pt_regs);
+		memset(c_regs, 0, sizeof(struct pt_regs));

Thx,
-Vineet
* Re: [RFC] status of execve() work - per-architecture patches solicited
From: Michal Simek @ 2012-09-17 9:29 UTC
To: Al Viro; +Cc: linux-arch, linux-kernel

Hi Al,

On 09/07/2012 08:20 PM, Al Viro wrote:
> To architecture maintainers: please, review the current
> situation in git.kernel.org/pub/scm/linux/kernel/git/viro/signal #execve2
> and consider sending the corresponding patches for missing architectures.

I have sent two patches. Please apply them to your tree.

Michal Simek (2):
  microblaze: Move restart allowed out of block
  microblaze: Remove compilation failure

 arch/microblaze/kernel/entry.S |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

I have seen the compilation problem in June on linux-next
but have never got any response from you.
http://www.spinics.net/lists/linux-next/msg20848.html

Can you please apply these two patches to your tree?

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
* Re: [RFC] status of execve() work - per-architecture patches solicited
From: Al Viro @ 2012-09-17 22:57 UTC
To: Michal Simek; +Cc: linux-arch, linux-kernel

On Mon, Sep 17, 2012 at 11:29:10AM +0200, Michal Simek wrote:
> Hi Al,
>
> On 09/07/2012 08:20 PM, Al Viro wrote:
> > To architecture maintainers: please, review the current
> >situation in git.kernel.org/pub/scm/linux/kernel/git/viro/signal #execve2
> >and consider sending the corresponding patches for missing architectures.
>
> I have sent two patches. Please apply them to your tree.
>
> Michal Simek (2):
>   microblaze: Move restart allowed out of block

applied.

>   microblaze: Remove compilation failure

folded into the offending commit. Note that all this stuff is not in
for-next or any of execve-related branches...
* Re: [RFC] status of execve() work - per-architecture patches solicited
From: Vineet Gupta @ 2012-09-19 12:20 UTC
To: Al Viro; +Cc: linux-arch, linux-kernel

On Friday 07 September 2012 11:50 PM, Al Viro wrote:
> To architecture maintainers: please, review the current
> situation in git.kernel.org/pub/scm/linux/kernel/git/viro/signal #execve2
> and consider sending the corresponding patches for missing architectures.
>
> What's getting done is unification of sys_execve()/kernel_execve()
> into arch-independent code. x86, alpha, arm, s390, um and ppc are already
> converted in #execve2. The plan is:
>
> * provide a new primitive - ret_from_kernel_execve(); it takes two pointers
> to struct pt_regs, one being the normal location of pt_regs for a userland
> process, another - new pt_regs just filled by do_execve(). It should copy
> the latter to the former and bugger off to userland. Called from generic
> kernel_execve() implementation (see fs/exec.c in #execve2). It almost always
> has to be done in assembler - normally it does equivalent of something
> along the lines of
> 	memmove(normal, new, sizeof(struct pt_regs))
> 	sp = normal, or whatever is needed to get a valid stack
> frame (e.g. on s390 there's ->back_chain that needs to be set to
> NULL)
> 	set other registers ret_from_sys_call expects to be set (e.g.
> i386 syscall entry has current_thread_info() value cached in %ebp and
> since it's a callee-saved register there, ret_from_sys_call expects to
> find that value still in %ebp, so we need to set it); basically, check
> what has to be set in ret_from_fork - it tends to jump to the same place.
> 	goto ret_from_sys_call, or whatever the equivalent is called on
> particular architecture.
> * define __ARCH_WANT_KERNEL_EXECVE in unistd.h, remove your old kernel_execve()
> * pull whatever work you'd been doing *after* do_execve() call in your
> sys_execve() (most of the architectures don't do anything after that anyway)
> into start_thread(); that's the point of no return for execve(2) and if we
> get there, we'll either succeed or get killed with SIGKILL. The same goes
> for compat variant of execve(), with s/start_thread/compat_start_thread/.
> * define __ARCH_WANT_SYS_EXECVE in unistd.h, kill your sys_execve() and
> compat counterpart (if any).
> * if there's a better way to calculate task_pt_regs(current), you can provide
> it in your ptrace.h - macro should be called current_pt_regs(); it's optional.
>
> Status: x86, arm, um, s390 - converted, tested, seem to work. alpha
> and ppc - need testing. The rest - hadn't touched yet. unicore32 and
> blackfin should be trivial to convert (they are doing kernel_execve() in
> that manner already). Other may be more or less tricky - depends on how
> gnarly their return from syscall path happens to be. I'll do what I can
> and test what I can (some on emulators, some on real hardware), but for quite
> a few architectures I've no way to test. Nor am I fond of sniffing dozens
> of variants of assembler glue, to put it mildly.
>
> Patches and/or help with testing setups would be very welcome.
>

Hi Al,

It must be noted that despite having seemingly independent
__ARCH_WANT_(KERNEL|SYS)_EXECVE, arches which have a kernel syscall trap
based kernel_execve(), e.g. MIPS, can't implement __ARCH_WANT_SYS_EXECVE
alone - they need to first convert to __ARCH_WANT_KERNEL_EXECVE as well
(although it probably doesn't make sense for anyone to just implement one -
but in terms of staging - having only one, breaks stuff IMHO).

The reason being, for non converted kernel_execve(), the call-stack leading
to sys_execve (e.g. init_post -> run_init_process -> kernel_execve ->..)
would cause the pt_regs layout to be slightly offset from the bottom of the
stack - not exactly where current_pt_regs()/task_pt_regs(current) would
point to in general. Thus on return path the update by start_thread() won't
be visible to asm glue at expected location.

I ran into this myself - when doing the execve switch for ARC Linux port
(currently being "pre-reviewed" by tglx before submission to lkml).

-Vineet
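The mismatch described here can be illustrated with simple address arithmetic in userland C. This is a sketch under assumptions that are not taken from any particular port: a downward-growing stack of THREAD_SIZE bytes, a trap entry that pushes its frame directly below the stack pointer it was taken at, and a task_pt_regs()-style lookup that always points at the topmost frame. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u	/* assumed kernel stack size */

/* Hypothetical register frame; only its size matters here. */
struct fake_pt_regs {
	unsigned long r[16];
};

/* Where a task_pt_regs()-style lookup expects the frame: top of stack. */
static uintptr_t expected_regs(uintptr_t stack_base)
{
	return stack_base + THREAD_SIZE - sizeof(struct fake_pt_regs);
}

/* Where a trap taken with stack pointer sp actually saves the frame. */
static uintptr_t trap_regs(uintptr_t sp)
{
	return sp - sizeof(struct fake_pt_regs);
}

/* How far below the expected location the trap frame ends up. */
static uintptr_t frame_skew(uintptr_t stack_base, uintptr_t sp)
{
	return expected_regs(stack_base) - trap_regs(sp);
}
```

A trap from userland starts with an empty kernel stack, so the skew is zero; a trap issued from inside kernel_execve() starts below the frames of its callers, so the frame it saves is not the one start_thread() updates through the task_pt_regs()-style pointer.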
* Re: [RFC] status of execve() work - per-architecture patches solicited
From: Al Viro @ 2012-09-19 13:32 UTC
To: Vineet Gupta; +Cc: linux-arch, linux-kernel

On Wed, Sep 19, 2012 at 05:50:34PM +0530, Vineet Gupta wrote:
> Hi Al,
>
> It must be noted that despite having seemingly independent
> __ARCH_WANT_(KERNEL|SYS)_EXECVE, arches which have a kernel syscall trap
> based kernel_execve(), e.g. MIPS, can't implement __ARCH_WANT_SYS_EXECVE
> alone - they need to first convert
> to __ARCH_WANT_KERNEL_EXECVE as well (although it probably doesn't make
> sense for anyone to just implement one - but in terms of staging -
> having only one, breaks stuff IMHO).

Of course - that's the reason for kernel_execve() being pulled into the mix
at all. Unified sys_execve() relies on not using a trap to do
kernel_execve(); it's not exactly the same thing as having it done by
generic instance in fs/exec.c (e.g. some architectures were already doing
it that way, with their own instances, some in asm glue, some in C) but it
is a prerequisite.
Thread overview: 22+ messages (end of thread, newest: 2012-09-24 10:59 UTC)

2012-09-07 18:20 [RFC] status of execve() work - per-architecture patches solicited Al Viro
2012-09-07 18:22 ` Al Viro
2012-09-10 13:40 ` Greg Ungerer
2012-09-10 16:49 ` Al Viro
2012-09-11  3:39 ` Greg Ungerer
2012-09-13 13:27 ` Greg Ungerer
2012-09-10 22:20 ` Mark Salter
2012-09-10 22:20 ` [PATCH 1/2] c6x: implement ret_from_kernel_execve() and switch to generic kernel_execve() Mark Salter
2012-09-10 22:20 ` [PATCH 2/2] c6x: switch to generic sys_execve() Mark Salter
2012-09-17  3:26 ` [RFC] status of execve() work - per-architecture patches solicited Al Viro
2012-09-21 16:26 ` Mark Salter
2012-09-21 16:26 ` [PATCH 1/3] c6x: add ret_from_kernel_thread(), simplify kernel_thread() Mark Salter
2012-09-21 16:26 ` [PATCH 2/3] c6x: switch to generic kernel_execve Mark Salter
2012-09-21 16:26 ` [PATCH 3/3] c6x: switch to generic sys_execve Mark Salter
2012-09-21 18:39 ` [RFC] status of execve() work - per-architecture patches solicited Al Viro
2012-09-22 11:16 ` Greg Ungerer
2012-09-23  0:46 ` Al Viro
2012-09-24 10:59 ` Vineet Gupta
2012-09-17  9:29 ` Michal Simek
2012-09-17 22:57 ` Al Viro
2012-09-19 12:20 ` Vineet Gupta
2012-09-19 13:32 ` Al Viro