* [RFC] SIGKILL vs. SIGSEGV on late execve() failures @ 2013-02-14 5:36 Al Viro 2013-02-15 20:02 ` Linus Torvalds 0 siblings, 1 reply; 16+ messages in thread From: Al Viro @ 2013-02-14 5:36 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel If execve() fails past flush_old_exec(), we are obviously going to kill the process. Right now it's implemented in $BIGNUM places in ->load_binary() and that's obviously brittle (and in at least one case buggy - binfmt_flat lacks send_sig_info() on late failures). Now, there's an obvious way to check that we had done successful flush_old_exec() - bfmt->mm becomes NULL just past the last failure exit there. So it would be tempting to have these send_sig_info() moved into search_binary_handler(), especially since we already have if (retval != -ENOEXEC || bprm->mm == NULL) break; in there and turning that into if (bprm->mm == NULL) { /* past the point of no return */ suicide break; } if (retval != -ENOEXEC) break; would be trivial. The only problem is that some suicides do SIGKILL, some SIGSEGV. AFAICS, it started as SIGSEGV and had been switched to SIGKILL for a.out (without any comments) in 1.1.62. By that time ELF had been there, with SIGSEGV in the same places. Not replaced with SIGKILL; as the matter of fact, they are still there. Additional failure exits in case of ELF had been added with SIGKILL; ELF-FDPIC has copied ELF and FLAT hadn't bothered with send_sig_info() at all. Since by that point we have an empty sighandler table, the only real difference is whether we attempt to produce a coredump on such late failures. Is there any real reason not to try that? After all, with that kind of late failure in execve(2) a coredump is obviously something the caller might want to take a look at... What was the reason for switch in 1.1.62? It's before my time and the only exec-related comments I see in 1.1.61->1.1.62 summary are Don't execute files that are being written to. If we can't get write access to a core dump file, don't core dump. Remove redundant test for non-null executables. No need to release the shared memory by hand, when loading different executable. neither of which covers that one... ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-14 5:36 [RFC] SIGKILL vs. SIGSEGV on late execve() failures Al Viro @ 2013-02-15 20:02 ` Linus Torvalds 2013-02-15 21:59 ` Al Viro 0 siblings, 1 reply; 16+ messages in thread From: Linus Torvalds @ 2013-02-15 20:02 UTC (permalink / raw) To: Al Viro; +Cc: Linux Kernel Mailing List On Wed, Feb 13, 2013 at 9:36 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > The only problem is that some suicides do SIGKILL, some SIGSEGV. > AFAICS, it started as SIGSEGV and had been switched to SIGKILL for a.out > (without any comments) in 1.1.62. Ok, I really don't think it matters which one we do - either SIGSEGV or SIGKILL is fine for a execve() that fails in those rare "impossible to recover from" circumstances. And no, I have no memory what the reason for the switch was, it's much too long ago.. Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-15 20:02 ` Linus Torvalds @ 2013-02-15 21:59 ` Al Viro 2013-02-15 23:12 ` Shentino 0 siblings, 1 reply; 16+ messages in thread From: Al Viro @ 2013-02-15 21:59 UTC (permalink / raw) To: Linus Torvalds; +Cc: Linux Kernel Mailing List On Fri, Feb 15, 2013 at 12:02:50PM -0800, Linus Torvalds wrote: > On Wed, Feb 13, 2013 at 9:36 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > The only problem is that some suicides do SIGKILL, some SIGSEGV. > > AFAICS, it started as SIGSEGV and had been switched to SIGKILL for a.out > > (without any comments) in 1.1.62. > > Ok, I really don't think it matters which one we do - either SIGSEGV > or SIGKILL is fine for a execve() that fails in those rare "impossible > to recover from" circumstances. And no, I have no memory what the > reason for the switch was, it's much too long ago.. How about this, then: handle suicide on late failure exits in execve() in search_binary_handler() ... rather than doing that in the guts of ->load_binary(). And send SIGSEGV in all cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index bbc8f88..a1c236d 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -260,11 +260,8 @@ static int load_aout_binary(struct linux_binprm * bprm) current->mm->cached_hole_size = 0; retval = setup_arg_pages(bprm, STACK_TOP, EXSTACK_DEFAULT); - if (retval < 0) { - /* Someone check-me: is this error path enough? */ - send_sig(SIGKILL, current, 0); + if (retval < 0) return retval; - } install_exec_creds(bprm); @@ -282,19 +279,15 @@ static int load_aout_binary(struct linux_binprm * bprm) map_size = ex.a_text+ex.a_data; #endif error = vm_brk(text_addr & PAGE_MASK, map_size); - if (error != (text_addr & PAGE_MASK)) { - send_sig(SIGKILL, current, 0); + if (error != (text_addr & PAGE_MASK)) return error; - } error = bprm->file->f_op->read(bprm->file, (char __user *)text_addr, ex.a_text+ex.a_data, &pos); - if ((signed long)error < 0) { - send_sig(SIGKILL, current, 0); + if ((signed long)error < 0) return error; - } - + flush_icache_range(text_addr, text_addr+ex.a_text+ex.a_data); } else { if ((ex.a_text & 0xfff || ex.a_data & 0xfff) && @@ -327,28 +320,22 @@ static int load_aout_binary(struct linux_binprm * bprm) MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE, fd_offset); - if (error != N_TXTADDR(ex)) { - send_sig(SIGKILL, current, 0); + if (error != N_TXTADDR(ex)) return error; - } error = vm_mmap(bprm->file, N_DATADDR(ex), ex.a_data, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE, fd_offset + ex.a_text); - if (error != N_DATADDR(ex)) { - send_sig(SIGKILL, current, 0); + if (error != N_DATADDR(ex)) return error; - } } beyond_if: set_binfmt(&aout_format); retval = set_brk(current->mm->start_brk, current->mm->brk); - if (retval < 0) { - send_sig(SIGKILL, current, 0); + if (retval < 0) return retval; - } current->mm->start_stack = (unsigned long) create_aout_tables((char __user *) bprm->p, bprm); diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 11e078a..50e9194 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -734,10 +734,8 @@ static int load_elf_binary(struct linux_binprm *bprm) current->mm->cached_hole_size = 0; retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP), executable_stack); - if (retval < 0) { - send_sig(SIGKILL, current, 0); + if (retval < 0) goto out_free_dentry; - } current->mm->start_stack = bprm->p; @@ -759,10 +757,8 @@ static int load_elf_binary(struct linux_binprm *bprm) and clear the area. */ retval = set_brk(elf_bss + load_bias, elf_brk + load_bias); - if (retval) { - send_sig(SIGKILL, current, 0); + if (retval) goto out_free_dentry; - } nbyte = ELF_PAGEOFFSET(elf_bss); if (nbyte) { nbyte = ELF_MIN_ALIGN - nbyte; @@ -815,7 +811,6 @@ static int load_elf_binary(struct linux_binprm *bprm) error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt, elf_prot, elf_flags, 0); if (BAD_ADDR(error)) { - send_sig(SIGKILL, current, 0); retval = IS_ERR((void *)error) ? PTR_ERR((void*)error) : -EINVAL; goto out_free_dentry; @@ -846,7 +841,6 @@ static int load_elf_binary(struct linux_binprm *bprm) elf_ppnt->p_memsz > TASK_SIZE || TASK_SIZE - elf_ppnt->p_memsz < k) { /* set_brk can never work. Avoid overflows. */ - send_sig(SIGKILL, current, 0); retval = -EINVAL; goto out_free_dentry; } @@ -878,12 +872,10 @@ static int load_elf_binary(struct linux_binprm *bprm) * up getting placed where the bss needs to go. */ retval = set_brk(elf_bss, elf_brk); - if (retval) { - send_sig(SIGKILL, current, 0); + if (retval) goto out_free_dentry; - } + if (likely(elf_bss != elf_brk) && unlikely(padzero(elf_bss))) { - send_sig(SIGSEGV, current, 0); retval = -EFAULT; /* Nobody gets to see this, but.. */ goto out_free_dentry; } @@ -929,19 +921,15 @@ static int load_elf_binary(struct linux_binprm *bprm) #ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES retval = arch_setup_additional_pages(bprm, !!elf_interpreter); - if (retval < 0) { - send_sig(SIGKILL, current, 0); + if (retval < 0) goto out; - } #endif /* ARCH_HAS_SETUP_ADDITIONAL_PAGES */ install_exec_creds(bprm); retval = create_elf_tables(bprm, &loc->elf_ex, load_addr, interp_load_addr); - if (retval < 0) { - send_sig(SIGKILL, current, 0); + if (retval < 0) goto out; - } /* N.B. passed_fileno might not be initialized? */ current->mm->end_code = end_code; current->mm->start_code = start_code; diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index 30de01c..5c03732 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -317,8 +317,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) goto error; /* there's now no turning back... the old userspace image is dead, - * defunct, deceased, etc. after this point we have to exit via - * error_kill */ + * defunct, deceased, etc. */ set_personality(PER_LINUX_FDPIC); if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) current->personality |= READ_IMPLIES_EXEC; @@ -343,24 +342,22 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) retval = setup_arg_pages(bprm, current->mm->start_stack, executable_stack); - if (retval < 0) { - send_sig(SIGKILL, current, 0); - goto error_kill; - } + if (retval < 0) + goto error; #endif /* load the executable and interpreter into memory */ retval = elf_fdpic_map_file(&exec_params, bprm->file, current->mm, "executable"); if (retval < 0) - goto error_kill; + goto error; if (interpreter_name) { retval = elf_fdpic_map_file(&interp_params, interpreter, current->mm, "interpreter"); if (retval < 0) { printk(KERN_ERR "Unable to load interpreter\n"); - goto error_kill; + goto error; } allow_write_access(interpreter); @@ -397,7 +394,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) if (IS_ERR_VALUE(current->mm->start_brk)) { retval = current->mm->start_brk; current->mm->start_brk = 0; - goto error_kill; + goto error; } current->mm->brk = current->mm->start_brk; @@ -410,7 +407,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) install_exec_creds(bprm); if (create_elf_fdpic_tables(bprm, current->mm, &exec_params, &interp_params) < 0) - goto error_kill; + goto error; kdebug("- start_code %lx", current->mm->start_code); kdebug("- end_code %lx", current->mm->end_code); @@ -449,12 +446,6 @@ error: kfree(interp_params.phdrs); kfree(interp_params.loadmap); return retval; - - /* unrecoverable error - kill the process */ -error_kill: - send_sig(SIGSEGV, current, 0); - goto error; - } /*****************************************************************************/ diff --git a/fs/exec.c b/fs/exec.c index 7b6f4d5..027b2de 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1416,28 +1416,28 @@ int search_binary_handler(struct linux_binprm *bprm) } read_lock(&binfmt_lock); put_binfmt(fmt); - if (retval != -ENOEXEC || bprm->mm == NULL) - break; - if (!bprm->file) { + if (bprm->mm == NULL) { + /* we got to flush_old_exec() and failed after it */ + send_sig(SIGSEGV, current, 0); + read_unlock(&binfmt_lock); + return retval; + } + if (retval != -ENOEXEC || !bprm->file) { read_unlock(&binfmt_lock); return retval; } } read_unlock(&binfmt_lock); #ifdef CONFIG_MODULES - if (retval != -ENOEXEC || bprm->mm == NULL) { - break; - } else { #define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7e)) - if (printable(bprm->buf[0]) && - printable(bprm->buf[1]) && - printable(bprm->buf[2]) && - printable(bprm->buf[3])) - break; /* -ENOEXEC */ - if (try) - break; /* -ENOEXEC */ - request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2])); - } + if (printable(bprm->buf[0]) && + printable(bprm->buf[1]) && + printable(bprm->buf[2]) && + printable(bprm->buf[3])) + break; /* -ENOEXEC */ + if (try) + break; /* -ENOEXEC */ + request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2])); #else break; #endif ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-15 21:59 ` Al Viro @ 2013-02-15 23:12 ` Shentino 2013-02-16 0:04 ` Al Viro 0 siblings, 1 reply; 16+ messages in thread From: Shentino @ 2013-02-15 23:12 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 1:59 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Fri, Feb 15, 2013 at 12:02:50PM -0800, Linus Torvalds wrote: >> On Wed, Feb 13, 2013 at 9:36 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: >> > >> > The only problem is that some suicides do SIGKILL, some SIGSEGV. >> > AFAICS, it started as SIGSEGV and had been switched to SIGKILL for a.out >> > (without any comments) in 1.1.62. >> >> Ok, I really don't think it matters which one we do - either SIGSEGV >> or SIGKILL is fine for a execve() that fails in those rare "impossible >> to recover from" circumstances. And no, I have no memory what the >> reason for the switch was, it's much too long ago.. > > How about this, then: > > handle suicide on late failure exits in execve() in search_binary_handler() > > ... rather than doing that in the guts of ->load_binary(). And send > SIGSEGV in all cases. > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> > --- > diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c > index bbc8f88..a1c236d 100644 > --- a/fs/binfmt_aout.c > +++ b/fs/binfmt_aout.c > @@ -260,11 +260,8 @@ static int load_aout_binary(struct linux_binprm * bprm) > current->mm->cached_hole_size = 0; > > retval = setup_arg_pages(bprm, STACK_TOP, EXSTACK_DEFAULT); > - if (retval < 0) { > - /* Someone check-me: is this error path enough? */ > - send_sig(SIGKILL, current, 0); > + if (retval < 0) > return retval; > - } > > install_exec_creds(bprm); > > @@ -282,19 +279,15 @@ static int load_aout_binary(struct linux_binprm * bprm) > map_size = ex.a_text+ex.a_data; > #endif > error = vm_brk(text_addr & PAGE_MASK, map_size); > - if (error != (text_addr & PAGE_MASK)) { > - send_sig(SIGKILL, current, 0); > + if (error != (text_addr & PAGE_MASK)) > return error; > - } > > error = bprm->file->f_op->read(bprm->file, > (char __user *)text_addr, > ex.a_text+ex.a_data, &pos); > - if ((signed long)error < 0) { > - send_sig(SIGKILL, current, 0); > + if ((signed long)error < 0) > return error; > - } > - > + > flush_icache_range(text_addr, text_addr+ex.a_text+ex.a_data); > } else { > if ((ex.a_text & 0xfff || ex.a_data & 0xfff) && > @@ -327,28 +320,22 @@ static int load_aout_binary(struct linux_binprm * bprm) > MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE, > fd_offset); > > - if (error != N_TXTADDR(ex)) { > - send_sig(SIGKILL, current, 0); > + if (error != N_TXTADDR(ex)) > return error; > - } > > error = vm_mmap(bprm->file, N_DATADDR(ex), ex.a_data, > PROT_READ | PROT_WRITE | PROT_EXEC, > MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE, > fd_offset + ex.a_text); > - if (error != N_DATADDR(ex)) { > - send_sig(SIGKILL, current, 0); > + if (error != N_DATADDR(ex)) > return error; > - } > } > beyond_if: > set_binfmt(&aout_format); > > retval = set_brk(current->mm->start_brk, current->mm->brk); > - if (retval < 0) { > - send_sig(SIGKILL, current, 0); > + if (retval < 0) > return retval; > - } > > current->mm->start_stack = > (unsigned long) create_aout_tables((char __user *) bprm->p, bprm); > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c > index 11e078a..50e9194 100644 > --- a/fs/binfmt_elf.c > +++ b/fs/binfmt_elf.c > @@ -734,10 +734,8 @@ static int load_elf_binary(struct linux_binprm *bprm) > current->mm->cached_hole_size = 0; > retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP), > executable_stack); > - if (retval < 0) { > - send_sig(SIGKILL, current, 0); > + if (retval < 0) > goto out_free_dentry; > - } > > current->mm->start_stack = bprm->p; > > @@ -759,10 +757,8 @@ static int load_elf_binary(struct linux_binprm *bprm) > and clear the area. */ > retval = set_brk(elf_bss + load_bias, > elf_brk + load_bias); > - if (retval) { > - send_sig(SIGKILL, current, 0); > + if (retval) > goto out_free_dentry; > - } > nbyte = ELF_PAGEOFFSET(elf_bss); > if (nbyte) { > nbyte = ELF_MIN_ALIGN - nbyte; > @@ -815,7 +811,6 @@ static int load_elf_binary(struct linux_binprm *bprm) > error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt, > elf_prot, elf_flags, 0); > if (BAD_ADDR(error)) { > - send_sig(SIGKILL, current, 0); > retval = IS_ERR((void *)error) ? > PTR_ERR((void*)error) : -EINVAL; > goto out_free_dentry; > @@ -846,7 +841,6 @@ static int load_elf_binary(struct linux_binprm *bprm) > elf_ppnt->p_memsz > TASK_SIZE || > TASK_SIZE - elf_ppnt->p_memsz < k) { > /* set_brk can never work. Avoid overflows. */ > - send_sig(SIGKILL, current, 0); > retval = -EINVAL; > goto out_free_dentry; > } > @@ -878,12 +872,10 @@ static int load_elf_binary(struct linux_binprm *bprm) > * up getting placed where the bss needs to go. > */ > retval = set_brk(elf_bss, elf_brk); > - if (retval) { > - send_sig(SIGKILL, current, 0); > + if (retval) > goto out_free_dentry; > - } > + > if (likely(elf_bss != elf_brk) && unlikely(padzero(elf_bss))) { > - send_sig(SIGSEGV, current, 0); > retval = -EFAULT; /* Nobody gets to see this, but.. */ > goto out_free_dentry; > } > @@ -929,19 +921,15 @@ static int load_elf_binary(struct linux_binprm *bprm) > > #ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES > retval = arch_setup_additional_pages(bprm, !!elf_interpreter); > - if (retval < 0) { > - send_sig(SIGKILL, current, 0); > + if (retval < 0) > goto out; > - } > #endif /* ARCH_HAS_SETUP_ADDITIONAL_PAGES */ > > install_exec_creds(bprm); > retval = create_elf_tables(bprm, &loc->elf_ex, > load_addr, interp_load_addr); > - if (retval < 0) { > - send_sig(SIGKILL, current, 0); > + if (retval < 0) > goto out; > - } > /* N.B. passed_fileno might not be initialized? */ > current->mm->end_code = end_code; > current->mm->start_code = start_code; > diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c > index 30de01c..5c03732 100644 > --- a/fs/binfmt_elf_fdpic.c > +++ b/fs/binfmt_elf_fdpic.c > @@ -317,8 +317,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) > goto error; > > /* there's now no turning back... the old userspace image is dead, > - * defunct, deceased, etc. after this point we have to exit via > - * error_kill */ > + * defunct, deceased, etc. */ > set_personality(PER_LINUX_FDPIC); > if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) > current->personality |= READ_IMPLIES_EXEC; > @@ -343,24 +342,22 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) > > retval = setup_arg_pages(bprm, current->mm->start_stack, > executable_stack); > - if (retval < 0) { > - send_sig(SIGKILL, current, 0); > - goto error_kill; > - } > + if (retval < 0) > + goto error; > #endif > > /* load the executable and interpreter into memory */ > retval = elf_fdpic_map_file(&exec_params, bprm->file, current->mm, > "executable"); > if (retval < 0) > - goto error_kill; > + goto error; > > if (interpreter_name) { > retval = elf_fdpic_map_file(&interp_params, interpreter, > current->mm, "interpreter"); > if (retval < 0) { > printk(KERN_ERR "Unable to load interpreter\n"); > - goto error_kill; > + goto error; > } > > allow_write_access(interpreter); > @@ -397,7 +394,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) > if (IS_ERR_VALUE(current->mm->start_brk)) { > retval = current->mm->start_brk; > current->mm->start_brk = 0; > - goto error_kill; > + goto error; > } > > current->mm->brk = current->mm->start_brk; > @@ -410,7 +407,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) > install_exec_creds(bprm); > if (create_elf_fdpic_tables(bprm, current->mm, > &exec_params, &interp_params) < 0) > - goto error_kill; > + goto error; > > kdebug("- start_code %lx", current->mm->start_code); > kdebug("- end_code %lx", current->mm->end_code); > @@ -449,12 +446,6 @@ error: > kfree(interp_params.phdrs); > kfree(interp_params.loadmap); > return retval; > - > - /* unrecoverable error - kill the process */ > -error_kill: > - send_sig(SIGSEGV, current, 0); > - goto error; > - > } > > /*****************************************************************************/ > diff --git a/fs/exec.c b/fs/exec.c > index 7b6f4d5..027b2de 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1416,28 +1416,28 @@ int search_binary_handler(struct linux_binprm *bprm) > } > read_lock(&binfmt_lock); > put_binfmt(fmt); > - if (retval != -ENOEXEC || bprm->mm == NULL) > - break; > - if (!bprm->file) { > + if (bprm->mm == NULL) { > + /* we got to flush_old_exec() and failed after it */ > + send_sig(SIGSEGV, current, 0); This might be a stupid miscue on my part, but shouldn't it be force_sig instead of send_sig? I've got this crazy hunch that having SEGV masked might muck something up. > + read_unlock(&binfmt_lock); > + return retval; > + } > + if (retval != -ENOEXEC || !bprm->file) { > read_unlock(&binfmt_lock); > return retval; > } > } > read_unlock(&binfmt_lock); > #ifdef CONFIG_MODULES > - if (retval != -ENOEXEC || bprm->mm == NULL) { > - break; > - } else { > #define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7e)) > - if (printable(bprm->buf[0]) && > - printable(bprm->buf[1]) && > - printable(bprm->buf[2]) && > - printable(bprm->buf[3])) > - break; /* -ENOEXEC */ > - if (try) > - break; /* -ENOEXEC */ > - request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2])); > - } > + if (printable(bprm->buf[0]) && > + printable(bprm->buf[1]) && > + printable(bprm->buf[2]) && > + printable(bprm->buf[3])) > + break; /* -ENOEXEC */ > + if (try) > + break; /* -ENOEXEC */ > + request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2])); > #else > break; > #endif > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-15 23:12 ` Shentino @ 2013-02-16 0:04 ` Al Viro 2013-02-16 0:38 ` Maciej W. Rozycki ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: Al Viro @ 2013-02-16 0:04 UTC (permalink / raw) To: Shentino; +Cc: Linus Torvalds, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 03:12:30PM -0800, Shentino wrote: > > + send_sig(SIGSEGV, current, 0); > > This might be a stupid miscue on my part, but shouldn't it be > force_sig instead of send_sig? > > I've got this crazy hunch that having SEGV masked might muck something up. How would you manage to have it masked at that point? setup_new_exec() is inevitable after success of flush_old_exec() and it will do flush_signal_handlers() for us. And yes, flush_old_exec() and setup_new_exec() ought to be merged; the problem with that is the stuff done between those two - setting personality, plus playing with thread flags if needed. Unfortunately, there are non-obvious differences between architectures, so that would have to be hashed out on linux-arch. Doesn't affect the point above, though... ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 0:04 ` Al Viro @ 2013-02-16 0:38 ` Maciej W. Rozycki 2013-02-16 0:38 ` Shentino 2013-02-16 0:40 ` Linus Torvalds 2 siblings, 0 replies; 16+ messages in thread From: Maciej W. Rozycki @ 2013-02-16 0:38 UTC (permalink / raw) To: Al Viro; +Cc: Shentino, Linus Torvalds, Linux Kernel Mailing List On Sat, 16 Feb 2013, Al Viro wrote: > > > + send_sig(SIGSEGV, current, 0); > > > > This might be a stupid miscue on my part, but shouldn't it be > > force_sig instead of send_sig? > > > > I've got this crazy hunch that having SEGV masked might muck something up. > > How would you manage to have it masked at that point? setup_new_exec() > is inevitable after success of flush_old_exec() and it will do > flush_signal_handlers() for us. So just to be completely safe here -- is your proposed change going to affect processes being traced anyhow? E.g. won't GDB see some sort of a limbo state when the child is terminated this way? According to ptrace(2) man page SIGKILL is the only exception to the usual child signal trapping policy. Maciej ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 0:04 ` Al Viro 2013-02-16 0:38 ` Maciej W. Rozycki @ 2013-02-16 0:38 ` Shentino 2013-02-16 0:46 ` Shentino 2013-02-16 0:40 ` Linus Torvalds 2 siblings, 1 reply; 16+ messages in thread From: Shentino @ 2013-02-16 0:38 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 4:04 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > How would you manage to have it masked at that point? setup_new_exec() > is inevitable after success of flush_old_exec() and it will do > flush_signal_handlers() for us. I wouldn't know for sure but I read somewhere that even if execve resets handled signals, it didn't also say that ignored signals were also reset. (Source: execve man page.) Plus I've always seen "force_sig" instead of "send_sig" for errors that can't just be ignored. Then again I'm a complete n00b at linux hacking so I could very well be mistaken. It just looked funny considering what I'm used to seeing in the source. I'm hoping this is just me being an idiot and that there's nothing to worry about. > And yes, flush_old_exec() and setup_new_exec() ought to be merged; the > problem with that is the stuff done between those two - setting personality, > plus playing with thread flags if needed. Unfortunately, there are non-obvious > differences between architectures, so that would have to be hashed out on > linux-arch. Doesn't affect the point above, though... ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 0:38 ` Shentino @ 2013-02-16 0:46 ` Shentino 2013-02-16 1:50 ` Al Viro 0 siblings, 1 reply; 16+ messages in thread From: Shentino @ 2013-02-16 0:46 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 4:38 PM, Shentino <shentino@gmail.com> wrote: > On Fri, Feb 15, 2013 at 4:04 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: >> How would you manage to have it masked at that point? setup_new_exec() >> is inevitable after success of flush_old_exec() and it will do >> flush_signal_handlers() for us. > > I wouldn't know for sure but I read somewhere that even if execve > resets handled signals, it didn't also say that ignored signals were > also reset. (Source: execve man page.) Also, apologies for the terminology mix-up. By masked, I mean that the signal was ignored as directed by userspace a-la signal(SIGSEGV, SIG_IGN). Plus I *think* that signal ignore masks are preserved across an exec. Again, I might just be a clueless newbie here. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 0:46 ` Shentino @ 2013-02-16 1:50 ` Al Viro 2013-02-16 2:20 ` Al Viro 0 siblings, 1 reply; 16+ messages in thread From: Al Viro @ 2013-02-16 1:50 UTC (permalink / raw) To: Shentino; +Cc: Linus Torvalds, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 04:46:43PM -0800, Shentino wrote: > On Fri, Feb 15, 2013 at 4:38 PM, Shentino <shentino@gmail.com> wrote: > > On Fri, Feb 15, 2013 at 4:04 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > >> How would you manage to have it masked at that point? setup_new_exec() > >> is inevitable after success of flush_old_exec() and it will do > >> flush_signal_handlers() for us. > > > > I wouldn't know for sure but I read somewhere that even if execve > > resets handled signals, it didn't also say that ignored signals were > > also reset. (Source: execve man page.) > > Also, apologies for the terminology mix-up. By masked, I mean that > the signal was ignored as directed by userspace a-la signal(SIGSEGV, > SIG_IGN). > > Plus I *think* that signal ignore masks are preserved across an exec. You are correct. OK, what it means is that we do need that force_sigsegv() - either there or in all places in ->load_binary() where we currently have send_sig_info(SIGSEGV). I don't think that it's an urgent hole, but yes, it is a bug. Nice catch. That said, I'd definitely prefer to fix it in one place, rather than in a bunch of locations in ->load_binary() instances. Updated diff follows: handle suicide on late failure exits in execve() in search_binary_handler() ... rather than doing that in the guts of ->load_binary(). [updated to fix the bug spotted by Shentino - for SIGSEGV we really need something stronger than send_sig_info(); again, better do that in one place] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 03abf9b..a8066b7 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -313,11 +313,8 @@ static int load_aout_binary(struct linux_binprm *bprm) current->mm->cached_hole_size = 0; retval = setup_arg_pages(bprm, IA32_STACK_TOP, EXSTACK_DEFAULT); - if (retval < 0) { - /* Someone check-me: is this error path enough? */ - send_sig(SIGKILL, current, 0); + if (retval < 0) return retval; - } install_exec_creds(bprm); @@ -332,18 +329,14 @@ static int load_aout_binary(struct linux_binprm *bprm) error = vm_brk(text_addr & PAGE_MASK, map_size); - if (error != (text_addr & PAGE_MASK)) { - send_sig(SIGKILL, current, 0); + if (error != (text_addr & PAGE_MASK)) return error; - } error = bprm->file->f_op->read(bprm->file, (char __user *)text_addr, ex.a_text+ex.a_data, &pos); - if ((signed long)error < 0) { - send_sig(SIGKILL, current, 0); + if ((signed long)error < 0) return error; - } flush_icache_range(text_addr, text_addr+ex.a_text+ex.a_data); } else { @@ -385,20 +378,16 @@ static int load_aout_binary(struct linux_binprm *bprm) MAP_EXECUTABLE | MAP_32BIT, fd_offset); - if (error != N_TXTADDR(ex)) { - send_sig(SIGKILL, current, 0); + if (error != N_TXTADDR(ex)) return error; - } error = vm_mmap(bprm->file, N_DATADDR(ex), ex.a_data, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE | MAP_32BIT, fd_offset + ex.a_text); - if (error != N_DATADDR(ex)) { - send_sig(SIGKILL, current, 0); + if (error != N_DATADDR(ex)) return error; - } } beyond_if: set_binfmt(&aout_format); diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index bbc8f88..a1c236d 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -260,11 +260,8 @@ static int load_aout_binary(struct linux_binprm * bprm) current->mm->cached_hole_size = 0; retval = setup_arg_pages(bprm, STACK_TOP, EXSTACK_DEFAULT); - if (retval < 0) { - /* Someone check-me: is this error path enough? */ - send_sig(SIGKILL, current, 0); + if (retval < 0) return retval; - } install_exec_creds(bprm); @@ -282,19 +279,15 @@ static int load_aout_binary(struct linux_binprm * bprm) map_size = ex.a_text+ex.a_data; #endif error = vm_brk(text_addr & PAGE_MASK, map_size); - if (error != (text_addr & PAGE_MASK)) { - send_sig(SIGKILL, current, 0); + if (error != (text_addr & PAGE_MASK)) return error; - } error = bprm->file->f_op->read(bprm->file, (char __user *)text_addr, ex.a_text+ex.a_data, &pos); - if ((signed long)error < 0) { - send_sig(SIGKILL, current, 0); + if ((signed long)error < 0) return error; - } - + flush_icache_range(text_addr, text_addr+ex.a_text+ex.a_data); } else { if ((ex.a_text & 0xfff || ex.a_data & 0xfff) && @@ -327,28 +320,22 @@ static int load_aout_binary(struct linux_binprm * bprm) MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE, fd_offset); - if (error != N_TXTADDR(ex)) { - send_sig(SIGKILL, current, 0); + if (error != N_TXTADDR(ex)) return error; - } error = vm_mmap(bprm->file, N_DATADDR(ex), ex.a_data, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE, fd_offset + ex.a_text); - if (error != N_DATADDR(ex)) { - send_sig(SIGKILL, current, 0); + if (error != N_DATADDR(ex)) return error; - } } beyond_if: set_binfmt(&aout_format); retval = set_brk(current->mm->start_brk, current->mm->brk); - if (retval < 0) { - send_sig(SIGKILL, current, 0); + if (retval < 0) return retval; - } current->mm->start_stack = (unsigned long) create_aout_tables((char __user *) bprm->p, bprm); diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 11e078a..50e9194 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -734,10 +734,8 @@ static int load_elf_binary(struct linux_binprm *bprm) current->mm->cached_hole_size = 0; retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP), executable_stack); - if (retval < 0) { - send_sig(SIGKILL, current, 0); + if (retval < 0) goto out_free_dentry; - } current->mm->start_stack = bprm->p; @@ -759,10 +757,8 @@ static int load_elf_binary(struct linux_binprm *bprm) and clear the area. */ retval = set_brk(elf_bss + load_bias, elf_brk + load_bias); - if (retval) { - send_sig(SIGKILL, current, 0); + if (retval) goto out_free_dentry; - } nbyte = ELF_PAGEOFFSET(elf_bss); if (nbyte) { nbyte = ELF_MIN_ALIGN - nbyte; @@ -815,7 +811,6 @@ static int load_elf_binary(struct linux_binprm *bprm) error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt, elf_prot, elf_flags, 0); if (BAD_ADDR(error)) { - send_sig(SIGKILL, current, 0); retval = IS_ERR((void *)error) ? PTR_ERR((void*)error) : -EINVAL; goto out_free_dentry; @@ -846,7 +841,6 @@ static int load_elf_binary(struct linux_binprm *bprm) elf_ppnt->p_memsz > TASK_SIZE || TASK_SIZE - elf_ppnt->p_memsz < k) { /* set_brk can never work. Avoid overflows. */ - send_sig(SIGKILL, current, 0); retval = -EINVAL; goto out_free_dentry; } @@ -878,12 +872,10 @@ static int load_elf_binary(struct linux_binprm *bprm) * up getting placed where the bss needs to go. */ retval = set_brk(elf_bss, elf_brk); - if (retval) { - send_sig(SIGKILL, current, 0); + if (retval) goto out_free_dentry; - } + if (likely(elf_bss != elf_brk) && unlikely(padzero(elf_bss))) { - send_sig(SIGSEGV, current, 0); retval = -EFAULT; /* Nobody gets to see this, but.. */ goto out_free_dentry; } @@ -929,19 +921,15 @@ static int load_elf_binary(struct linux_binprm *bprm) #ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES retval = arch_setup_additional_pages(bprm, !!elf_interpreter); - if (retval < 0) { - send_sig(SIGKILL, current, 0); + if (retval < 0) goto out; - } #endif /* ARCH_HAS_SETUP_ADDITIONAL_PAGES */ install_exec_creds(bprm); retval = create_elf_tables(bprm, &loc->elf_ex, load_addr, interp_load_addr); - if (retval < 0) { - send_sig(SIGKILL, current, 0); + if (retval < 0) goto out; - } /* N.B. passed_fileno might not be initialized? */ current->mm->end_code = end_code; current->mm->start_code = start_code; diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index 30de01c..5c03732 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -317,8 +317,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) goto error; /* there's now no turning back... the old userspace image is dead, - * defunct, deceased, etc. after this point we have to exit via - * error_kill */ + * defunct, deceased, etc. */ set_personality(PER_LINUX_FDPIC); if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) current->personality |= READ_IMPLIES_EXEC; @@ -343,24 +342,22 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) retval = setup_arg_pages(bprm, current->mm->start_stack, executable_stack); - if (retval < 0) { - send_sig(SIGKILL, current, 0); - goto error_kill; - } + if (retval < 0) + goto error; #endif /* load the executable and interpreter into memory */ retval = elf_fdpic_map_file(&exec_params, bprm->file, current->mm, "executable"); if (retval < 0) - goto error_kill; + goto error; if (interpreter_name) { retval = elf_fdpic_map_file(&interp_params, interpreter, current->mm, "interpreter"); if (retval < 0) { printk(KERN_ERR "Unable to load interpreter\n"); - goto error_kill; + goto error; } allow_write_access(interpreter); @@ -397,7 +394,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) if (IS_ERR_VALUE(current->mm->start_brk)) { retval = current->mm->start_brk; current->mm->start_brk = 0; - goto error_kill; + goto error; } current->mm->brk = current->mm->start_brk; @@ -410,7 +407,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) install_exec_creds(bprm); if (create_elf_fdpic_tables(bprm, current->mm, &exec_params, &interp_params) < 0) - goto error_kill; + goto error; kdebug("- start_code %lx", current->mm->start_code); kdebug("- end_code %lx", current->mm->end_code); @@ -449,12 +446,6 @@ error: kfree(interp_params.phdrs); kfree(interp_params.loadmap); return retval; - - /* unrecoverable error - kill the process */ -error_kill: - send_sig(SIGSEGV, current, 0); - goto error; - } /*****************************************************************************/ diff --git a/fs/exec.c b/fs/exec.c index 7b6f4d5..6a6ddd4 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1416,28 +1416,28 @@ int search_binary_handler(struct linux_binprm *bprm) } read_lock(&binfmt_lock); put_binfmt(fmt); - if (retval != -ENOEXEC || bprm->mm == NULL) - break; - if (!bprm->file) { + if (bprm->mm == NULL) { + /* we got to flush_old_exec() and failed after it */ + force_sigsegv(SIGSEGV, current); + read_unlock(&binfmt_lock); + return retval; + } + if (retval != -ENOEXEC || !bprm->file) { read_unlock(&binfmt_lock); return retval; } } read_unlock(&binfmt_lock); #ifdef CONFIG_MODULES - if (retval != -ENOEXEC || bprm->mm == NULL) { - break; - } else { #define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7e)) - if (printable(bprm->buf[0]) && - printable(bprm->buf[1]) && - printable(bprm->buf[2]) && - printable(bprm->buf[3])) - break; /* -ENOEXEC */ - if (try) - break; /* -ENOEXEC */ - request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2])); - } + if (printable(bprm->buf[0]) && + printable(bprm->buf[1]) && + printable(bprm->buf[2]) && + printable(bprm->buf[3])) + break; /* -ENOEXEC */ + if (try) + break; /* -ENOEXEC */ + request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2])); #else break; #endif ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 1:50 ` Al Viro @ 2013-02-16 2:20 ` Al Viro 2013-02-16 7:20 ` Raymond Jennings 0 siblings, 1 reply; 16+ messages in thread From: Al Viro @ 2013-02-16 2:20 UTC (permalink / raw) To: Shentino; +Cc: Linus Torvalds, Linux Kernel Mailing List On Sat, Feb 16, 2013 at 01:50:24AM +0000, Al Viro wrote: > On Fri, Feb 15, 2013 at 04:46:43PM -0800, Shentino wrote: > > On Fri, Feb 15, 2013 at 4:38 PM, Shentino <shentino@gmail.com> wrote: > > > On Fri, Feb 15, 2013 at 4:04 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > >> How would you manage to have it masked at that point? setup_new_exec() > > >> is inevitable after success of flush_old_exec() and it will do > > >> flush_signal_handlers() for us. > > > > > > I wouldn't know for sure but I read somewhere that even if execve > > > resets handled signals, it didn't also say that ignored signals were > > > also reset. (Source: execve man page.) > > > > Also, apologies for the terminology mix-up. By masked, I mean that > > the signal was ignored as directed by userspace a-la signal(SIGSEGV, > > SIG_IGN). > > > > Plus I *think* that signal ignore masks are preserved across an exec. > > You are correct. OK, what it means is that we do need that force_sigsegv() - > either there or in all places in ->load_binary() where we currently have > send_sig_info(SIGSEGV). I don't think that it's an urgent hole, but yes, > it is a bug. Nice catch. Arrgh... OK, I'm a blind idiot. These places in binfmt_elf.c currently use force_sig(), not send_sig_info(). Currently == since 2006 when somebody noticed the problem. Their counterparts in binfmt_elf_fdpic.c were *not* noticed. Anyway, that definitely means we want to do it in a single commit; the only remaining question is whether we have any problems with somebody ptracing such execve() and then poking the sucker with ptrace(); that _can_ happen with the current mainline for ELF binaries, so this is not something new. I'm low on coffee and about to crash, so I might be missing some horrible problem with it, but in this case I'm fairly sure that such a problem would be present in current mainline. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 2:20 ` Al Viro @ 2013-02-16 7:20 ` Raymond Jennings 2013-02-16 7:43 ` Al Viro 0 siblings, 1 reply; 16+ messages in thread From: Raymond Jennings @ 2013-02-16 7:20 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 6:20 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > Arrgh... OK, I'm a blind idiot. These places in binfmt_elf.c currently use > force_sig(), not send_sig_info(). Currently == since 2006 when somebody > noticed the problem. Their counterparts in binfmt_elf_fdpic.c were *not* > noticed. Anyway, that definitely means we want to do it in a single commit; > the only remaining question is whether we have any problems with somebody > ptracing such execve() and then poking the sucker with ptrace(); Personally if I was ptracing another process, I'd be flummoxed if I saw it get nailed with a fatal segfault that I somehow wasn't allowed to intercept. An even bigger question might be why an execve is allowed to get into an unrecoverable state to begin with. Assuming that one builds the new mm_struct and whatnot BEFORE discarding old state, why would execve be in a position for a fatal error in the first place? > that _can_ > happen with the current mainline for ELF binaries, so this is not something > new. I'm low on coffee and about to crash, so I might be missing some > horrible problem with it, but in this case I'm fairly sure that such a problem > would be present in current mainline. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 7:20 ` Raymond Jennings @ 2013-02-16 7:43 ` Al Viro 2013-02-16 8:13 ` Raymond Jennings 0 siblings, 1 reply; 16+ messages in thread From: Al Viro @ 2013-02-16 7:43 UTC (permalink / raw) To: Raymond Jennings; +Cc: Linus Torvalds, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 11:20:18PM -0800, Raymond Jennings wrote: > An even bigger question might be why an execve is allowed to get into > an unrecoverable state to begin with. Assuming that one builds the > new mm_struct and whatnot BEFORE discarding old state, why would > execve be in a position for a fatal error in the first place? When would you kill the rest of thread group? Take a look at de_thread() - we are not just replacing ->mm during execve(). Signal delivery logics, etc. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 7:43 ` Al Viro @ 2013-02-16 8:13 ` Raymond Jennings 0 siblings, 0 replies; 16+ messages in thread From: Raymond Jennings @ 2013-02-16 8:13 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, Linux Kernel Mailing List My mistake then. I figured there was always a way to back out of something like this. Carry on. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 0:04 ` Al Viro 2013-02-16 0:38 ` Maciej W. Rozycki 2013-02-16 0:38 ` Shentino @ 2013-02-16 0:40 ` Linus Torvalds 2013-02-16 1:22 ` Al Viro 2 siblings, 1 reply; 16+ messages in thread From: Linus Torvalds @ 2013-02-16 0:40 UTC (permalink / raw) To: Al Viro; +Cc: Shentino, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 4:04 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Fri, Feb 15, 2013 at 03:12:30PM -0800, Shentino wrote: >> > + send_sig(SIGSEGV, current, 0); >> >> This might be a stupid miscue on my part, but shouldn't it be >> force_sig instead of send_sig? >> >> I've got this crazy hunch that having SEGV masked might muck something up. > > How would you manage to have it masked at that point? setup_new_exec() > is inevitable after success of flush_old_exec() and it will do > flush_signal_handlers() for us. I have to agree with Shentino on this one: it's entirely possible that send_sig() is always equivalent to force_sig() in this circumstance, but rather than depend on that kind of non-local subtlety, we should just make it obvious. This is what "force_sig()" exists for - making it clear that we punch through any signal handlers. Whether such a signal handler can exist or not is kind of immaterial. Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 0:40 ` Linus Torvalds @ 2013-02-16 1:22 ` Al Viro 2013-02-16 1:44 ` Linus Torvalds 0 siblings, 1 reply; 16+ messages in thread From: Al Viro @ 2013-02-16 1:22 UTC (permalink / raw) To: Linus Torvalds; +Cc: Shentino, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 04:40:18PM -0800, Linus Torvalds wrote: > On Fri, Feb 15, 2013 at 4:04 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Fri, Feb 15, 2013 at 03:12:30PM -0800, Shentino wrote: > >> > + send_sig(SIGSEGV, current, 0); > >> > >> This might be a stupid miscue on my part, but shouldn't it be > >> force_sig instead of send_sig? > >> > >> I've got this crazy hunch that having SEGV masked might muck something up. > > > > How would you manage to have it masked at that point? setup_new_exec() > > is inevitable after success of flush_old_exec() and it will do > > flush_signal_handlers() for us. > > I have to agree with Shentino on this one: it's entirely possible that > send_sig() is always equivalent to force_sig() in this circumstance, > but rather than depend on that kind of non-local subtlety, we should > just make it obvious. This is what "force_sig()" exists for - making > it clear that we punch through any signal handlers. Whether such a > signal handler can exist or not is kind of immaterial. *shrug* Fine by me - the variant I'd posted simply moved these calls in one place; I've no problem with replacing them with force_sig() (or force_sigsegv(SIGSEGV, current), for paranoia sake). OTOH, I'd probably prefer to make it a separate commit. FWIW, now that I've looked into what's involved in merging flush_old_exec() and setup_new_exec()... Here's something that looks like a bug: #include <sys/personality.h> #include <unistd.h> main() { char *argv[] = {"uname", "-m", "-r", NULL}; char *envp[] = {NULL}; personality(0x0020000); /* UNAME26 */ execve("/bin/uname", argv, envp); } On amd64 testbox (3.0.60-based kernel): 2.6.40+ x86_64 On alpha: 3.3.6+ alpha Cause: SET_PERSONALITY() on alpha doesn't care to preserve the upper bits of current->personality and just does either set_personality(PER_LINUX) or set_personality(PER_LINUX_32BIT). ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC] SIGKILL vs. SIGSEGV on late execve() failures 2013-02-16 1:22 ` Al Viro @ 2013-02-16 1:44 ` Linus Torvalds 0 siblings, 0 replies; 16+ messages in thread From: Linus Torvalds @ 2013-02-16 1:44 UTC (permalink / raw) To: Al Viro; +Cc: Shentino, Linux Kernel Mailing List On Fri, Feb 15, 2013 at 5:22 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > Fine by me - the variant I'd posted simply moved these calls in one > place; I've no problem with replacing them with force_sig() (or > force_sigsegv(SIGSEGV, current), for paranoia sake). OTOH, I'd probably > prefer to make it a separate commit. I'd argue that it actually makes sense in the same commit - since you're changing SIGKILL to SIGSEGV, and what makes SIGKILL special really is that it can never be blocked. So even conceptually, it's actually *that* change that it should couple with. But I don't care enough, so if you already just have the one commit and want to add on top of it with a second one, that's fine. > Cause: SET_PERSONALITY() on alpha doesn't care to preserve the upper bits > of current->personality and just does either set_personality(PER_LINUX) or > set_personality(PER_LINUX_32BIT). Christ, we should just try to get rid of the personality bits entirely, they are completely insane. We don't even use most of the numbers, yet we have these odd personality numbers for emulating legacy Unixes that we don't even care about. Some of it *may* have made sense back when we actually thought things like iBCS2 would matter. I cannot believe it possibly matters any more. But nobody seems to care enough, and the numbers have slipped into user space. I do wonder if we could make things clearer, get rid of the insane internal "personality number", and instead have explicit bits for the few things that actually do matter (like the whole zero-page mapping, the address randomization etc). And then leave the insane "set_personality()" system call as a legacy interface that just looks at the number and moves the bits around appropriately. The fact that we actually carry this insane meaningless number along, and then have crazy things like that "upper bits of the crazy number doesn't get set by alpha SET_PERSONALITY" is just crazy. But don't worry, I sympathize 100% with all sane people who go "I don't want to touch that mess". Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2013-02-16 8:14 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-02-14 5:36 [RFC] SIGKILL vs. SIGSEGV on late execve() failures Al Viro 2013-02-15 20:02 ` Linus Torvalds 2013-02-15 21:59 ` Al Viro 2013-02-15 23:12 ` Shentino 2013-02-16 0:04 ` Al Viro 2013-02-16 0:38 ` Maciej W. Rozycki 2013-02-16 0:38 ` Shentino 2013-02-16 0:46 ` Shentino 2013-02-16 1:50 ` Al Viro 2013-02-16 2:20 ` Al Viro 2013-02-16 7:20 ` Raymond Jennings 2013-02-16 7:43 ` Al Viro 2013-02-16 8:13 ` Raymond Jennings 2013-02-16 0:40 ` Linus Torvalds 2013-02-16 1:22 ` Al Viro 2013-02-16 1:44 ` Linus Torvalds
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).