* Core dumps for threads @ 2001-02-24 21:45 Don Dugger 2001-02-24 21:57 ` Alan Cox 0 siblings, 1 reply; 11+ messages in thread From: Don Dugger @ 2001-02-24 21:45 UTC (permalink / raw) To: linux-kernel Can anyone explain why this test is in routine `do_coredump' in file `fs/exec.c' in the 2.4.0 kernel? if (!current->dumpable || atomic_read(¤t->mm->mm_users) != 1) goto fail; The only thing the test on `mm_users' seems to be doing is stopping a thread process from dumping core. What's the rationale for this? -- Don Dugger "Censeo Toto nos in Kansa esse decisse." - D. Gale n0ano@valinux.com Ph: 303/938-9838 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Core dumps for threads 2001-02-24 21:45 Core dumps for threads Don Dugger @ 2001-02-24 21:57 ` Alan Cox 2001-02-25 9:15 ` Chris Wedgwood 2001-02-27 17:29 ` [PATCH] " Don Dugger 0 siblings, 2 replies; 11+ messages in thread From: Alan Cox @ 2001-02-24 21:57 UTC (permalink / raw) To: Don Dugger; +Cc: linux-kernel > Can anyone explain why this test is in routine `do_coredump' > in file `fs/exec.c' in the 2.4.0 kernel? > > if (!current->dumpable || atomic_read(¤t->mm->mm_users) != 1) > goto fail; > > The only thing the test on `mm_users' seems to be doing is stopping > a thread process from dumping core. What's the rationale for this? The I/O to dump the core would race other changes on the mm. The right fix is probably to copy the mm (as fork does) then dump the copy. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Core dumps for threads 2001-02-24 21:57 ` Alan Cox @ 2001-02-25 9:15 ` Chris Wedgwood 2001-02-25 9:40 ` Adam Fritzler ` (2 more replies) 2001-02-27 17:29 ` [PATCH] " Don Dugger 1 sibling, 3 replies; 11+ messages in thread From: Chris Wedgwood @ 2001-02-25 9:15 UTC (permalink / raw) To: Alan Cox; +Cc: Don Dugger, linux-kernel On Sat, Feb 24, 2001 at 09:57:44PM +0000, Alan Cox wrote: The I/O to dump the core would race other changes on the mm. The right fix is probably to copy the mm (as fork does) then dump the copy. Stupid question... but since all threads see the same memory space as each other; can we not lock the entire vma for the process whilst it's being written out? --cw ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Core dumps for threads 2001-02-25 9:15 ` Chris Wedgwood @ 2001-02-25 9:40 ` Adam Fritzler 2001-02-26 0:36 ` davids 2001-02-25 10:28 ` Andi Kleen 2001-02-25 13:57 ` Alan Cox 2 siblings, 1 reply; 11+ messages in thread From: Adam Fritzler @ 2001-02-25 9:40 UTC (permalink / raw) To: linux-kernel Theres a patch floating around that does just that. Its an obvious hack. I would like to see something clean get into the mainstream kernels. Its a real pain not to have cores for threaded code. It does work, however. It effectively dumps the thread that caused the fault. (I have a complimentary hack that will dump the stacks of all the rest of the threads as well (though its a good trick to get gdb to interpret this). Available upon request.) af On Sun, 25 Feb 2001, Chris Wedgwood wrote: > On Sat, Feb 24, 2001 at 09:57:44PM +0000, Alan Cox wrote: > > The I/O to dump the core would race other changes on the mm. The > right fix is probably to copy the mm (as fork does) then dump the > copy. > > Stupid question... but since all threads see the same memory space as > each other; can we not lock the entire vma for the process whilst > it's being written out? > > > > --cw > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Core dumps for threads 2001-02-25 9:40 ` Adam Fritzler @ 2001-02-26 0:36 ` davids 0 siblings, 0 replies; 11+ messages in thread From: davids @ 2001-02-26 0:36 UTC (permalink / raw) To: Adam Fritzler, linux-kernel > It does work, however. It effectively dumps the thread that caused the > fault. If you want that behavior, catch SIGSEGV, fork, and have the child process (in which only the faulting thread exists) call abort. DS ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Core dumps for threads 2001-02-25 9:15 ` Chris Wedgwood 2001-02-25 9:40 ` Adam Fritzler @ 2001-02-25 10:28 ` Andi Kleen 2001-02-25 11:03 ` Chris Wedgwood 2001-02-25 13:57 ` Alan Cox 2 siblings, 1 reply; 11+ messages in thread From: Andi Kleen @ 2001-02-25 10:28 UTC (permalink / raw) To: Chris Wedgwood; +Cc: linux-kernel Chris Wedgwood <cw@f00f.org> writes: > On Sat, Feb 24, 2001 at 09:57:44PM +0000, Alan Cox wrote: > > The I/O to dump the core would race other changes on the mm. The > right fix is probably to copy the mm (as fork does) then dump the > copy. > > Stupid question... but since all threads see the same memory space as > each other; can we not lock the entire vma for the process whilst > it's being written out? It would need a recursive mm semaphore -- core dumps can page fault and page faults take the semaphore again. Other alternative is to copy the MM like fork before dumping, but then core dumping could fail much quicker when you ran out of memory. -Andi ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Core dumps for threads 2001-02-25 10:28 ` Andi Kleen @ 2001-02-25 11:03 ` Chris Wedgwood 0 siblings, 0 replies; 11+ messages in thread From: Chris Wedgwood @ 2001-02-25 11:03 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-kernel On Sun, Feb 25, 2001 at 11:28:14AM +0100, Andi Kleen wrote: It would need a recursive mm semaphore -- core dumps can page fault and page faults take the semaphore again. Other alternative is to copy the MM like fork before dumping, but then core dumping could fail much quicker when you ran out of memory. Ouch... would creating another semaphore or flag 'dumping_core' work or is that jut a bad idea? I ask this because like it or not; people use threaded applications and having threaded core dumps would be cool. Requiring a copy to dump isn't really elegant or an option if you run a process who's RSS is 400M on a 512MB machine as I do. --cw ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Core dumps for threads 2001-02-25 9:15 ` Chris Wedgwood 2001-02-25 9:40 ` Adam Fritzler 2001-02-25 10:28 ` Andi Kleen @ 2001-02-25 13:57 ` Alan Cox 2 siblings, 0 replies; 11+ messages in thread From: Alan Cox @ 2001-02-25 13:57 UTC (permalink / raw) To: Chris Wedgwood; +Cc: Alan Cox, Don Dugger, linux-kernel > On Sat, Feb 24, 2001 at 09:57:44PM +0000, Alan Cox wrote: > > The I/O to dump the core would race other changes on the mm. The > right fix is probably to copy the mm (as fork does) then dump the > copy. > > Stupid question... but since all threads see the same memory space as > each other; can we not lock the entire vma for the process whilst > it's being written out? It isnt the vma, its the entire mm you would need to lock. And I dont think you can do a deadlock free lock of that sanely, hence its better to copy the mm (thats pretty efficienty anyway as it wont copy the data) ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH] Core dumps for threads 2001-02-24 21:57 ` Alan Cox 2001-02-25 9:15 ` Chris Wedgwood @ 2001-02-27 17:29 ` Don Dugger 2001-02-27 17:55 ` Alan Cox 2001-02-27 17:57 ` Alan Cox 1 sibling, 2 replies; 11+ messages in thread From: Don Dugger @ 2001-02-27 17:29 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel, Alan Cox [-- Attachment #1: Type: text/plain, Size: 1348 bytes --] OK, I followed Alan's suggestion and here is a patch, relative to the 2.4.1 kernel, that copies the mm and dumps the corefile from that copy. Also, whenever there are multiple users of the original mm it creates the core dump in the file `core.n' where `n' is the PID of the offending process. I would contend that this fixes a bug and should be put into the 2.4 but if the concensus is that it's a new 2.5 feature that's fine. On Sat, Feb 24, 2001 at 09:57:44PM +0000, Alan Cox wrote: > > Can anyone explain why this test is in routine `do_coredump' > > in file `fs/exec.c' in the 2.4.0 kernel? > > > > if (!current->dumpable || atomic_read(¤t->mm->mm_users) != 1) > > goto fail; > > > > The only thing the test on `mm_users' seems to be doing is stopping > > a thread process from dumping core. What's the rationale for this? > > The I/O to dump the core would race other changes on the mm. The right fix > is probably to copy the mm (as fork does) then dump the copy. > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Don Dugger "Censeo Toto nos in Kansa esse decisse." - D. Gale n0ano@valinux.com Ph: 303/938-9838 [-- Attachment #2: p_core-0227.l --] [-- Type: text/plain, Size: 6178 bytes --] Index: linux-2.4/fs/binfmt_aout.c =================================================================== RCS file: /trillian/src/cvs_root/linux-2.4/fs/binfmt_aout.c,v retrieving revision 1.1 diff -u -r1.1 binfmt_aout.c --- linux-2.4/fs/binfmt_aout.c 2001/02/06 23:40:30 1.1 +++ linux-2.4/fs/binfmt_aout.c 2001/02/27 16:50:43 @@ -31,7 +31,8 @@ static int load_aout_binary(struct linux_binprm *, struct pt_regs * regs); static int load_aout_library(struct file*); -static int aout_core_dump(long signr, struct pt_regs * regs, struct file *file); +static int aout_core_dump(long signr, struct pt_regs * regs, + struct file *file, struct mm_struct * mm); extern void dump_thread(struct pt_regs *, struct user *); @@ -78,7 +79,8 @@ * dumping of the process results in another error.. */ -static int aout_core_dump(long signr, struct pt_regs * regs, struct file *file) +static int aout_core_dump(long signr, struct pt_regs * regs, + struct file * file, struct mm_struct * mm) { mm_segment_t fs; int has_dumped = 0; Index: linux-2.4/fs/binfmt_elf.c =================================================================== RCS file: /trillian/src/cvs_root/linux-2.4/fs/binfmt_elf.c,v retrieving revision 1.1 diff -u -r1.1 binfmt_elf.c --- linux-2.4/fs/binfmt_elf.c 2001/02/06 23:40:30 1.1 +++ linux-2.4/fs/binfmt_elf.c 2001/02/27 16:50:43 @@ -56,7 +56,8 @@ * don't even try. */ #ifdef USE_ELF_CORE_DUMP -static int elf_core_dump(long signr, struct pt_regs * regs, struct file * file); +static int elf_core_dump(long signr, struct pt_regs * regs, + struct file * file, struct mm_struct * mm); #else #define elf_core_dump NULL #endif @@ -981,7 +982,8 @@ * and then they are actually written out. If we run out of core limit * we just truncate. */ -static int elf_core_dump(long signr, struct pt_regs * regs, struct file * file) +static int elf_core_dump(long signr, struct pt_regs * regs, + struct file * file, struct mm_struct * mm) { int has_dumped = 0; mm_segment_t fs; @@ -998,7 +1000,7 @@ elf_fpregset_t fpu; /* NT_PRFPREG */ struct elf_prpsinfo psinfo; /* NT_PRPSINFO */ - segs = current->mm->map_count; + segs = mm->map_count; #ifdef DEBUG printk("elf_core_dump: %d segs %lu limit\n", segs, limit); @@ -1158,7 +1160,7 @@ dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE); /* Write program headers for segments dump */ - for(vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) { + for(vma = mm->mmap; vma != NULL; vma = vma->vm_next) { struct elf_phdr phdr; size_t sz; @@ -1187,7 +1189,7 @@ DUMP_SEEK(dataoff); - for(vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) { + for(vma = mm->mmap; vma != NULL; vma = vma->vm_next) { unsigned long addr; if (!maydump(vma)) Index: linux-2.4/fs/exec.c =================================================================== RCS file: /trillian/src/cvs_root/linux-2.4/fs/exec.c,v retrieving revision 1.2 diff -u -r1.2 exec.c --- linux-2.4/fs/exec.c 2001/02/07 01:17:26 1.2 +++ linux-2.4/fs/exec.c 2001/02/27 16:50:43 @@ -916,16 +916,18 @@ int do_coredump(long signr, struct pt_regs * regs) { + struct mm_struct *mm; struct linux_binfmt * binfmt; - char corename[6+sizeof(current->comm)]; + char corename[6+sizeof(current->comm)+10]; struct file * file; struct inode * inode; + int r; lock_kernel(); binfmt = current->binfmt; if (!binfmt || !binfmt->core_dump) goto fail; - if (!current->dumpable || atomic_read(¤t->mm->mm_users) != 1) + if (!current->dumpable) goto fail; current->dumpable = 0; if (current->rlim[RLIMIT_CORE].rlim_cur < binfmt->min_coredump) @@ -937,6 +939,8 @@ #else corename[4] = '\0'; #endif + if (atomic_read(¤t->mm->mm_users) != 1) + sprintf(&corename[4], ".%d", current->pid); file = filp_open(corename, O_CREAT | 2 | O_NOFOLLOW, 0600); if (IS_ERR(file)) goto fail; @@ -954,10 +958,29 @@ goto close_fail; if (do_truncate(file->f_dentry, 0) != 0) goto close_fail; - if (!binfmt->core_dump(signr, regs, file)) - goto close_fail; + /* + * Copy the mm structure to avoid potential races with + * other threads + */ + if ((mm = kmem_cache_alloc(mm_cachep, SLAB_KERNEL)) == NULL) + goto close_fail; + memcpy(mm, current->mm, sizeof(*mm)); + if (!mm_init(mm)) { + kmem_cache_free(mm_cachep, mm); + goto close_fail; + } + down(¤t->mm->mmap_sem); + r = dup_mmap(mm); + up(¤t->mm->mmap_sem); + if (r) { + mmput(mm); + goto close_fail; + } + r = binfmt->core_dump(signr, regs, file, mm); + mmput(mm); unlock_kernel(); - filp_close(file, NULL); + if (r) + filp_close(file, NULL); return 1; close_fail: Index: linux-2.4/include/linux/binfmts.h =================================================================== RCS file: /trillian/src/cvs_root/linux-2.4/include/linux/binfmts.h,v retrieving revision 1.1 diff -u -r1.1 binfmts.h --- linux-2.4/include/linux/binfmts.h 2001/02/06 23:41:21 1.1 +++ linux-2.4/include/linux/binfmts.h 2001/02/27 16:50:43 @@ -41,7 +41,8 @@ struct module *module; int (*load_binary)(struct linux_binprm *, struct pt_regs * regs); int (*load_shlib)(struct file *); - int (*core_dump)(long signr, struct pt_regs * regs, struct file * file); + int (*core_dump)(long signr, struct pt_regs * regs, + struct file * file, struct mm_struct *mm); unsigned long min_coredump; /* minimal dump size */ }; Index: linux-2.4/kernel/fork.c =================================================================== RCS file: /trillian/src/cvs_root/linux-2.4/kernel/fork.c,v retrieving revision 1.2 diff -u -r1.2 fork.c --- linux-2.4/kernel/fork.c 2001/02/07 01:17:29 1.2 +++ linux-2.4/kernel/fork.c 2001/02/27 16:50:43 @@ -122,7 +122,7 @@ return last_pid; } -static inline int dup_mmap(struct mm_struct * mm) +int dup_mmap(struct mm_struct * mm) { struct vm_area_struct * mpnt, *tmp, **pprev; int retval; @@ -197,7 +197,7 @@ #define allocate_mm() (kmem_cache_alloc(mm_cachep, SLAB_KERNEL)) #define free_mm(mm) (kmem_cache_free(mm_cachep, (mm))) -static struct mm_struct * mm_init(struct mm_struct * mm) +struct mm_struct * mm_init(struct mm_struct * mm) { atomic_set(&mm->mm_users, 1); atomic_set(&mm->mm_count, 1); ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Core dumps for threads 2001-02-27 17:29 ` [PATCH] " Don Dugger @ 2001-02-27 17:55 ` Alan Cox 2001-02-27 17:57 ` Alan Cox 1 sibling, 0 replies; 11+ messages in thread From: Alan Cox @ 2001-02-27 17:55 UTC (permalink / raw) To: Don Dugger; +Cc: Linus Torvalds, linux-kernel, Alan Cox > the 2.4.1 kernel, that copies the mm and dumps the corefile from > that copy. Also, whenever there are multiple users of the original > mm it creates the core dump in the file `core.n' where `n' is the > PID of the offending process. > + if ((mm = kmem_cache_alloc(mm_cachep, SLAB_KERNEL)) == NULL) > + goto close_fail; > + memcpy(mm, current->mm, sizeof(*mm)); > + if (!mm_init(mm)) { > + kmem_cache_free(mm_cachep, mm); > + goto close_fail; > + } > + down(¤t->mm->mmap_sem); Umm not quite what I meant. Take a look at copy_mm in fork.c The code starting m = allocate_mm() down to just before /* * child gets a Does the real thing you need to do to be sure the mm is properly set up. You can also drop the original mm and make that current->mm to avoid the passing of mm around. In fact you probably should do that. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Core dumps for threads 2001-02-27 17:29 ` [PATCH] " Don Dugger 2001-02-27 17:55 ` Alan Cox @ 2001-02-27 17:57 ` Alan Cox 1 sibling, 0 replies; 11+ messages in thread From: Alan Cox @ 2001-02-27 17:57 UTC (permalink / raw) To: Don Dugger; +Cc: Linus Torvalds, linux-kernel, Alan Cox Umm I misread that patch somewhat. I take that comment back. It does indeed do what I intended. Would the mm folks care to verify the patch.. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2001-02-27 17:56 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2001-02-24 21:45 Core dumps for threads Don Dugger 2001-02-24 21:57 ` Alan Cox 2001-02-25 9:15 ` Chris Wedgwood 2001-02-25 9:40 ` Adam Fritzler 2001-02-26 0:36 ` davids 2001-02-25 10:28 ` Andi Kleen 2001-02-25 11:03 ` Chris Wedgwood 2001-02-25 13:57 ` Alan Cox 2001-02-27 17:29 ` [PATCH] " Don Dugger 2001-02-27 17:55 ` Alan Cox 2001-02-27 17:57 ` Alan Cox
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).