On 7/14/17 5:55 PM, Jerome Glisse wrote:
> ...
>
> Cheers,
> Jérôme

Hi Jerome,

I think I just found a couple of new issues, now related to fork/execve.

1) With a fork() followed by execve(), the child process makes a copy of
the parent mm_struct object, including the "hmm" pointer. Later on, an
execve() syscall in the child process frees the old mm_struct and
destroys the "hmm" object - which apparently it shouldn't do, because
the "hmm" object is shared between the parent and child processes:

(gdb) bt
#0  hmm_mm_destroy (mm=0xffff88080757aa40) at mm/hmm.c:134
#1  0xffffffff81058567 in __mmdrop (mm=0xffff88080757aa40) at kernel/fork.c:889
#2  0xffffffff8105904f in mmdrop (mm=) at ./include/linux/sched/mm.h:42
#3  __mmput (mm=) at kernel/fork.c:916
#4  mmput (mm=0xffff88080757aa40) at kernel/fork.c:927
#5  0xffffffff811c5a68 in exec_mmap (mm=) at fs/exec.c:1057
#6  flush_old_exec (bprm=) at fs/exec.c:1284
#7  0xffffffff81214460 in load_elf_binary (bprm=0xffff8808133b1978) at fs/binfmt_elf.c:855
#8  0xffffffff811c4fce in search_binary_handler (bprm=0xffff88081b40cb78) at fs/exec.c:1625
#9  0xffffffff811c6bbf in exec_binprm (bprm=) at fs/exec.c:1667
#10 do_execveat_common (fd=, filename=0xffff88080a101200, flags=0x0, argv=..., envp=...) at fs/exec.c:1789
#11 0xffffffff811c6fda in do_execve (__envp=, __argv=, filename=) at fs/exec.c:1833
#12 SYSC_execve (envp=, argv=, filename=) at fs/exec.c:1914
#13 SyS_execve (filename=, argv=0x7f4e5c2aced0, envp=0x7f4e5c2aceb0) at fs/exec.c:1909
#14 0xffffffff810018dd in do_syscall_64 (regs=0xffff88081b40cb78) at arch/x86/entry/common.c:284
#15 0xffffffff819e2c06 in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:245

This leads to a sporadic memory corruption in the parent process:

Thread 200 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 3685]
0xffffffff811a3efe in __mmu_notifier_invalidate_range_start (mm=0xffff880807579000, start=0x7f4e5c62f000, end=0x7f4e5c66f000) at mm/mmu_notifier.c:199
199             if (mn->ops->invalidate_range_start)
(gdb) bt
#0  0xffffffff811a3efe in __mmu_notifier_invalidate_range_start (mm=0xffff880807579000, start=0x7f4e5c62f000, end=0x7f4e5c66f000) at mm/mmu_notifier.c:199
#1  0xffffffff811ae471 in mmu_notifier_invalidate_range_start (end=, start=, mm=) at ./include/linux/mmu_notifier.h:282
#2  migrate_vma_collect (migrate=0xffffc90003ca3940) at mm/migrate.c:2280
#3  0xffffffff811b04a7 in migrate_vma (ops=, vma=0x7f4e5c62f000, start=0x7f4e5c62f000, end=0x7f4e5c66f000, src=0xffffc90003ca39d0, dst=0xffffc90003ca39d0, private=0xffffc90003ca39c0) at mm/migrate.c:2819
(gdb) p mn->ops
$2 = (const struct mmu_notifier_ops *) 0x6b6b6b6b6b6b6b6b

Note the 0x6b pattern: the notifier's ops pointer is reading slab poison
(POISON_FREE), i.e. the parent is dereferencing an already-freed object.

Please see attached a reproducer (sanity_rmem004_fork.tgz). Use
"./build.sh; sudo ./kload.sh; ./run.sh" to recreate the issue on your end.

2) A slight modification of the affected application does not use fork().
Instead, an execve() call from a parallel thread replaces the original
process. This is a particularly interesting case, because at that point
the process is busy migrating pages to/from the device.
Here's what happens:

0xffffffff811b9879 in commit_charge (page=, lrucare=, memcg=) at mm/memcontrol.c:2060
2060            VM_BUG_ON_PAGE(page->mem_cgroup, page);
(gdb) bt
#0  0xffffffff811b9879 in commit_charge (page=, lrucare=, memcg=) at mm/memcontrol.c:2060
#1  0xffffffff811b93d6 in commit_charge (lrucare=, memcg=, page=) at ./include/linux/page-flags.h:149
#2  mem_cgroup_commit_charge (page=0xffff88081b68cb70, memcg=0xffff88081b051548, lrucare=, compound=) at mm/memcontrol.c:5468
#3  0xffffffff811b10d4 in migrate_vma_insert_page (migrate=, dst=, src=, page=, addr=) at mm/migrate.c:2605
#4  migrate_vma_pages (migrate=) at mm/migrate.c:2647
#5  migrate_vma (ops=, vma=, start=, end=, src=, dst=, private=0xffffc900037439c0) at mm/migrate.c:2844

Please find another reproducer attached (sanity_rmem004_execve.tgz) for
this issue.

Thanks!

--
Evgeny Baskakov
NVIDIA