From: Jerome Glisse <jglisse@redhat.com>
To: Evgeny Baskakov <ebaskakov@nvidia.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	John Hubbard <jhubbard@nvidia.com>,
	David Nellans <dnellans@nvidia.com>,
	Mark Hairgrove <mhairgrove@nvidia.com>,
	Sherry Cheung <SCheung@nvidia.com>,
	Subhash Gutti <sgutti@nvidia.com>
Subject: Re: [HMM 12/15] mm/migrate: new memory migration helper for use with device memory v4
Date: Thu, 20 Jul 2017 21:33:04 -0400
Message-ID: <20170721013303.GA25991@redhat.com>
In-Reply-To: <cfba9bfb-5178-bcae-0fa9-ef66e2a871d5@nvidia.com>

On Thu, Jul 20, 2017 at 06:00:08PM -0700, Evgeny Baskakov wrote:
> On 7/14/17 5:55 PM, Jerome Glisse wrote:
> Hi Jerome,
> 
> I think I just found a couple of new issues, now related to fork/execve.
> 
> 1) With a fork() followed by execve(), the child process makes a copy of the
> parent mm_struct object, including the "hmm" pointer. Later on, an execve()
> syscall in the child process frees the old mm_struct, and destroys the "hmm"
> object - which apparently it shouldn't do, because the "hmm" object is
> shared between the parent and child processes:
> 
> (gdb) bt
> #0  hmm_mm_destroy (mm=0xffff88080757aa40) at mm/hmm.c:134
> #1  0xffffffff81058567 in __mmdrop (mm=0xffff88080757aa40) at
> kernel/fork.c:889
> #2  0xffffffff8105904f in mmdrop (mm=<optimized out>) at
> ./include/linux/sched/mm.h:42
> #3  __mmput (mm=<optimized out>) at kernel/fork.c:916
> #4  mmput (mm=0xffff88080757aa40) at kernel/fork.c:927
> #5  0xffffffff811c5a68 in exec_mmap (mm=<optimized out>) at fs/exec.c:1057
> #6  flush_old_exec (bprm=<optimized out>) at fs/exec.c:1284
> #7  0xffffffff81214460 in load_elf_binary (bprm=0xffff8808133b1978) at
> fs/binfmt_elf.c:855
> #8  0xffffffff811c4fce in search_binary_handler (bprm=0xffff88081b40cb78) at
> fs/exec.c:1625
> #9  0xffffffff811c6bbf in exec_binprm (bprm=<optimized out>) at
> fs/exec.c:1667
> #10 do_execveat_common (fd=<optimized out>, filename=0xffff88080a101200,
> flags=0x0, argv=..., envp=...) at fs/exec.c:1789
> #11 0xffffffff811c6fda in do_execve (__envp=<optimized out>,
> __argv=<optimized out>, filename=<optimized out>) at fs/exec.c:1833
> #12 SYSC_execve (envp=<optimized out>, argv=<optimized out>,
> filename=<optimized out>) at fs/exec.c:1914
> #13 SyS_execve (filename=<optimized out>, argv=0x7f4e5c2aced0,
> envp=0x7f4e5c2aceb0) at fs/exec.c:1909
> #14 0xffffffff810018dd in do_syscall_64 (regs=0xffff88081b40cb78) at
> arch/x86/entry/common.c:284
> #15 0xffffffff819e2c06 in entry_SYSCALL_64 () at
> arch/x86/entry/entry_64.S:245
> 
> This leads to a sporadic memory corruption in the parent process:
> 
> Thread 200 received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 3685]
> 0xffffffff811a3efe in __mmu_notifier_invalidate_range_start
> (mm=0xffff880807579000, start=0x7f4e5c62f000, end=0x7f4e5c66f000) at
> mm/mmu_notifier.c:199
> 199            if (mn->ops->invalidate_range_start)
> (gdb) bt
> #0  0xffffffff811a3efe in __mmu_notifier_invalidate_range_start
> (mm=0xffff880807579000, start=0x7f4e5c62f000, end=0x7f4e5c66f000) at
> mm/mmu_notifier.c:199
> #1  0xffffffff811ae471 in mmu_notifier_invalidate_range_start
> (end=<optimized out>, start=<optimized out>, mm=<optimized out>) at
> ./include/linux/mmu_notifier.h:282
> #2  migrate_vma_collect (migrate=0xffffc90003ca3940) at mm/migrate.c:2280
> #3  0xffffffff811b04a7 in migrate_vma (ops=<optimized out>,
> vma=0x7f4e5c62f000, start=0x7f4e5c62f000, end=0x7f4e5c66f000,
> src=0xffffc90003ca39d0, dst=0xffffc90003ca39d0, private=0xffffc90003ca39c0)
> at mm/migrate.c:2819
> (gdb) p mn->ops
> $2 = (const struct mmu_notifier_ops *) 0x6b6b6b6b6b6b6b6b
> 
> Please see attached a reproducer (sanity_rmem004_fork.tgz). Use "./build.sh;
> sudo ./kload.sh; ./run.sh" to recreate the issue on your end.
> 
> 
> 2) A slight modification of the affected application does not use fork().
> Instead, an execve() call from a parallel thread replaces the original
> process. This is a particularly interesting case, because at that point the
> process is busy migrating pages to/from device.
> 
> Here's what happens:
> 
> 0xffffffff811b9879 in commit_charge (page=<optimized out>,
> lrucare=<optimized out>, memcg=<optimized out>) at mm/memcontrol.c:2060
> 2060        VM_BUG_ON_PAGE(page->mem_cgroup, page);
> (gdb) bt
> #0  0xffffffff811b9879 in commit_charge (page=<optimized out>,
> lrucare=<optimized out>, memcg=<optimized out>) at mm/memcontrol.c:2060
> #1  0xffffffff811b93d6 in commit_charge (lrucare=<optimized out>,
> memcg=<optimized out>, page=<optimized out>) at
> ./include/linux/page-flags.h:149
> #2  mem_cgroup_commit_charge (page=0xffff88081b68cb70,
> memcg=0xffff88081b051548, lrucare=<optimized out>, compound=<optimized out>)
> at mm/memcontrol.c:5468
> #3  0xffffffff811b10d4 in migrate_vma_insert_page (migrate=<optimized out>,
> dst=<optimized out>, src=<optimized out>, page=<optimized out>,
> addr=<optimized out>) at mm/migrate.c:2605
> #4  migrate_vma_pages (migrate=<optimized out>) at mm/migrate.c:2647
> #5  migrate_vma (ops=<optimized out>, vma=<optimized out>, start=<optimized
> out>, end=<optimized out>, src=<optimized out>, dst=<optimized out>,
> private=0xffffc900037439c0) at mm/migrate.c:2844
> 
> 
> Please find another reproducer attached (sanity_rmem004_execve.tgz) for this
> issue.
> 

I have pushed an updated hmm-next branch. It should contain all the fixes so
far, including one that should fix this issue. I still want to go over all the
emails again to make sure I am not forgetting anything.

Cheers,
Jerome
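
For readers following the thread, issue 1 in the quoted report can be modelled
in a few lines of userspace C. This is only a sketch under stated assumptions,
not the code in the hmm-next branch: struct mm_model, struct hmm_model and
every function below are hypothetical stand-ins, and resetting the pointer in
the copy is just one plausible way to avoid the problem. The 0x6b fill mimics
the byte pattern that kernel slab poisoning writes into freed objects, which is
what the 0x6b6b6b6b6b6b6b6b value of mn->ops in the trace suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct hmm_model {
    int alive;              /* 1 while the object is valid */
    unsigned char body[8];  /* payload, poisoned on destroy */
};

struct mm_model {
    struct hmm_model *hmm;  /* per-address-space state, like mm->hmm */
};

/* A static pool stands in for the allocator so that "freed" objects can
 * still be inspected safely in this model. */
#define POOL_SIZE 4
static struct hmm_model pool[POOL_SIZE];
static size_t pool_used;

static struct hmm_model *hmm_model_alloc(void)
{
    assert(pool_used < POOL_SIZE);
    struct hmm_model *h = &pool[pool_used++];
    h->alive = 1;
    memset(h->body, 0, sizeof(h->body));
    return h;
}

/* Models tearing down an mm: destroys whatever mm->hmm points at. */
static void mm_model_destroy(struct mm_model *mm)
{
    if (mm->hmm) {
        memset(mm->hmm->body, 0x6b, sizeof(mm->hmm->body));
        mm->hmm->alive = 0;
        mm->hmm = NULL;
    }
}

/* Models the reported behaviour: the child gets a byte-for-byte copy,
 * so both mm objects point at the same hmm object. */
static struct mm_model dup_mm_shallow(const struct mm_model *parent)
{
    struct mm_model child;
    memcpy(&child, parent, sizeof(child));
    return child;
}

/* Assumed mitigation: the copy starts with no hmm object of its own. */
static struct mm_model dup_mm_reset(const struct mm_model *parent)
{
    struct mm_model child = dup_mm_shallow(parent);
    child.hmm = NULL;
    return child;
}

/* Returns 1 if the parent's hmm object is still valid after the child's
 * mm has been destroyed (as execve() does to the old mm), 0 otherwise. */
static int parent_survives(int use_reset)
{
    struct mm_model parent = { .hmm = hmm_model_alloc() };
    struct mm_model child = use_reset ? dup_mm_reset(&parent)
                                      : dup_mm_shallow(&parent);
    mm_model_destroy(&child);
    return parent.hmm->alive;
}
```

Calling parent_survives(0) exercises the shallow copy, where destroying the
child's mm also invalidates the object the parent still points at;
parent_survives(1) exercises the variant that resets the pointer in the copy.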
