linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Pavel Emelyanov <xemul@virtuozzo.com>
To: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Andrei Vagin <avagin@virtuozzo.com>
Subject: Re: [PATCH] userfaultfd: prevent non-cooperative events vs mcopy_atomic races
Date: Thu, 24 May 2018 19:40:07 +0300	[thread overview]
Message-ID: <e42a383f-491a-b42a-347c-effe4b86b982@virtuozzo.com> (raw)
In-Reply-To: <20180524115613.GA16908@rapoport-lnx>

On 05/24/2018 02:56 PM, Mike Rapoport wrote:
> On Thu, May 24, 2018 at 02:24:37PM +0300, Pavel Emelyanov wrote:
>> On 05/23/2018 10:42 AM, Mike Rapoport wrote:
>>> If a process monitored with userfaultfd changes it's memory mappings or
>>> forks() at the same time as uffd monitor fills the process memory with
>>> UFFDIO_COPY, the actual creation of page table entries and copying of the
>>> data in mcopy_atomic may happen either before of after the memory mapping
>>> modifications and there is no way for the uffd monitor to maintain
>>> consistent view of the process memory layout.
>>>
>>> For instance, let's consider fork() running in parallel with
>>> userfaultfd_copy():
>>>
>>> process        		         |	uffd monitor
>>> ---------------------------------+------------------------------
>>> fork()        		         | userfaultfd_copy()
>>> ...        		         | ...
>>>     dup_mmap()        	         |     down_read(mmap_sem)
>>>     down_write(mmap_sem)         |     /* create PTEs, copy data */
>>>         dup_uffd()               |     up_read(mmap_sem)
>>>         copy_page_range()        |
>>>         up_write(mmap_sem)       |
>>>         dup_uffd_complete()      |
>>>             /* notify monitor */ |
>>>
>>> If the userfaultfd_copy() takes the mmap_sem first, the new page(s) will be
>>> present by the time copy_page_range() is called and they will appear in the
>>> child's memory mappings. However, if the fork() is the first to take the
>>> mmap_sem, the new pages won't be mapped in the child's address space.
>>
>> But in this case child should get an entry, that emits a message to uffd when step upon!
>> And uffd will just userfaultfd_copy() it again. No?
>  
> There will be a message, indeed. But there is no way for monitor to tell
> whether the pages it copied are present or not in the child.

If there's a message, then they are not present, that's for sure :)

> Since the monitor cannot assume that the process will access all its memory
> it has to copy some pages "in the background". A simple monitor may look
> like:
> 
> 	for (;;) {
> 		wait_for_uffd_events(timeout);
> 		handle_uffd_events();
> 		uffd_copy(some not faulted pages);
> 	}
> 
> Then, if the "background" uffd_copy() races with fork, the pages we've
> copied may be already present in parent's mappings before the call to
> copy_page_range() and may be not.
> 
> If the pages were not present, uffd_copy'ing them again to the child's
> memory would be ok.

Yes.

> But if uffd_copy() was first to catch mmap_sem, and we would uffd_copy them
> again, child process will get memory corruption.

You mean the background uffd_copy()? But doesn't it race even with regular PF handling,
not only the fork? How do we handle this race?

-- Pavel

  reply	other threads:[~2018-05-24 16:40 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-23  7:42 [PATCH] userfaultfd: prevent non-cooperative events vs mcopy_atomic races Mike Rapoport
2018-05-24 11:24 ` Pavel Emelyanov
2018-05-24 11:56   ` Mike Rapoport
2018-05-24 16:40     ` Pavel Emelyanov [this message]
2018-05-24 19:06       ` Mike Rapoport
2018-05-25 14:05         ` Pavel Emelyanov
2020-12-03 19:57 ` Nadav Amit
2020-12-06  9:37   ` Mike Rapoport
2020-12-07  4:31     ` Nadav Amit
2020-12-08  8:34       ` Mike Rapoport
2020-12-08  8:57         ` Nadav Amit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e42a383f-491a-b42a-347c-effe4b86b982@virtuozzo.com \
    --to=xemul@virtuozzo.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=avagin@virtuozzo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=rppt@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).