From: Mike Rapoport <rppt@linux.vnet.ibm.com>
To: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Andrei Vagin <avagin@virtuozzo.com>
Subject: Re: [PATCH] userfaultfd: prevent non-cooperative events vs mcopy_atomic races
Date: Thu, 24 May 2018 22:06:40 +0300
Message-ID: <20180524190639.GD16908@rapoport-lnx>
In-Reply-To: <e42a383f-491a-b42a-347c-effe4b86b982@virtuozzo.com>

On Thu, May 24, 2018 at 07:40:07PM +0300, Pavel Emelyanov wrote:
> On 05/24/2018 02:56 PM, Mike Rapoport wrote:
> > On Thu, May 24, 2018 at 02:24:37PM +0300, Pavel Emelyanov wrote:
> >> On 05/23/2018 10:42 AM, Mike Rapoport wrote:
> >>> If a process monitored with userfaultfd changes its memory mappings or
> >>> forks() at the same time as the uffd monitor fills the process memory
> >>> with UFFDIO_COPY, the actual creation of page table entries and the
> >>> copying of the data in mcopy_atomic may happen either before or after
> >>> the memory mapping modifications, and there is no way for the uffd
> >>> monitor to maintain a consistent view of the process memory layout.
> >>>
> >>> For instance, let's consider fork() running in parallel with
> >>> userfaultfd_copy():
> >>>
> >>> process        		         |	uffd monitor
> >>> ---------------------------------+------------------------------
> >>> fork()        		         | userfaultfd_copy()
> >>> ...        		         | ...
> >>>     dup_mmap()        	         |     down_read(mmap_sem)
> >>>     down_write(mmap_sem)         |     /* create PTEs, copy data */
> >>>         dup_uffd()               |     up_read(mmap_sem)
> >>>         copy_page_range()        |
> >>>         up_write(mmap_sem)       |
> >>>         dup_uffd_complete()      |
> >>>             /* notify monitor */ |
> >>>
> >>> If the userfaultfd_copy() takes the mmap_sem first, the new page(s) will be
> >>> present by the time copy_page_range() is called and they will appear in the
> >>> child's memory mappings. However, if the fork() is the first to take the
> >>> mmap_sem, the new pages won't be mapped in the child's address space.
> >>
> >> But in this case the child should get an entry that emits a message to
> >> uffd when stepped upon! And uffd will just userfaultfd_copy() it again. No?
> >  
> > There will be a message, indeed. But there is no way for the monitor to
> > tell whether the pages it copied are present in the child or not.
> 
> If there's a message, then they are not present, that's for sure :)

If the pages are not present and the child tries to access them, the
monitor will get a page fault notification and everything is fine.
However, if the pages *are present*, the child can access them without uffd
noticing, and if we then copy them into the child, it will see the wrong
data. Since we are talking about background copy, we need to decide whether
the pages should be copied or not independently of #PF notifications.
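
For reference, the background copy itself is just a UFFDIO_COPY ioctl. A
minimal sketch of a hypothetical helper (assuming uffd, the destination
address and a source buffer are set up elsewhere):

	#include <errno.h>
	#include <sys/ioctl.h>
	#include <linux/userfaultfd.h>

	/*
	 * Copy one page into the monitored address space "in the
	 * background". Returns 0 if the page was copied, 1 if a PTE was
	 * already present at dst, -1 on any other error.
	 */
	static int bg_copy_page(int uffd, unsigned long dst,
				unsigned long src, unsigned long page_size)
	{
		struct uffdio_copy copy = {
			.dst = dst,
			.src = src,
			.len = page_size,
			.mode = 0,
		};

		if (ioctl(uffd, UFFDIO_COPY, &copy) == 0)
			return 0;
		/*
		 * EEXIST means a PTE is already present at dst in *this*
		 * address space; it tells us nothing about a child
		 * created by a racing fork(), which is exactly the
		 * problem discussed here.
		 */
		if (errno == EEXIST)
			return 1;
		return -1;
	}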
 
> > Since the monitor cannot assume that the process will access all of its
> > memory, it has to copy some pages "in the background". A simple monitor
> > may look like:
> > 
> > 	for (;;) {
> > 		wait_for_uffd_events(timeout);
> > 		handle_uffd_events();
> > 		uffd_copy(some not faulted pages);
> > 	}
> > 
> > Then, if the "background" uffd_copy() races with fork, the pages we've
> > copied may or may not already be present in the parent's mappings by the
> > time copy_page_range() is called.
> > 
> > If the pages were not present, uffd_copy'ing them again into the child's
> > memory would be ok.
> 
> Yes.
> 
> > But if uffd_copy() was the first to take the mmap_sem, and we then
> > uffd_copy the pages again, the child process will get memory corruption.
> 
> You mean the background uffd_copy()?

Yes.

> But doesn't it race even with regular PF handling, not only the fork? How
> do we handle this race?

With the regular #PF handling, the faulting thread patiently waits until
the page fault is resolved. With fork(), mremap() etc. the thread that
caused the event resumes once the uffd message is read by the monitor,
which is surely way before the monitor has had a chance to actually
process that message.
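
To illustrate the difference, here is a rough sketch of the monitor's read
side (a hypothetical helper; assumes one struct uffd_msg per read()):

	#include <poll.h>
	#include <unistd.h>
	#include <linux/userfaultfd.h>

	/*
	 * Read one event from the uffd. For UFFD_EVENT_PAGEFAULT the
	 * faulting thread stays blocked until the monitor resolves the
	 * fault (e.g. with UFFDIO_COPY). For UFFD_EVENT_FORK and the
	 * other non-cooperative events the triggering thread resumes as
	 * soon as this read() returns, i.e. before the monitor has done
	 * anything with the message.
	 */
	static int read_uffd_event(int uffd, struct uffd_msg *msg)
	{
		struct pollfd pfd = { .fd = uffd, .events = POLLIN };

		if (poll(&pfd, 1, -1) < 0)
			return -1;
		if (read(uffd, msg, sizeof(*msg)) != sizeof(*msg))
			return -1;

		switch (msg->event) {
		case UFFD_EVENT_PAGEFAULT:
			/* the faulting thread is still waiting here */
			break;
		case UFFD_EVENT_FORK:
			/* the child is already running by now;
			   msg->arg.fork.ufd is its userfaultfd */
			break;
		}
		return 0;
	}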

> -- Pavel
> 

-- 
Sincerely yours,
Mike.
