From: Mike Kravetz <mike.kravetz@oracle.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org, linux-api@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Aaron Lu <aaron.lu@intel.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [RFC PATCH 1/1] mm/mremap: add MREMAP_MIRROR flag for existing mirroring functionality
Date: Thu, 13 Jul 2017 11:11:37 -0700
Message-ID: <28a8da13-bdc2-3f23-dee9-607377ac1cc3@oracle.com>
In-Reply-To: <20170713163054.GK22628@redhat.com>

On 07/13/2017 09:30 AM, Andrea Arcangeli wrote:
> On Thu, Jul 13, 2017 at 09:01:54AM -0700, Mike Kravetz wrote:
>> Sent a patch (in separate e-mail thread) to return EINVAL for private
>> mappings.
> 
> The way old_len == 0 behaves for MAP_PRIVATE seems more sane to me
> than the alternative of copying pagetables for anon pages (behaving
> that way avoids breaking anon page invariants), even though it does
> not create an exact mirror of what was in the original vma, as it
> excludes any modification done to cowed anon pages.
> 
> By nullifying move_page_tables, old_len == 0 simply dups the vma,
> which is equivalent to a new mmap on the file for the MAP_PRIVATE
> case; it has a deterministic result. The real question is whether
> anybody is using it.

As previously discussed, copying pagetables (via move_page_tables) does
not happen if old_len == 0.  This is true for both private and shared
mappings.

Here is my understanding of how things work for old_len == 0 of anon
mappings:
- shared mappings
	- New vma is created at new virtual address
	- vma refers to the same underlying object/pages as old vma
	- after mremap, no page tables exist for new vma, they are
	  created as pages are accessed/faulted
	- page at new_address is same as page at old_address
- private mappings
	- New vma is created at new virtual address
	- vma does not refer to same pages as old vma.  It is a 'new'
	  private anon mapping.
	- after mremap, no page tables exist for new vma.  access to
	  the range of the new vma will result in faults that allocate
	  a new page.
	- page at new_address is different than page at old_address
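
The shared/private difference listed above can be observed from userspace
with a small test.  This is only a sketch: the helper try_duplicate() and
the dup_result enum are names made up for illustration, and the result for
MAP_PRIVATE depends on the kernel (a kernel that rejects the call, as
proposed in this thread, reports DUP_REJECTED; otherwise the new mapping is
expected to be independent of the old one).

```c
#define _GNU_SOURCE		/* for mremap() and MREMAP_MAYMOVE */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical result codes, not part of any kernel or libc API. */
enum dup_result { DUP_REJECTED = -1, DUP_INDEPENDENT = 0, DUP_MIRRORED = 1 };

/*
 * map_flags is MAP_SHARED or MAP_PRIVATE; the mapping is always anonymous.
 * Any setup or mremap failure is reported as DUP_REJECTED.
 */
static enum dup_result try_duplicate(int map_flags)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	char *old_addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			      map_flags | MAP_ANONYMOUS, -1, 0);
	if (old_addr == MAP_FAILED)
		return DUP_REJECTED;

	old_addr[0] = 'A';	/* fault in a page behind the old mapping */

	/* old_len == 0: request a second mapping rather than a move */
	char *new_addr = mremap(old_addr, 0, len, MREMAP_MAYMOVE);
	if (new_addr == MAP_FAILED) {
		munmap(old_addr, len);
		return DUP_REJECTED;
	}

	/* First access through new_addr faults, since no page tables exist yet */
	new_addr[0] = 'B';

	/* If both vmas refer to the same page, the write shows up at old_addr */
	enum dup_result res =
		(old_addr[0] == 'B') ? DUP_MIRRORED : DUP_INDEPENDENT;

	munmap(new_addr, len);
	munmap(old_addr, len);
	return res;
}
```

Calling try_duplicate(MAP_SHARED) should report DUP_MIRRORED, while
try_duplicate(MAP_PRIVATE) should report either DUP_INDEPENDENT or
DUP_REJECTED depending on the kernel.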

So, the result of mremap(old_len == 0) on a private mapping is that it
simply creates a new private mapping.  IMO, this is contrary to the purpose
of mremap.  mremap should return a mapping that is somehow related to
the original mapping.

Perhaps you are thinking about mremap of a private file mapping?  I was
not considering that case.  I believe you are right.  In this case a
private COW mapping based on the original mapping would be created.  So,
this seems more in line with the intent of mremap.  The new mapping is
still related to the old mapping.

With this in mind, what about returning EINVAL only for the anon private
mapping case?

However, if you have an fd (for a file mapping) then I cannot see why
someone would be using the old_len == 0 trick.  It would be more
straightforward to simply use mmap to create the additional mapping.
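
To illustrate the mmap alternative, here is a sketch that creates a second
private mapping of the same file directly from the fd.  The helper name
second_private_mapping() and the use of tmpfile() as backing store are
assumptions made for the example.  It shows the behavior described above
for private file mappings: the second mapping sees the file contents, not
the COWed modification made through the first mapping.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if the second private mapping shows the file contents while
 * the first keeps its COWed change, 0 if not, -1 on setup failure.
 */
static int second_private_mapping(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	FILE *fp = tmpfile();		/* anonymous temporary backing file */
	if (!fp)
		return -1;

	int fd = fileno(fp);
	/* Put a known byte in the file and size it to one page */
	if (write(fd, "F", 1) != 1 || ftruncate(fd, (off_t)len) != 0) {
		fclose(fp);
		return -1;
	}

	char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	int res = -1;

	if (a != MAP_FAILED && b != MAP_FAILED) {
		a[0] = 'C';	/* COW: the change stays private to mapping a */
		res = (a[0] == 'C' && b[0] == 'F');
	}

	if (a != MAP_FAILED)
		munmap(a, len);
	if (b != MAP_FAILED)
		munmap(b, len);
	fclose(fp);
	return res;
}
```

No mremap call is involved; the second mmap on the fd is enough to get a
mapping that is still related to the same file.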

> So an alternative would be to start by adding a WARN_ON_ONCE deprecation
> warning instead of -EINVAL right away.
> 
> The vma->vm_flags VM_ACCOUNT being wiped on the original vma as side
> effect of using the old_len == 0 trick looks like a bug, I guess it
> should get fixed if we intend to keep old_len and document it for the
> long term.

Others seem to think we should keep old_len == 0 and document it.

> Overall I'm more concerned about the fact an allocation failure in
> do_munmap is unreported to userland and it will leave the old vma
> intact like old_len == 0 would do (unless I'm misreading something
> there). The VM_ACCOUNT wipe as side effect of old_len == 0 is not
> a major short-term concern.

I assume you are concerned about the do_munmap call in move_vma?  That
does indeed look to be of concern.  This happens AFTER setting up the
new mapping.  So, I'm thinking we should tear down the new mapping in
the case where do_munmap of the old mapping fails?  That 'should' simply
be a matter of:
- moving page tables back to the original mapping
- removing/deleting the new vma
- I don't think we need to 'unmap' the new vma as there should be no
  associated pages.

I'll look into doing this as well.

Just curious, do those userfaultfd callouts still work as desired in the
case of map duplication (old_len == 0)?
-- 
Mike Kravetz


