From: Mike Rapoport <rppt@kernel.org>
To: Michal Hocko <mhocko@suse.com>
Cc: Bui Quang Minh <minhquangbui99@gmail.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: [PATCH] userfaultfd: Write protect when virtual memory range has no page table entry
Date: Mon, 22 Mar 2021 15:00:37 +0200
Message-ID: <YFiU9YWbYpLnlnde@kernel.org>
In-Reply-To: <YFhuDf6L7nkUoT7q@dhcp22.suse.cz>

On Mon, Mar 22, 2021 at 11:14:37AM +0100, Michal Hocko wrote:
> Let's Cc Andrea and Mike
> 
> On Fri 19-03-21 22:24:28, Bui Quang Minh wrote:
> > userfaultfd_writeprotect() uses change_protection() to clear the write bit
> > in page table entries (pte/pmd), so a later write to this virtual address
> > range causes a page fault, which is then handled by the userspace program.
> > However, change_protection() has no effect when there are no page table
> > entries associated with that virtual memory range (a newly mapped memory
> > range). As a result, a later access to that memory range allocates a
> > page table entry with the write bit still set (due to the VM_WRITE flag in
> > vma->vm_flags).
> > 
> > Add checks for VM_UFFD_WP in vma->vm_flags when allocating a new page table
> > entry in the missing page table entry page fault path.
> 
> From the above it is not really clear whether this is a usability
> problem or a bug of the interface.

I'd say it's a usability/documentation clarity issue.
Userspace can register an area with

	UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP

and then it will be notified either when the page table has no entry for a
virtual address or when there is a write to a write-protected address.
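
For illustration only, the registration described above could look roughly
like the sketch below. The helper names (uffd_missing_wp_request,
uffd_register_missing_wp, uffd_fault_is_wp) are made up for this example and
are not part of any kernel or library API. It assumes `uffd` is a userfaultfd
descriptor that has already completed the UFFDIO_API handshake, and that the
installed uapi headers are new enough to define UFFDIO_REGISTER_MODE_WP.

```c
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: build a request that asks for both missing and
 * write-protect notifications on [addr, addr + len). */
static inline struct uffdio_register uffd_missing_wp_request(void *addr,
							      size_t len)
{
	struct uffdio_register reg;

	memset(&reg, 0, sizeof(reg));
	reg.range.start = (uintptr_t)addr;
	reg.range.len = len;
	reg.mode = UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP;
	return reg;
}

/* Hypothetical helper: register the range on an already initialized
 * userfaultfd. Returns 0 on success, -1 with errno set on failure. */
static inline int uffd_register_missing_wp(int uffd, void *addr, size_t len)
{
	struct uffdio_register reg = uffd_missing_wp_request(addr, len);

	return ioctl(uffd, UFFDIO_REGISTER, &reg);
}

/* Hypothetical helper: tell the two kinds of notification apart.
 * A write to a write-protected address carries UFFD_PAGEFAULT_FLAG_WP;
 * a fault on an address with no page table entry does not. */
static inline int uffd_fault_is_wp(const struct uffd_msg *msg)
{
	return msg->event == UFFD_EVENT_PAGEFAULT &&
	       (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) != 0;
}
```

With something like this, the monitor would read struct uffd_msg from the
descriptor and use uffd_fault_is_wp() to decide whether to resolve a missing
page or to lift write protection. Error handling and the fault-handling loop
are left out on purpose.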
 
> > Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
> > ---
> >  mm/huge_memory.c | 12 ++++++++++++
> >  mm/memory.c      | 10 ++++++++++
> >  2 files changed, 22 insertions(+)
> > 
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index ae907a9c2050..9bb16a55a48c 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -636,6 +636,11 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
> >  
> >  		entry = mk_huge_pmd(page, vma->vm_page_prot);
> >  		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
> > +		if (userfaultfd_wp(vma)) {
> > +			entry = pmd_wrprotect(entry);
> > +			entry = pmd_mkuffd_wp(entry);
> > +		}
> > +
> >  		page_add_new_anon_rmap(page, vma, haddr, true);
> >  		lru_cache_add_inactive_or_unevictable(page, vma);
> >  		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
> > @@ -643,6 +648,13 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
> >  		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
> >  		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
> >  		mm_inc_nr_ptes(vma->vm_mm);
> > +
> > +		if (userfaultfd_huge_pmd_wp(vma, *vmf->pmd)) {
> > +			spin_unlock(vmf->ptl);
> > +			count_vm_event(THP_FAULT_ALLOC);
> > +			count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
> > +			return handle_userfault(vmf, VM_UFFD_WP);
> > +		}
> >  		spin_unlock(vmf->ptl);
> >  		count_vm_event(THP_FAULT_ALLOC);
> >  		count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 5efa07fb6cdc..b835746545bf 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3564,6 +3564,11 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >  	if (vma->vm_flags & VM_WRITE)
> >  		entry = pte_mkwrite(pte_mkdirty(entry));
> >  
> > +	if (userfaultfd_wp(vma)) {
> > +		entry = pte_wrprotect(entry);
> > +		entry = pte_mkuffd_wp(entry);
> > +	}
> > +
> >  	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
> >  			&vmf->ptl);
> >  	if (!pte_none(*vmf->pte)) {
> > @@ -3590,6 +3595,11 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >  
> >  	/* No need to invalidate - it was non-present before */
> >  	update_mmu_cache(vma, vmf->address, vmf->pte);
> > +
> > +	if (userfaultfd_pte_wp(vma, *vmf->pte)) {
> > +		pte_unmap_unlock(vmf->pte, vmf->ptl);
> > +		return handle_userfault(vmf, VM_UFFD_WP);
> > +	}
> >  unlock:
> >  	pte_unmap_unlock(vmf->pte, vmf->ptl);
> >  	return ret;
> > -- 
> > 2.25.1
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Sincerely yours,
Mike.

Thread overview: 11+ messages
2021-03-19 15:24 [PATCH] userfaultfd: Write protect when virtual memory range has no page table entry Bui Quang Minh
2021-03-20 16:32 ` Andrew Morton
2021-03-22 10:14 ` Michal Hocko
2021-03-22 10:23   ` Mike Rapoport
2021-03-22 13:00   ` Mike Rapoport [this message]
2021-03-22 13:27     ` Peter Xu
2021-03-22 13:49     ` Michal Hocko
2021-03-31 14:49       ` Michal Hocko
2021-04-01  0:24         ` Andrew Morton
2021-03-23  2:48     ` Bui Quang Minh
2021-03-23 15:12       ` Peter Xu
