linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Peter Xu <peterx@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH 21/23] hugetlb/userfaultfd: Only drop uffd-wp special pte if required
Date: Fri, 23 Apr 2021 13:33:08 -0700	[thread overview]
Message-ID: <02712955-0552-f82a-0ab8-460d63af3519@oracle.com> (raw)
In-Reply-To: <20210323005054.35973-1-peterx@redhat.com>

On 3/22/21 5:50 PM, Peter Xu wrote:
> Just like what we've done with shmem uffd-wp special ptes, we shouldn't drop
> uffd-wp special swap pte for hugetlb too, only if we're going to unmap the
> whole vma, or we're punching a hole with safe locks held.
> 
> For example, remove_inode_hugepages() is safe to drop uffd-wp ptes, because it
> has taken hugetlb fault mutex so that no concurrent page fault would trigger.
> While the call to hugetlb_vmdelete_list() in hugetlbfs_punch_hole() is not
> safe.  That's why the previous call will be with ZAP_FLAG_DROP_FILE_UFFD_WP,
> while the latter one won't be able to.

This commit message is a bit confusing, but the hugetlb hole punch code
path is a bit confusing. :)   How about something like this?

As with  shmem uffd-wp special ptes, only drop the uffd-wp special swap
pte if unmapping an entire vma or synchronized such that faults can not
race with the unmap operation.  This requires passing zap_flags all the
way to the lowest level hugetlb unmap routine: __unmap_hugepage_range.

In general, unmap calls originated in hugetlbfs code will pass the
ZAP_FLAG_DROP_FILE_UFFD_WP flag as synchronization is in place to prevent
faults.  The exception is hole punch which will first unmap without any
synchronization.  Later when hole punch actually removes the page from
the file, it will check to see if there was a subsequent fault and if
so take the hugetlb fault mutex while unmapping again.  This second
unmap will pass in ZAP_FLAG_DROP_FILE_UFFD_WP.

> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  fs/hugetlbfs/inode.c    | 15 +++++++++------
>  include/linux/hugetlb.h | 13 ++++++++-----
>  mm/hugetlb.c            | 27 +++++++++++++++++++++------
>  mm/memory.c             |  5 ++++-
>  4 files changed, 42 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index d81f52b87bd7..5fe19e801a2b 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -399,7 +399,8 @@ static void remove_huge_page(struct page *page)
>  }
>  
>  static void
> -hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
> +hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
> +		      unsigned long zap_flags)
>  {
>  	struct vm_area_struct *vma;
>  
> @@ -432,7 +433,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
>  		}
>  
>  		unmap_hugepage_range(vma, vma->vm_start + v_offset, v_end,
> -									NULL);
> +				     NULL, zap_flags);
>  	}
>  }
>  
> @@ -513,7 +514,8 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
>  				mutex_lock(&hugetlb_fault_mutex_table[hash]);
>  				hugetlb_vmdelete_list(&mapping->i_mmap,
>  					index * pages_per_huge_page(h),
> -					(index + 1) * pages_per_huge_page(h));
> +					(index + 1) * pages_per_huge_page(h),
> +					ZAP_FLAG_DROP_FILE_UFFD_WP);
>  				i_mmap_unlock_write(mapping);
>  			}
>  
> @@ -579,7 +581,8 @@ static void hugetlb_vmtruncate(struct inode *inode, loff_t offset)
>  	i_mmap_lock_write(mapping);
>  	i_size_write(inode, offset);
>  	if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
> -		hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0);
> +		hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0,
> +				      ZAP_FLAG_DROP_FILE_UFFD_WP);
>  	i_mmap_unlock_write(mapping);
>  	remove_inode_hugepages(inode, offset, LLONG_MAX);
>  }
> @@ -612,8 +615,8 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
>  		i_mmap_lock_write(mapping);
>  		if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
>  			hugetlb_vmdelete_list(&mapping->i_mmap,
> -						hole_start >> PAGE_SHIFT,
> -						hole_end  >> PAGE_SHIFT);
> +					      hole_start >> PAGE_SHIFT,
> +					      hole_end >> PAGE_SHIFT, 0);
>  		i_mmap_unlock_write(mapping);
>  		remove_inode_hugepages(inode, hole_start, hole_end);
>  		inode_unlock(inode);
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 92710600596e..4047fa042782 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -121,14 +121,15 @@ long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
>  			 unsigned long *, unsigned long *, long, unsigned int,
>  			 int *);
>  void unmap_hugepage_range(struct vm_area_struct *,
> -			  unsigned long, unsigned long, struct page *);
> +			  unsigned long, unsigned long, struct page *,
> +			  unsigned long);
>  void __unmap_hugepage_range_final(struct mmu_gather *tlb,
>  			  struct vm_area_struct *vma,
>  			  unsigned long start, unsigned long end,
> -			  struct page *ref_page);
> +			  struct page *ref_page, unsigned long zap_flags);
>  void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  				unsigned long start, unsigned long end,
> -				struct page *ref_page);
> +				struct page *ref_page, unsigned long zap_flags);

Nothing introduced with your patch, but it seems __unmap_hugepage_range_final
does not need to be in the header and can be static in hugetlb.c.

Everything else looks good,

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz


  reply	other threads:[~2021-04-23 20:33 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-23  0:48 [PATCH 00/23] userfaultfd-wp: Support shmem and hugetlbfs Peter Xu
2021-03-23  0:48 ` [PATCH 01/23] shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2021-03-23  0:48 ` [PATCH 02/23] mm: Clear vmf->pte after pte_unmap_same() returns Peter Xu
2021-03-23  2:34   ` Miaohe Lin
2021-03-23 15:40     ` Peter Xu
2021-03-23  0:48 ` [PATCH 03/23] mm/userfaultfd: Introduce special pte for unmapped file-backed mem Peter Xu
2021-03-23  0:48 ` [PATCH 04/23] mm/swap: Introduce the idea of special swap ptes Peter Xu
2021-03-23  0:48 ` [PATCH 05/23] shmem/userfaultfd: Handle uffd-wp special pte in page fault handler Peter Xu
2021-03-23  0:48 ` [PATCH 06/23] mm: Drop first_index/last_index in zap_details Peter Xu
2021-03-23  0:48 ` [PATCH 07/23] mm: Introduce zap_details.zap_flags Peter Xu
2021-03-23  2:11   ` Matthew Wilcox
2021-03-23 15:43     ` Peter Xu
2021-03-23  0:48 ` [PATCH 08/23] mm: Introduce ZAP_FLAG_SKIP_SWAP Peter Xu
2021-03-23  0:48 ` [PATCH 09/23] mm: Pass zap_flags into unmap_mapping_pages() Peter Xu
2021-03-23  0:48 ` [PATCH 10/23] shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed Peter Xu
2021-03-23  0:49 ` [PATCH 11/23] shmem/userfaultfd: Allow wr-protect none pte for file-backed mem Peter Xu
2021-03-23  0:49 ` [PATCH 12/23] shmem/userfaultfd: Allows file-back mem to be uffd wr-protected on thps Peter Xu
2021-03-23  0:49 ` [PATCH 13/23] shmem/userfaultfd: Handle the left-overed special swap ptes Peter Xu
2021-03-23  0:49 ` [PATCH 14/23] shmem/userfaultfd: Pass over uffd-wp special swap pte when fork() Peter Xu
2021-03-23  0:49 ` [PATCH 15/23] hugetlb/userfaultfd: Hook page faults for uffd write protection Peter Xu
2021-04-21 22:02   ` Mike Kravetz
2021-03-23  0:49 ` [PATCH 16/23] hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2021-04-21 23:06   ` Mike Kravetz
2021-04-22  1:14     ` Peter Xu
2021-03-23  0:49 ` [PATCH 17/23] hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT Peter Xu
2021-04-22 18:22   ` Mike Kravetz
2021-03-23  0:49 ` [PATCH 18/23] mm/hugetlb: Introduce huge version of special swap pte helpers Peter Xu
2021-04-22 19:00   ` Mike Kravetz
2021-03-23  0:50 ` [PATCH 19/23] hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler Peter Xu
2021-04-22 22:45   ` Mike Kravetz
2021-04-26  2:08     ` Peter Xu
2021-03-23  0:50 ` [PATCH 20/23] hugetlb/userfaultfd: Allow wr-protect none ptes Peter Xu
2021-04-23  0:08   ` Mike Kravetz
2021-03-23  0:50 ` [PATCH 21/23] hugetlb/userfaultfd: Only drop uffd-wp special pte if required Peter Xu
2021-04-23 20:33   ` Mike Kravetz [this message]
2021-04-26 21:16     ` Peter Xu
2021-04-26 21:36       ` Mike Kravetz
2021-04-26 22:05         ` Peter Xu
2021-04-26 23:09           ` Mike Kravetz
2021-03-23  0:50 ` [PATCH 22/23] mm/userfaultfd: Enable write protection for shmem & hugetlbfs Peter Xu
2021-03-23  0:50 ` [PATCH 23/23] userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs Peter Xu
2021-03-23  0:54 ` [PATCH 00/23] userfaultfd-wp: Support shmem and hugetlbfs Peter Xu
2021-04-21 16:03 ` Peter Xu
2021-04-21 21:39   ` Mike Kravetz
2021-04-22  1:16     ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=02712955-0552-f82a-0ab8-460d63af3519@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=hughd@google.com \
    --cc=jglisse@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nadav.amit@gmail.com \
    --cc=peterx@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).