From: Alistair Popple <apopple@nvidia.com>
To: Peter Xu <peterx@redhat.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Jason Gunthorpe <jgg@ziepe.ca>, Hugh Dickins <hughd@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>
Subject: Re: [PATCH v3 11/27] shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed
Date: Thu, 8 Jul 2021 12:49:23 +1000
Message-ID: <2500158.4Z4izgLEvx@nvdebian>
In-Reply-To: <YOR4NmRmk54ULkkp@t490s>

On Wednesday, 7 July 2021 1:35:18 AM AEST Peter Xu wrote:
> On Tue, Jul 06, 2021 at 03:40:42PM +1000, Alistair Popple wrote:
> > > > > > > > >  struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> > > > > > > > >  			     pte_t pte);
> > > > > > > > >  struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
> > > > > > > > > diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> > > > > > > > > index 355ea1ee32bd..c29a6ef3a642 100644
> > > > > > > > > --- a/include/linux/mm_inline.h
> > > > > > > > > +++ b/include/linux/mm_inline.h
> > > > > > > > > @@ -4,6 +4,8 @@
> > > > > > > > >  
> > > > > > > > >  #include <linux/huge_mm.h>
> > > > > > > > >  #include <linux/swap.h>
> > > > > > > > > +#include <linux/userfaultfd_k.h>
> > > > > > > > > +#include <linux/swapops.h>
> > > > > > > > >  
> > > > > > > > >  /**
> > > > > > > > >   * page_is_file_lru - should the page be on a file LRU or anon LRU?
> > > > > > > > > @@ -104,4 +106,45 @@ static __always_inline void del_page_from_lru_list(struct page *page,
> > > > > > > > >  	update_lru_size(lruvec, page_lru(page), page_zonenum(page),
> > > > > > > > >  			-thp_nr_pages(page));
> > > > > > > > >  }
> > > > > > > > > +
> > > > > > > > > +/*
> > > > > > > > > + * If this pte is wr-protected by uffd-wp in any form, arm the special pte to
> > > > > > > > > + * replace a none pte.  NOTE!  This should only be called when *pte is already
> > > > > > > > > + * cleared so we will never accidentally replace something valuable.  Meanwhile
> > > > > > > > > + * none pte also means we are not demoting the pte so if tlb flushed then we
> > > > > > > > > + * don't need to do it again; otherwise if tlb flush is postponed then it's
> > > > > > > > > + * even better.
> > > > > > > > > + *
> > > > > > > > > + * Must be called with pgtable lock held.
> > > > > > > > > + */
> > > > > > > > > +static inline void
> > > > > > > > > +pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
> > > > > > > > > +			      pte_t *pte, pte_t pteval)
> > > > > > > > > +{
> > > > > > > > > +#ifdef CONFIG_USERFAULTFD
> > > > > > > > > +	bool arm_uffd_pte = false;
> > > > > > > > > +
> > > > > > > > > +	/* The current status of the pte should be "cleared" before calling */
> > > > > > > > > +	WARN_ON_ONCE(!pte_none(*pte));
> > > > > > > > > +
> > > > > > > > > +	if (vma_is_anonymous(vma))
> > > > > > > > > +		return;
> > > > > > > > > +
> > > > > > > > > +	/* A uffd-wp wr-protected normal pte */
> > > > > > > > > +	if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval)))
> > > > > > > > > +		arm_uffd_pte = true;
> > > > > > > > > +
> > > > > > > > > +	/*
> > > > > > > > > +	 * A uffd-wp wr-protected swap pte.  Note: this should even work for
> > > > > > > > > +	 * pte_swp_uffd_wp_special() too.
> > > > > > > > > +	 */
> > > > > > > > 
> > > > > > > > I'm probably missing something but when can we actually have this case and why
> > > > > > > > would we want to leave a special pte behind? From what I can tell this is
> > > > > > > > called from try_to_unmap_one() where this won't be true or from zap_pte_range()
> > > > > > > > when not skipping swap pages.
> > > > > > > 
> > > > > > > Yes this is a good question..
> > > > > > > 
> > > > > > > Initially I made this function make sure I cover all forms of uffd-wp bit, that
> > > > > > > contains both swap and present ptes; imho that's pretty safe.  However for
> > > > > > > !anonymous cases we don't keep swap entry inside pte even if swapped out, as
> > > > > > > they should reside in shmem page cache indeed.  The only missing piece seems to
> > > > > > > be the device private entries as you also spotted below.
> > > > > > 
> > > > > > Yes, I think it's *probably* safe although I don't yet have a strong opinion
> > > > > > here ...
> > > > > > 
> > > > > > > > > +	if (unlikely(is_swap_pte(pteval) && pte_swp_uffd_wp(pteval)))
> > > > > > 
> > > > > > ... however if this can never happen would a WARN_ON() be better? It would also
> > > > > > mean you could remove arm_uffd_pte.
> > > > > 
> > > > > Hmm, after a second thought I think we can't make it a WARN_ON_ONCE().. this
> > > > > can still be useful for private mapping of shmem files: in that case we'll have
> > > > > swap entry stored in pte not page cache, so after page reclaim it will contain
> > > > > a valid swap entry, while it's still "!anonymous".
> 
> [1]
> 
> > > > 
> > > > There's something (probably obvious) I must still be missing here. During
> > > > reclaim won't a private shmem mapping still have a present pteval here?
> > > > Therefore it won't trigger this case - the uffd wp bit is set when the swap
> > > > entry is established further down in try_to_unmap_one() right?
> > > 
> > > I agree if it's at the point when it get reclaimed, however what if we zap a
> > > pte of a page already got reclaimed?  It should have the swap pte installed,
> > > imho, which will have "is_swap_pte(pteval) && pte_swp_uffd_wp(pteval)"==true.
> > 
> > Apologies for the delay getting back to this, I hope to find some more time
> > to look at this again this week.
> 
> No problem, please take your time on reviewing the series.
> 
> > 
> > I guess what I am missing is why we care about a swap pte for a reclaimed page
> > getting zapped. I thought that would imply the mapping was getting torn down,
> > although I suppose in that case you still want the uffd-wp to apply in case a
> > new mapping appears there?
> 
> For the torn down case it'll always have ZAP_FLAG_DROP_FILE_UFFD_WP set, so
> pte_install_uffd_wp_if_needed() won't be called, as zap_drop_file_uffd_wp()
> will return true:

Argh, thanks. I had forgotten that bit.

> static inline void
> zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
> 			      unsigned long addr, pte_t *pte,
> 			      struct zap_details *details, pte_t pteval)
> {
> 	if (zap_drop_file_uffd_wp(details))
> 		return;
> 
> 	pte_install_uffd_wp_if_needed(vma, addr, pte, pteval);
> }
> 
> If you see it's non-trivial to fully digest all the caller stacks of it. What I
> wanted to do with pte_install_uffd_wp_if_needed is simply to provide a helper
> that can convert any form of uffd-wp ptes into a pte marker before being set as
> none pte.  Since uffd-wp can exist in two forms (either present, or swap), then
> cover all these two forms (and for swap form also cover the uffd-wp special pte
> itself) is very clear idea and easy to understand to me.  I don't even need to
> worry about who is calling it, and which case can be swap pte, which case must
> not - we just call it when we want to persist the uffd-wp bit (after a pte got
> cleared).  That's why in all cases I still prefer to keep it as is, as it just
> makes things straightforward to me.

Ok, that makes sense. I don't think there is an actual problem here; it was
just a little surprising to me, so I was trying to get a better understanding
of the caller stacks and when this might actually be required. As you say,
though, that is non-trivial, and in any case it's still ok to install these
bits, and a single function is simpler.
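
For anyone following along, here is a rough userspace model of the behaviour
discussed above. It is only a sketch: every toy_* name and struct layout is
made up for illustration, assert() stands in for WARN_ON_ONCE(), and the final
"arm" step is assumed from the comment on the helper rather than copied from
the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel types; not the real pte_t or vma. */
enum toy_pte_kind {
	TOY_PTE_NONE,		/* cleared entry */
	TOY_PTE_PRESENT,	/* mapped page */
	TOY_PTE_SWAP,		/* ordinary swap entry */
	TOY_PTE_UFFD_WP_MARKER,	/* models the uffd-wp special pte */
};

struct toy_pte {
	enum toy_pte_kind kind;
	bool uffd_wp;
};

struct toy_vma {
	bool anonymous;
};

struct toy_zap_details {
	bool drop_file_uffd_wp;	/* models ZAP_FLAG_DROP_FILE_UFFD_WP */
};

/* In this model anything that is neither none nor present counts as swap. */
static inline bool toy_is_swap_pte(struct toy_pte pte)
{
	return pte.kind == TOY_PTE_SWAP || pte.kind == TOY_PTE_UFFD_WP_MARKER;
}

static inline void
toy_pte_install_uffd_wp_if_needed(const struct toy_vma *vma,
				  struct toy_pte *pte, struct toy_pte pteval)
{
	bool arm_uffd_pte = false;

	/* assert() stands in for WARN_ON_ONCE(): the slot must be cleared. */
	assert(pte->kind == TOY_PTE_NONE);

	if (vma->anonymous)
		return;

	/* A uffd-wp wr-protected present pte. */
	if (pteval.kind == TOY_PTE_PRESENT && pteval.uffd_wp)
		arm_uffd_pte = true;

	/* A uffd-wp wr-protected swap pte, including the marker itself. */
	if (toy_is_swap_pte(pteval) && pteval.uffd_wp)
		arm_uffd_pte = true;

	/* Assumed final step: replace the none pte with the marker. */
	if (arm_uffd_pte) {
		pte->kind = TOY_PTE_UFFD_WP_MARKER;
		pte->uffd_wp = true;
	}
}

static inline void
toy_zap_install_uffd_wp_if_needed(const struct toy_vma *vma,
				  struct toy_pte *pte,
				  const struct toy_zap_details *details,
				  struct toy_pte pteval)
{
	/* Teardown path: the uffd-wp state is dropped, nothing is installed. */
	if (details->drop_file_uffd_wp)
		return;

	toy_pte_install_uffd_wp_if_needed(vma, pte, pteval);
}
```

In this model, zapping a file-backed uffd-wp pte (present or swap) without the
drop flag leaves a marker behind, while the teardown path or an anonymous vma
leaves the slot none. The caller never has to know which form pteval was in.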

 - Alistair
 
> Thanks,
> 
> 




