All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>,
	Jerome Glisse <jglisse@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Hugh Dickins <hughd@google.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>
Subject: Re: [PATCH 00/23] userfaultfd-wp: Support shmem and hugetlbfs
Date: Wed, 21 Apr 2021 12:03:52 -0400	[thread overview]
Message-ID: <20210421160352.GJ4440@xz-x1> (raw)
In-Reply-To: <20210323004912.35132-1-peterx@redhat.com>

On Mon, Mar 22, 2021 at 08:48:49PM -0400, Peter Xu wrote:
> This patchset is based on tag v5.12-rc3-mmots-2021-03-17-22-26.  To run the
> selftest, need to apply the two patches to fix minor mode page leak:
> 
> https://lore.kernel.org/lkml/20210322175132.36659-1-peterx@redhat.com/
> https://lore.kernel.org/lkml/20210322204836.1650221-1-axelrasmussen@google.com/
> 
> Since I didn't get any NACK in the previous RFC series for months, I decided to
> remove the RFC tag starting from this version, so this is v1 of uffd-wp support
> on shmem & hugetlb.
> 
> The whole series can also be found online [1].
> 
> The major comment I'd like to get is on the new idea of swap special pte.  That
> comes from suggestions from both Hugh and Andrea and I appreciated a lot for
> those discussions.
> 
> In short, the so-called "swap special pte" in this patchset is a new type of
> pte that doesn't exist in the past, but it got used initially in this series in
> file-backed memories.  It is used to persist information even if the ptes got
> dropped meanwhile when the page cache still existed.  For example, when
> splitting a file-backed huge pmd, we could be simply dropping the pmd entry
> then wait until another fault coming.  It's okay in the past since all
> information in the pte can be retained from the page cache when the next page
> fault triggers.  However in this case, uffd-wp is per-pte information which
> cannot be kept in page cache, so that information needs to be maintained
> somehow still in the pgtable entry, even if the pgtable entry is going to be
> dropped.  Here instead of replacing with a none entry, we used the "swap
> special pte".  Then when the next page fault triggers, we can observe orig_pte
> to retain this information.
> 
> I'm copy-pasting some commit message from the patch "mm/swap: Introduce the
> idea of special swap ptes", where it tried to explain this pte in another angle:
> 
>     We used to have special swap entries, like migration entries, hw-poison
>     entries, device private entries, etc.
> 
>     Those "special swap entries" reside in the range that they need to be at least
>     swap entries first, and their types are decided by swp_type(entry).
> 
>     This patch introduces another idea called "special swap ptes".
> 
>     It's very easy to get confused against "special swap entries", but a speical
>     swap pte should never contain a swap entry at all.  It means, it's illegal to
>     call pte_to_swp_entry() upon a special swap pte.
> 
>     Make the uffd-wp special pte to be the first special swap pte.
> 
>     Before this patch, is_swap_pte()==true means one of the below:
> 
>        (a.1) The pte has a normal swap entry (non_swap_entry()==false).  For
>              example, when an anonymous page got swapped out.
> 
>        (a.2) The pte has a special swap entry (non_swap_entry()==true).  For
>              example, a migration entry, a hw-poison entry, etc.
> 
>     After this patch, is_swap_pte()==true means one of the below, where case (b) is
>     added:
> 
>      (a) The pte contains a swap entry.
> 
>        (a.1) The pte has a normal swap entry (non_swap_entry()==false).  For
>              example, when an anonymous page got swapped out.
> 
>        (a.2) The pte has a special swap entry (non_swap_entry()==true).  For
>              example, a migration entry, a hw-poison entry, etc.
> 
>      (b) The pte does not contain a swap entry at all (so it cannot be passed
>          into pte_to_swp_entry()).  For example, uffd-wp special swap pte.
> 
> Hugetlbfs needs similar thing because it's also file-backed.  I directly reused
> the same special pte there, though the shmem/hugetlb change on supporting this
> new pte is different since they don't share code path a lot.
> 
> Patch layout
> ============
> 
> Part (1): Shmem support, this is where the special swap pte is introduced.
> Some zap rework is needed within the process:
> 
>   shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
>   mm: Clear vmf->pte after pte_unmap_same() returns
>   mm/userfaultfd: Introduce special pte for unmapped file-backed mem
>   mm/swap: Introduce the idea of special swap ptes
>   shmem/userfaultfd: Handle uffd-wp special pte in page fault handler
>   mm: Drop first_index/last_index in zap_details
>   mm: Introduce zap_details.zap_flags
>   mm: Introduce ZAP_FLAG_SKIP_SWAP
>   mm: Pass zap_flags into unmap_mapping_pages()
>   shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed
>   shmem/userfaultfd: Allow wr-protect none pte for file-backed mem
>   shmem/userfaultfd: Allows file-back mem to be uffd wr-protected on thps
>   shmem/userfaultfd: Handle the left-overed special swap ptes
>   shmem/userfaultfd: Pass over uffd-wp special swap pte when fork()
> 
> Part (2): Hugetlb support, we need to disable huge pmd sharing for uffd-wp
> because not compatible just like uffd minor mode.  The rest is the changes
> required to teach hugetlbfs understand the special swap pte too that introduced
> with the uffd-wp change:
> 
>   hugetlb/userfaultfd: Hook page faults for uffd write protection
>   hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
>   hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT
>   hugetlb: Pass vma into huge_pte_alloc()
>   hugetlb/userfaultfd: Forbid huge pmd sharing when uffd enabled
>   mm/hugetlb: Introduce huge version of special swap pte helpers
>   mm/hugetlb: Move flush_hugetlb_tlb_range() into hugetlb.h
>   hugetlb/userfaultfd: Unshare all pmds for hugetlbfs when register wp
>   hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler
>   hugetlb/userfaultfd: Allow wr-protect none ptes
>   hugetlb/userfaultfd: Only drop uffd-wp special pte if required
> 
> Part (3): Enable both features in code and test
> 
>   userfaultfd: Enable write protection for shmem & hugetlbfs
>   userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs
> 
> Tests
> =========
> 
> I've tested it using either userfaultfd kselftest program, but also with
> umapsort [2] which should be even stricter.  Tested page swapping in/out during
> umapsort.
> 
> If anyone would like to try umapsort, need to use an extremely hacked version
> of umap library [3], because by default umap only supports anonymous.  So to
> test it we need to build [3] then [2].
> 
> Any comment would be greatly welcomed.  Thanks,
> 
> [1] https://github.com/xzpeter/linux/tree/uffd-wp-shmem-hugetlbfs
> [2] https://github.com/LLNL/umap-apps
> [3] https://github.com/xzpeter/umap/tree/peter-shmem-hugetlbfs

Hugh, Mike, Andrew,

Any comment for this series?

Thanks,

-- 
Peter Xu


  parent reply	other threads:[~2021-04-21 16:04 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-23  0:48 [PATCH 00/23] userfaultfd-wp: Support shmem and hugetlbfs Peter Xu
2021-03-23  0:48 ` [PATCH 01/23] shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2021-03-23  0:48 ` [PATCH 02/23] mm: Clear vmf->pte after pte_unmap_same() returns Peter Xu
2021-03-23  2:34   ` Miaohe Lin
2021-03-23 15:40     ` Peter Xu
2021-03-23  0:48 ` [PATCH 03/23] mm/userfaultfd: Introduce special pte for unmapped file-backed mem Peter Xu
2021-03-23  0:48 ` [PATCH 04/23] mm/swap: Introduce the idea of special swap ptes Peter Xu
2021-03-23  0:48 ` [PATCH 05/23] shmem/userfaultfd: Handle uffd-wp special pte in page fault handler Peter Xu
2021-03-23  0:48 ` [PATCH 06/23] mm: Drop first_index/last_index in zap_details Peter Xu
2021-03-23  0:48 ` [PATCH 07/23] mm: Introduce zap_details.zap_flags Peter Xu
2021-03-23  2:11   ` Matthew Wilcox
2021-03-23 15:43     ` Peter Xu
2021-03-23  0:48 ` [PATCH 08/23] mm: Introduce ZAP_FLAG_SKIP_SWAP Peter Xu
2021-03-23  0:48 ` [PATCH 09/23] mm: Pass zap_flags into unmap_mapping_pages() Peter Xu
2021-03-23  0:48 ` [PATCH 10/23] shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed Peter Xu
2021-03-23  0:49 ` [PATCH 11/23] shmem/userfaultfd: Allow wr-protect none pte for file-backed mem Peter Xu
2021-03-23  0:49 ` [PATCH 12/23] shmem/userfaultfd: Allows file-back mem to be uffd wr-protected on thps Peter Xu
2021-03-23  0:49 ` [PATCH 13/23] shmem/userfaultfd: Handle the left-overed special swap ptes Peter Xu
2021-03-23  0:49 ` [PATCH 14/23] shmem/userfaultfd: Pass over uffd-wp special swap pte when fork() Peter Xu
2021-03-23  0:49 ` [PATCH 15/23] hugetlb/userfaultfd: Hook page faults for uffd write protection Peter Xu
2021-04-21 22:02   ` Mike Kravetz
2021-03-23  0:49 ` [PATCH 16/23] hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2021-04-21 23:06   ` Mike Kravetz
2021-04-22  1:14     ` Peter Xu
2021-03-23  0:49 ` [PATCH 17/23] hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT Peter Xu
2021-04-22 18:22   ` Mike Kravetz
2021-03-23  0:49 ` [PATCH 18/23] mm/hugetlb: Introduce huge version of special swap pte helpers Peter Xu
2021-04-22 19:00   ` Mike Kravetz
2021-03-23  0:50 ` [PATCH 19/23] hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler Peter Xu
2021-04-22 22:45   ` Mike Kravetz
2021-04-26  2:08     ` Peter Xu
2021-03-23  0:50 ` [PATCH 20/23] hugetlb/userfaultfd: Allow wr-protect none ptes Peter Xu
2021-04-23  0:08   ` Mike Kravetz
2021-03-23  0:50 ` [PATCH 21/23] hugetlb/userfaultfd: Only drop uffd-wp special pte if required Peter Xu
2021-04-23 20:33   ` Mike Kravetz
2021-04-26 21:16     ` Peter Xu
2021-04-26 21:36       ` Mike Kravetz
2021-04-26 22:05         ` Peter Xu
2021-04-26 23:09           ` Mike Kravetz
2021-03-23  0:50 ` [PATCH 22/23] mm/userfaultfd: Enable write protection for shmem & hugetlbfs Peter Xu
2021-03-23  0:50 ` [PATCH 23/23] userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs Peter Xu
2021-03-23  0:54 ` [PATCH 00/23] userfaultfd-wp: Support shmem and hugetlbfs Peter Xu
2021-04-21 16:03 ` Peter Xu [this message]
2021-04-21 21:39   ` Mike Kravetz
2021-04-22  1:16     ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210421160352.GJ4440@xz-x1 \
    --to=peterx@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=hughd@google.com \
    --cc=jglisse@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=nadav.amit@gmail.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.