All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Axel Rasmussen <axelrasmussen@google.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Muhammad Usama Anjum <usama.anjum@collabora.com>
Subject: Re: [PATCH] mm/uffd: UFFD_FEATURE_WP_ZEROPAGE
Date: Thu, 2 Mar 2023 15:12:55 +0100	[thread overview]
Message-ID: <c7fc5bc3-7f86-c699-2968-a861ab44989a@redhat.com> (raw)
In-Reply-To: <Y/5ZNTESKfntBSF3@x1n>

>>
>> uffd-wp protecting a range:
>> * !pte_none() -> set uffd-wp bit and wrprotect
>> * pte_none() -> nothing to do
>> * PTE_UFFD_WP -> nothing to do
>> * PTE_UFFD_NO_WP -> set PTE_UFFD_WP
>>
>> unmapping a page (old way: !pte_none() -> pte_none()):
>> * uffd-wp bit set: set PTE_UFFD_WP
>> * uffd-wp bit not set: set PTE_UFFD_NO_WP
>>
>> (re)mapping a page (old: pte_none() -> !pte_none()):
>> * PTE_UFFD_WP -> set pte bit for new PTE
>> * PTE_UFFD_NO_WP -> don't set pte bit for new PTE
>> * pte_none() -> set pte bit for new PTE
>>
>> Zapping an anon page using MADV_DONTNEED is a bit confusing. It's actually
>> similar to a memory write (-> write zeroes), but we don't notify uffd-wp for
>> that (I think that's something you comment on below). Theoretically, we'd
>> want to set PTE_UFFD_NO_WP ("dirty") in the async mode. But that might need
>> more thought of what the expected semantics actually are.
>>
>> When we walk over the page tables we would get the following information
>> after protecting the range:
>>
>> * PTE_UFFD_WP -> clean, not modified since last protection round
>> * PTE_UFFD_NO_WP -> dirty, modified since last protection round
>> * pte_none() -> not mapped and therefore not modified since beginning of
>>                  protection.
>> * !pte_none() -> uffd-wp bit decides
> 
> I can't say I thought a lot but I feel like it may work. I'd probably avoid
> calling it PTE_UFFD_NO_WP or it'll be confusing.. maybe WP_WRITTEN or
> WP_RESOLVED instead.

I haven't thought about this further, but I maybe we might be able to 
just use a single PTE marker , because pte_none() would translate to 
PTE_UFFD_WP in such VMAs. So we could make the meaning of the 
PTE_UFFD_WP marker simply depend on the type of VMA.

If I am not wrong, we could stop setting PTE_UFFD_WP completely then, 
for any memory type (shmem/anon/hugetlb).

> 
> But that interface looks weird in that the protection happens right after
> VM_UFFD_WP applied to VMA and that keeps true until unregister.  One needs
> to reprotect using ioctl(UFFDIO_WRITEPROTECT) OTOH after the 1st round of
> tracking.  It just looks a little bit over-complicated, not to mention we
> will need two markers only for userfault-wp.  I had a feeling this
> complexity can cause us some trouble elsewhere.

When to apply this logic is indeed the interesting part. Doing it right 
from the beginning when the feature is enabled sounds like the simplest 
approach to me.  For background snapshots and dirty tracking that might 
just be good enough.


> 
> IIUC this can be something done on top even if it'll work (I think the

I think the API would have to change eventually, to enable it via a new 
feature ("unpopulated implies uffd-wp is active").

> userspace API doesn't need to change at all), so I'd suggest giving it some
> more thoughts and we start with simple and working.

Yes.

-- 
Thanks,

David / dhildenb


  reply	other threads:[~2023-03-02 14:13 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-15 21:02 [PATCH] mm/uffd: UFFD_FEATURE_WP_ZEROPAGE Peter Xu
2023-02-16 10:47 ` David Hildenbrand
2023-02-16 16:29   ` Peter Xu
2023-02-16 17:00     ` David Hildenbrand
2023-02-16 17:55       ` Peter Xu
2023-02-16 18:23         ` David Hildenbrand
2023-02-16 20:08           ` Peter Xu
2023-02-17 11:41             ` David Hildenbrand
2023-02-17 23:04               ` Peter Xu
2023-02-21 12:43                 ` David Hildenbrand
2023-02-21 23:13                   ` Peter Xu
2023-02-22 17:02                     ` David Hildenbrand
2023-02-22 20:37                       ` Peter Xu
2023-02-23 14:35                         ` David Hildenbrand
2023-02-28 19:42                           ` Peter Xu
2023-03-02 14:12                             ` David Hildenbrand [this message]
2023-02-17 12:31             ` Muhammad Usama Anjum
2023-02-17 23:10               ` Peter Xu
2023-02-20  7:15                 ` Muhammad Usama Anjum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c7fc5bc3-7f86-c699-2968-a861ab44989a@redhat.com \
    --to=david@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nadav.amit@gmail.com \
    --cc=peterx@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=usama.anjum@collabora.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.