From: Peter Xu <peterx@redhat.com>
To: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: "David Hildenbrand" <david@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Michał Mirosław" <emmir@google.com>,
	"Andrei Vagin" <avagin@gmail.com>,
	"Danylo Mocherniuk" <mdanylo@google.com>,
	"Paul Gofman" <pgofman@codeweavers.com>,
	"Cyrill Gorcunov" <gorcunov@gmail.com>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Shuah Khan" <shuah@kernel.org>,
	"Christian Brauner" <brauner@kernel.org>,
	"Yang Shi" <shy828301@gmail.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	"Yun Zhou" <yun.zhou@windriver.com>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Alex Sierra" <alex.sierra@amd.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Pasha Tatashin" <pasha.tatashin@soleen.com>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Nadav Amit" <namit@vmware.com>,
	"Axel Rasmussen" <axelrasmussen@google.com>,
	"Gustavo A . R . Silva" <gustavoars@kernel.org>,
	"Dan Williams" <dan.j.williams@intel.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
	"Greg KH" <gregkh@linuxfoundation.org>,
	kernel@collabora.com
Subject: Re: [PATCH v7 0/4] Implement IOCTL to get and/or the clear info about PTEs
Date: Wed, 18 Jan 2023 17:12:04 -0500
Message-ID: <Y8hutCGec6je5toG@x1n>
In-Reply-To: <20230109064519.3555250-1-usama.anjum@collabora.com>

On Mon, Jan 09, 2023 at 11:45:15AM +0500, Muhammad Usama Anjum wrote:
> *Changes in v7:*
> - Add uffd wp async
> - Update the IOCTL to use uffd under the hood instead of soft-dirty
>   flags
> 
> Stop using the soft-dirty flags for finding which pages have been
> written to. It is too delicate and wrong as it shows more soft-dirty
> pages than the actual soft-dirty pages. There is no interest in
> correcting it [A][B] as this is how the feature was written years ago.
> It shouldn't be updated to changed behaviour. Peter Xu has suggested
> using the async version of the UFFD WP [C] as it is based inherently
> on the PTEs.
> 
> So in this patch series, I've added a new mode to the UFFD which is
> asynchronous version of the write protect. When this variant of the
> UFFD WP is used, the page faults are resolved automatically by the
> kernel. The pages which have been written-to can be found by reading
> pagemap file (!PM_UFFD_WP). This feature can be used successfully to
> find which pages have been written to from the time the pages were
> write protected. This works just like the soft-dirty flag without
> showing any extra pages which aren't soft-dirty in reality.
> 
> [A] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.com
> [B] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.com
> [C] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
> 
> *Changes in v6:*
> - Updated the interface and made cosmetic changes
> 
> *Cover Letter in v5:*
> Hello,

Please consider either dropping the cover letter below this point or
rephrasing it; otherwise many of its statements are no longer true and can
confuse the reviewers.

I have a few high-level comments/questions here; please bear with me if any
of them were already discussed by others in earlier versions.  I'd be happy
to read them if there's a pointer to the relevant answers.

Firstly, a doc update explaining the new interface would be more than
welcome (before throwing the code at us..).  That can be done in pagemap.rst
for the pagemap changes, or in userfaultfd.rst for the userfaultfd ones.

Besides, can you provide more justification on the new pagemap-side
interface design?

It seems it came from the Windows API GetWriteWatch(), but it's definitely
not exactly that.  Let me spell some points out..

There are four kinds of masks (required/anyof/excluded/return).  Are they
all needed?  Why is this a good interface design?

I saw you used the page_region structure to keep the information.  I think
you wanted dense output, which starts to make more sense once the "return
mask" above is taken into account: with a very limited return mask, the
information for many (contiguous) pages can be merged into a single
page_region struct while the kernel is scanning.

However, the other three masks (required/anyof/excluded) left me quite
confused - they mean you want to somehow filter the pages so that only some
of them get collected.  The thing is, within a contiguous page range, if any
page gets skipped due to the masks (e.g. not in "required", or in
"excluded"), the pages after it can never be merged into the previous
page_region either.  That seems to go against the principle of having dense
output.

I hope you can help clarify what the major use case is here.

There's also the new interface to do atomic "fetch + update" on wrprotected
pages.  Is that just for efficiency or is the accuracy required in some of
the applications?
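
To make the accuracy side of that question concrete, here is a toy,
single-threaded model (again an assumption, not the patch's code): the set
of written pages stands in for the kernel state, and all function names
are hypothetical.  With separate "get" and "clear" steps, a write landing
between the two is wiped without ever being reported; a combined step
reports it on the next round.

```python
def get_written(written):
    """Report the written pages without resetting the tracking state."""
    return sorted(written)


def clear_written(written):
    """Reset tracking (stands in for write-protecting the pages again)."""
    written.clear()


def get_and_clear(written):
    """Report and reset in one step, modelling an atomic fetch + update."""
    snapshot = sorted(written)
    written.clear()
    return snapshot


# Two-step variant: page 9 is written between "get" and "clear".
state = {3, 7}
seen_first = get_written(state)
state.add(9)                      # write races with the two-step sequence
clear_written(state)
seen_second = get_written(state)  # page 9 was cleared before being seen

# Combined variant: the same write shows up in the next round instead.
state = {3, 7}
atomic_first = get_and_clear(state)
state.add(9)
atomic_second = get_and_clear(state)
```

Here page 9 never appears in `seen_first` or `seen_second`, but it does
appear in `atomic_second`.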

Thanks,

-- 
Peter Xu

