linux-kernel.vger.kernel.org archive mirror
From: Mike Rapoport <rppt@linux.vnet.ibm.com>
To: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-man@vger.kernel.org,
	Mike Rapoport <rppt@linux.vnet.ibm.com>
Subject: [PATCH man-pages 1/2] userfaultfd.2: start documenting non-cooperative events
Date: Thu, 27 Apr 2017 17:14:33 +0300
Message-ID: <1493302474-4701-2-git-send-email-rppt@linux.vnet.ibm.com>
In-Reply-To: <1493302474-4701-1-git-send-email-rppt@linux.vnet.ibm.com>

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 man2/userfaultfd.2 | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 128 insertions(+), 7 deletions(-)

diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2
index cfea5cb..44af3e4 100644
--- a/man2/userfaultfd.2
+++ b/man2/userfaultfd.2
@@ -75,7 +75,7 @@ flag in
 .PP
 When the last file descriptor referring to a userfaultfd object is closed,
 all memory ranges that were registered with the object are unregistered
-and unread page-fault events are flushed.
+and unread events are flushed.
 .\"
 .SS Usage
 The userfaultfd mechanism is designed to allow a thread in a multithreaded
@@ -99,6 +99,20 @@ In such non-cooperative mode,
 the process that monitors userfaultfd and handles page faults
 needs to be aware of the changes in the virtual memory layout
 of the faulting process to avoid memory corruption.
+
+Starting from Linux 4.11,
+userfaultfd may notify the fault-handling threads about changes
+in the virtual memory layout of the faulting process.
+In addition, if the faulting process invokes the
+.BR fork (2)
+system call,
+the userfaultfd objects associated with the parent may be duplicated
+into the child process and the userfaultfd monitor will be notified
+about the file descriptor associated with the userfault objects
+created for the child process,
+which allows the userfaultfd monitor to perform user-space paging
+for the child process.
+
 .\" FIXME elaborate about non-cooperating mode, describe its limitations
 .\" for kernels before 4.11, features added in 4.11
 .\" and limitations remaining in 4.11
@@ -144,6 +158,10 @@ Details of the various
 operations can be found in
 .BR ioctl_userfaultfd (2).
 
+Since Linux 4.11, events other than page-fault may be enabled during the
+.B UFFDIO_API
+operation.
+
 Up to Linux 4.11,
 userfaultfd can be used only with anonymous private memory mappings.
 
@@ -156,7 +174,8 @@ Each
 .BR read (2)
 from the userfaultfd file descriptor returns one or more
 .I uffd_msg
-structures, each of which describes a page-fault event:
+structures, each of which describes a page-fault event
+or an event required for non-cooperative userfaultfd usage:
 
 .nf
 .in +4n
@@ -168,6 +187,23 @@ struct uffd_msg {
             __u64 flags;        /* Flags describing fault */
             __u64 address;      /* Faulting address */
         } pagefault;
+        struct {
+            __u32 ufd;          /* userfault file descriptor
+                                   of the child process */
+        } fork;                 /* since Linux 4.11 */
+        struct {
+            __u64 from;         /* old address of the
+                                   remapped area */
+            __u64 to;           /* new address of the
+                                   remapped area */
+            __u64 len;          /* original mapping length */
+        } remap;                /* since Linux 4.11 */
+        struct {
+            __u64 start;        /* start address of the
+                                   removed area */
+            __u64 end;          /* end address of the
+                                   removed area */
+        } remove;               /* since Linux 4.11 */
         ...
     } arg;
 
@@ -194,14 +230,73 @@ structure are as follows:
 .TP
 .I event
 The type of event.
-Currently, only one value can appear in this field:
-.BR UFFD_EVENT_PAGEFAULT ,
-which indicates a page-fault event.
+Depending on the event type,
+different fields of the
+.I arg
+union represent details required for the event processing.
+The non-page-fault events are generated only when the appropriate feature
+is enabled during the API handshake with
+.B UFFDIO_API
+.BR ioctl (2).
+
+The following values can appear in the
+.I event
+field:
+.RS
+.TP
+.B UFFD_EVENT_PAGEFAULT
+A page-fault event.
+The page-fault details are available in the
+.I pagefault
+field.
 .TP
-.I address
+.B UFFD_EVENT_FORK
+Generated when the faulting process invokes the
+.BR fork (2)
+system call.
+The event details are available in the
+.I fork
+field.
+.\" FIXME describe duplication of userfault file descriptor during fork
+.TP
+.B UFFD_EVENT_REMAP
+Generated when the faulting process invokes the
+.BR mremap (2)
+system call.
+The event details are available in the
+.I remap
+field.
+.TP
+.B UFFD_EVENT_REMOVE
+Generated when the faulting process invokes the
+.BR madvise (2)
+system call with
+.BR MADV_DONTNEED
+or
+.BR MADV_REMOVE
+advice.
+The event details are available in the
+.I remove
+field.
+.TP
+.B UFFD_EVENT_UNMAP
+Generated when the faulting process unmaps a memory range,
+either explicitly using the
+.BR munmap (2)
+system call or implicitly during
+.BR mmap (2)
+or
+.BR mremap (2)
+system calls.
+The event details are available in the
+.I remove
+field.
+.RE
+.TP
+.I pagefault.address
 The address that triggered the page fault.
 .TP
-.I flags
+.I pagefault.flags
 A bit mask of flags that describe the event.
 For
 .BR UFFD_EVENT_PAGEFAULT ,
@@ -218,6 +313,32 @@ otherwise it is a read fault.
 .\"
 .\" UFFD_PAGEFAULT_FLAG_WP is not yet supported.
 .RE
+.TP
+.I fork.ufd
+The file descriptor associated with the userfault object
+created for the child process.
+.TP
+.I remap.from
+The original address of the memory range that was remapped using
+.BR mremap (2).
+.TP
+.I remap.to
+The new address of the memory range that was remapped using
+.BR mremap (2).
+.TP
+.I remap.len
+The original length of the memory range that was remapped using
+.BR mremap (2).
+.TP
+.I remove.start
+The start address of the memory range that was freed using
+.BR madvise (2)
+or unmapped.
+.TP
+.I remove.end
+The end address of the memory range that was freed using
+.BR madvise (2)
+or unmapped.
 .PP
 A
 .BR read (2)
-- 
1.9.1
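
For illustration only, here is a minimal sketch of how a monitor might
dispatch on the event field of struct uffd_msg, using the union members
documented in the patch above. The helper name describe_uffd_msg() is
hypothetical and not part of any API; the sketch assumes that
<linux/userfaultfd.h> comes from Linux 4.11 or later, so that the
UFFD_EVENT_* constants and the fork, remap and remove members exist.
It only formats a message that has already been read and performs no
system calls.

```c
#include <linux/userfaultfd.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one already-read uffd_msg into buf
 * and return buf. The arg union member is selected by msg->event.
 */
static const char *describe_uffd_msg(const struct uffd_msg *msg,
                                     char *buf, size_t len)
{
    switch (msg->event) {
    case UFFD_EVENT_PAGEFAULT:
        snprintf(buf, len, "pagefault address=0x%llx flags=0x%llx",
                 (unsigned long long)msg->arg.pagefault.address,
                 (unsigned long long)msg->arg.pagefault.flags);
        break;
    case UFFD_EVENT_FORK:
        /* ufd is the userfaultfd descriptor created for the child */
        snprintf(buf, len, "fork ufd=%u", (unsigned int)msg->arg.fork.ufd);
        break;
    case UFFD_EVENT_REMAP:
        snprintf(buf, len, "remap from=0x%llx to=0x%llx len=%llu",
                 (unsigned long long)msg->arg.remap.from,
                 (unsigned long long)msg->arg.remap.to,
                 (unsigned long long)msg->arg.remap.len);
        break;
    case UFFD_EVENT_REMOVE:
    case UFFD_EVENT_UNMAP:
        /* both events report their range in the remove member */
        snprintf(buf, len, "%s start=0x%llx end=0x%llx",
                 msg->event == UFFD_EVENT_REMOVE ? "remove" : "unmap",
                 (unsigned long long)msg->arg.remove.start,
                 (unsigned long long)msg->arg.remove.end);
        break;
    default:
        snprintf(buf, len, "unknown event 0x%x", (unsigned int)msg->event);
        break;
    }
    return buf;
}
```

In a monitor, each struct uffd_msg returned by read(2) on the
userfaultfd descriptor would be passed to such a helper. The events
other than page faults would only show up if the matching features
were enabled during the UFFDIO_API handshake.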

