linux-man.vger.kernel.org archive mirror
* [PATCH v2] mremap.2: Add information for MREMAP_DONTUNMAP.
@ 2020-04-15 16:49 Brian Geffon
  2020-04-16  7:07 ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 6+ messages in thread
From: Brian Geffon @ 2020-04-15 16:49 UTC (permalink / raw)
  To: mtk.manpages
  Cc: linux-man, Sonny Rao, Jesse Barnes, Vlastimil Babka, Brian Geffon

Signed-off-by: Brian Geffon <bgeffon@google.com>
---
 man2/mremap.2 | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/man2/mremap.2 b/man2/mremap.2
index d73fb64fa..ff5939ff1 100644
--- a/man2/mremap.2
+++ b/man2/mremap.2
@@ -129,6 +129,22 @@ If
 is specified, then
 .B MREMAP_MAYMOVE
 must also be specified.
+.TP
+.BR MREMAP_DONTUNMAP " (since Linux 5.7)"
+.\" commit e346b3813067d4b17383f975f197a9aa28a3b077
+This flag which must be used in conjunction with
+.B MREMAP_MAYMOVE
+remaps a mapping to a new address and it does not unmap the mapping at
+.BR old_address .
+This flag can only be used with private anonymous mappings.
+Any access to the range specified at
+.BR old_address
+after completion will result in an anonymous page fault.
+The anonymous page fault will be handled by a
+.BR userfaultfd (2)
+if the range was previously registered on the mapping specified by
+.BR old_address .
+Otherwise, it will be zero filled by the kernel.
 .PP
 If the memory segment specified by
 .I old_address
@@ -176,6 +192,8 @@ a value other than
 .B MREMAP_MAYMOVE
 or
 .B MREMAP_FIXED
+or
+.B MREMAP_DONTUNMAP
 was specified in
 .IR flags ;
 .IP *
@@ -197,9 +215,22 @@ and
 .IR old_size ;
 .IP *
 .B MREMAP_FIXED
+or
+.B MREMAP_DONTUNMAP
 was specified without also specifying
 .BR MREMAP_MAYMOVE ;
 .IP *
+.B MREMAP_DONTUNMAP
+was specified with an
+.BR old_address
+that was not private anonymous;
+.IP *
+.B MREMAP_DONTUNMAP
+was specified and
+.BR old_size
+was not equal to
+.BR new_size ;
+.IP *
 \fIold_size\fP was zero and \fIold_address\fP does not refer to a
 shareable mapping (but see BUGS);
 .IP *
@@ -209,10 +240,20 @@ flag was not specified.
 .RE
 .TP
 .B ENOMEM
+Not enough memory was available to complete the operation.
+Possible causes are:
+.RS
+.IP * 3
 The memory area cannot be expanded at the current virtual address, and the
 .B MREMAP_MAYMOVE
 flag is not set in \fIflags\fP.
 Or, there is not enough (virtual) memory available.
+.IP *
+.B MREMAP_DONTUNMAP
+was used causing a new mapping to be created that would exceed the
+(virtual) memory available.
+Or, it would exceed the maximum number of allowed mappings.
+.RE
 .SH CONFORMING TO
 This call is Linux-specific, and should not be used in programs
 intended to be portable.
@@ -238,6 +279,14 @@ call will make a best effort to populate the new area but will not fail
 with
 .B ENOMEM
 if the area cannot be populated.
+.PP
+The
+.BR MREMAP_DONTUNMAP
+flag may be used to atomically move a mapping while leaving the source
+mapped.
+Possible applications for this behavior might be garbage collection or
+non-cooperative
+.BR userfaultfd (2) .
 .SH BUGS
 Before Linux 4.14,
 if
-- 
2.26.0.110.g2183baf09c-goog
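
The two new EINVAL cases that do not depend on the mapping type
(MREMAP_DONTUNMAP without MREMAP_MAYMOVE, and old_size != new_size) can
be checked with a minimal sketch like the one below. The helper name
check_dontunmap_einval() is hypothetical, and the fallback #define
assumes the flag value 4 from the kernel UAPI headers for C libraries
that do not yet expose it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: older libc headers may lack the flag; 4 is the kernel UAPI value. */
#ifndef MREMAP_DONTUNMAP
#define MREMAP_DONTUNMAP 4
#endif

/* Hypothetical helper: returns 0 if both invalid calls fail with EINVAL,
 * -1 otherwise. */
int check_dontunmap_einval(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    assert(pagesz > 0);
    size_t page = (size_t)pagesz;

    void *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int ok = 1;

    /* Case 1: MREMAP_DONTUNMAP given without MREMAP_MAYMOVE. */
    errno = 0;
    void *r = mremap(p, page, page, MREMAP_DONTUNMAP, NULL);
    if (r != MAP_FAILED || errno != EINVAL)
        ok = 0;

    /* Case 2: old_size differs from new_size. */
    errno = 0;
    r = mremap(p, page, 2 * page, MREMAP_MAYMOVE | MREMAP_DONTUNMAP, NULL);
    if (r != MAP_FAILED || errno != EINVAL)
        ok = 0;

    munmap(p, page);
    return ok ? 0 : -1;
}
```

Calling check_dontunmap_einval() from a test program should return 0.
A kernel that predates the flag rejects it as an unknown flag, which
also yields EINVAL, so the sketch cannot tell the two situations apart.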



* Re: [PATCH v2] mremap.2: Add information for MREMAP_DONTUNMAP.
  2020-04-15 16:49 [PATCH v2] mremap.2: Add information for MREMAP_DONTUNMAP Brian Geffon
@ 2020-04-16  7:07 ` Michael Kerrisk (man-pages)
  2020-04-17  3:01   ` Brian Geffon
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-04-16  7:07 UTC (permalink / raw)
  To: Brian Geffon
  Cc: mtk.manpages, linux-man, Sonny Rao, Jesse Barnes,
	Vlastimil Babka, Minchan Kim, Kirill A. Shutemov, Lokesh Gidra

[CC expanded to include a few people who tested/acked/reviewed the
original kernel patch.]

Hello Brian,

Thanks for this patch. I've applied it, and done quite a
bit of editing. Could you please take a look at the
version in Git, and let me know if I made any bad changes
to your text.

In addition, I have one or two questions below.

On 4/15/20 6:49 PM, Brian Geffon wrote:
> Signed-off-by: Brian Geffon <bgeffon@google.com>
> ---
>  man2/mremap.2 | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> diff --git a/man2/mremap.2 b/man2/mremap.2
> index d73fb64fa..ff5939ff1 100644
> --- a/man2/mremap.2
> +++ b/man2/mremap.2
> @@ -129,6 +129,22 @@ If
>  is specified, then
>  .B MREMAP_MAYMOVE
>  must also be specified.
> +.TP
> +.BR MREMAP_DONTUNMAP " (since Linux 5.7)"
> +.\" commit e346b3813067d4b17383f975f197a9aa28a3b077
> +This flag which must be used in conjunction with
> +.B MREMAP_MAYMOVE
> +remaps a mapping to a new address and it does not unmap the mapping at
> +.BR old_address .
> +This flag can only be used with private anonymous mappings.
> +Any access to the range specified at
> +.BR old_address
> +after completion will result in an anonymous page fault.
> +The anonymous page fault will be handled by a
> +.BR userfaultfd (2)
> +if the range was previously registered on the mapping specified by
> +.BR old_address .
> +Otherwise, it will be zero filled by the kernel.
>  .PP
>  If the memory segment specified by
>  .I old_address
> @@ -176,6 +192,8 @@ a value other than
>  .B MREMAP_MAYMOVE
>  or
>  .B MREMAP_FIXED
> +or
> +.B MREMAP_DONTUNMAP
>  was specified in
>  .IR flags ;
>  .IP *
> @@ -197,9 +215,22 @@ and
>  .IR old_size ;
>  .IP *
>  .B MREMAP_FIXED
> +or
> +.B MREMAP_DONTUNMAP
>  was specified without also specifying
>  .BR MREMAP_MAYMOVE ;
>  .IP *
> +.B MREMAP_DONTUNMAP
> +was specified with an
> +.BR old_address
> +that was not private anonymous;
> +.IP *
> +.B MREMAP_DONTUNMAP
> +was specified and
> +.BR old_size
> +was not equal to
> +.BR new_size ;
> +.IP *
>  \fIold_size\fP was zero and \fIold_address\fP does not refer to a
>  shareable mapping (but see BUGS);
>  .IP *
> @@ -209,10 +240,20 @@ flag was not specified.
>  .RE
>  .TP
>  .B ENOMEM
> +Not enough memory was available to complete the operation.
> +Possible causes are:
> +.RS
> +.IP * 3
>  The memory area cannot be expanded at the current virtual address, and the
>  .B MREMAP_MAYMOVE
>  flag is not set in \fIflags\fP.
>  Or, there is not enough (virtual) memory available.
> +.IP *
> +.B MREMAP_DONTUNMAP
> +was used causing a new mapping to be created that would exceed the
> +(virtual) memory available.
> +Or, it would exceed the maximum number of allowed mappings.
> +.RE
>  .SH CONFORMING TO
>  This call is Linux-specific, and should not be used in programs
>  intended to be portable.
> @@ -238,6 +279,14 @@ call will make a best effort to populate the new area but will not fail
>  with
>  .B ENOMEM
>  if the area cannot be populated.
> +.PP
> +The
> +.BR MREMAP_DONTUNMAP
> +flag may be used to atomically move a mapping while leaving the source
> +mapped.

You write "move", but would it not be more correct to say something
like "duplicate"?

> +Possible applications for this behavior might be garbage collection or

Can you elaborate the garbage collection use case a little, please?

> +non-cooperative
> +.BR userfaultfd (2) .

What is noncooperative userfaultfd(2)?

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [PATCH v2] mremap.2: Add information for MREMAP_DONTUNMAP.
  2020-04-16  7:07 ` Michael Kerrisk (man-pages)
@ 2020-04-17  3:01   ` Brian Geffon
  2020-04-22  0:15     ` Lokesh Gidra
  2020-04-22 12:05     ` Michael Kerrisk (man-pages)
  0 siblings, 2 replies; 6+ messages in thread
From: Brian Geffon @ 2020-04-17  3:01 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: linux-man, Sonny Rao, Jesse Barnes, Vlastimil Babka, Minchan Kim,
	Kirill A. Shutemov, Lokesh Gidra

Hi Michael,

> Thanks for this patch. I've applied it, and done quite a
> bit of editing. Could you please take a look at the
> version in Git, and let me know if I made any bad changes
> to your text.

Your changes look good.

> You write "move", but would it not be more correct to say something
> like "duplicate"?

It's a little of both: it duplicates the VMA but moves the page table
entries. So the behavior feels more like a move followed by the
creation of a new mapping with the same properties as the previous
one. Does that make sense?
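
The "move plus fresh mapping" behavior described above can be observed
with a minimal sketch like the following. The helper name
demo_dontunmap() is hypothetical, the fallback #define assumes the
kernel UAPI value 4, and no userfaultfd is registered, so the old range
is expected to read back as zeros.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: older libc headers may lack the flag; 4 is the kernel UAPI value. */
#ifndef MREMAP_DONTUNMAP
#define MREMAP_DONTUNMAP 4
#endif

/* Hypothetical helper.  Returns 0 if the observed behavior matches the
 * description, 1 if the kernel rejects the call with EINVAL (for
 * example, a kernel without MREMAP_DONTUNMAP), -1 on any other failure. */
int demo_dontunmap(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    assert(pagesz > 0);
    size_t len = (size_t)pagesz;

    unsigned char *src = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (src == MAP_FAILED)
        return -1;
    memset(src, 0xAB, len);              /* fault the page in */

    /* The kernel picks the destination; the source stays mapped. */
    unsigned char *dst = mremap(src, len, len,
                                MREMAP_MAYMOVE | MREMAP_DONTUNMAP, NULL);
    if (dst == MAP_FAILED) {
        int rc = (errno == EINVAL) ? 1 : -1;
        munmap(src, len);
        return rc;
    }

    /* The page table entries moved: the data is now visible at dst. */
    int moved = dst != src && dst[0] == 0xAB && dst[len - 1] == 0xAB;
    /* The old VMA is still there, but touching it faults in zero pages. */
    int refilled = src[0] == 0 && src[len - 1] == 0;

    munmap(dst, len);
    munmap(src, len);
    return (moved && refilled) ? 0 : -1;
}
```

On a kernel with MREMAP_DONTUNMAP, demo_dontunmap() should return 0; a
return value of 1 only says that the running kernel rejected the call.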

> > +Possible applications for this behavior might be garbage collection or
>
> Can you elaborate the garbage collection use case a little, please?

Lokesh, who is CCed, can probably expand on this better than I can.
Lokesh, would you mind elaborating on how the JVM plans to use this?

> > +non-cooperative
> > +.BR userfaultfd (2) .
>
> What is noncooperative userfaultfd(2)?

Non-cooperative userfaultfd is the term that people tend to use when
the threads accessing the memory are not cooperating with the fault
handling. MREMAP_DONTUNMAP is interesting for this because you can yank
out the page tables from a running process and immediately start
handling faults for the registered range without having to stop the
process.

I hope that answers your questions, feel free to ask if you need more
clarification.

Brian


* Re: [PATCH v2] mremap.2: Add information for MREMAP_DONTUNMAP.
  2020-04-17  3:01   ` Brian Geffon
@ 2020-04-22  0:15     ` Lokesh Gidra
  2020-04-22 12:08       ` Michael Kerrisk (man-pages)
  2020-04-22 12:05     ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 6+ messages in thread
From: Lokesh Gidra @ 2020-04-22  0:15 UTC (permalink / raw)
  To: Brian Geffon
  Cc: Michael Kerrisk (man-pages),
	linux-man, Sonny Rao, Jesse Barnes, Vlastimil Babka, Minchan Kim,
	Kirill A. Shutemov

On Thu, Apr 16, 2020 at 8:02 PM Brian Geffon <bgeffon@google.com> wrote:
>
> Hi Michael,
>
> > Thanks for this patch. I've applied it, and done quite a
> > bit of editing. Could you please take a look at the
> > version in Git, and let me know if I made any bad changes
> > to your text.
>
> Your changes look good.
>
> > You write "move", but would it not be more correct to say something
> > like "duplicate"?
>
> It's a little of both, it duplicates the VMA but moves the page table
> entries. So the behavior feels more like a move followed by a new
> mapping created that had the same properties as the previous. Does
> that make sense?
>
> > > +Possible applications for this behavior might be garbage collection or
> >
> > Can you elaborate the garbage collection use case a little, please?
>
> Lokesh, who is CCed, can probably expand better than I can, Lokesh
> would you mind elaborating on how the JVM plans to use this.
>
There are many GC algorithms in the literature which use the
PROT_NONE+SIGSEGV trick to implement concurrent compaction of the Java
heap. In Android
Runtime we plan to use userfaultfd instead. But this requires a
stop-the-world, wherein Java threads are paused, right before starting
the compaction phase. Within this pause, the physical pages in the
Java heap will be moved to another area, so that the Java heap, which
is already registered with userfaultfd, can start 'userfaulting' (as
Java heap pages are missing) once application threads are resumed.

In the absence of MREMAP_DONTUNMAP, I'd have to do it by first doing
mremap, and then mmapping the Java heap, as its virtual mapping would be
removed by the preceding mremap. This not only causes performance
issues as two system calls need to be made instead of one, but it also
leaves a window open for a native thread, which is not paused, to
create a virtual mapping for its own usage right where Java heap is
supposed to be.
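
The two-call alternative described above might look roughly like the
sketch below. The names move_then_remap() and two_step_demo() are
hypothetical, and the sketch assumes the destination range has already
been reserved by the caller (here with a PROT_NONE mapping) so that
MREMAP_FIXED has somewhere to land.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: move the pages of [heap, heap + len) onto the
 * already-reserved range at dst, then map fresh anonymous memory back
 * at heap.  Returns 0 on success, -1 on failure. */
int move_then_remap(void *heap, void *dst, size_t len)
{
    assert(heap != dst && len > 0);

    /* Call 1: the pages move to dst and the mapping at heap goes away. */
    if (mremap(heap, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst)
        == MAP_FAILED)
        return -1;

    /* Window: nothing is mapped at heap here, so an unpaused thread
     * calling mmap() may be handed this range. */

    /* Call 2: put an empty mapping back.  MAP_FIXED would silently
     * replace whatever another thread mapped here in the meantime. */
    if (mmap(heap, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
        return -1;
    return 0;
}

/* Hypothetical driver: returns 0 if the data ended up at dst and the
 * old range reads back as zeros, -1 otherwise. */
int two_step_demo(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *heap = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    unsigned char *dst = mmap(NULL, len, PROT_NONE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (heap == MAP_FAILED || dst == MAP_FAILED)
        return -1;

    memset(heap, 0x5A, len);
    if (move_then_remap(heap, dst, len) != 0)
        return -1;

    int ok = dst[0] == 0x5A && heap[0] == 0;
    munmap(dst, len);
    munmap(heap, len);
    return ok ? 0 : -1;
}
```

With MREMAP_DONTUNMAP the two calls in move_then_remap() collapse into a
single mremap(), and the range at heap is never left unmapped.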

> > > +non-cooperative
> > > +.BR userfaultfd (2) .
> >
> > What is noncooperative userfaultfd(2)?
>
> Non-cooperative userfaultfd is the term that people tend to use when
> the threads accessing the memory are not cooperating with the fault
> handling, MREMAP_DONTUNMAP is interesting for this as you can yank out
> the page tables from a running process and immediately start handling
> faults for the registered range without having to stop the process.
>
> I hope that answers your questions, feel free to ask if you need more
> clarification.
>
> Brian


* Re: [PATCH v2] mremap.2: Add information for MREMAP_DONTUNMAP.
  2020-04-17  3:01   ` Brian Geffon
  2020-04-22  0:15     ` Lokesh Gidra
@ 2020-04-22 12:05     ` Michael Kerrisk (man-pages)
  1 sibling, 0 replies; 6+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-04-22 12:05 UTC (permalink / raw)
  To: Brian Geffon
  Cc: mtk.manpages, linux-man, Sonny Rao, Jesse Barnes,
	Vlastimil Babka, Minchan Kim, Kirill A. Shutemov, Lokesh Gidra

On 4/17/20 5:01 AM, Brian Geffon wrote:
> Hi Michael,
> 
>> Thanks for this patch. I've applied it, and done quite a
>> bit of editing. Could you please take a look at the
>> version in Git, and let me know if I made any bad changes
>> to your text.
> 
> Your changes look good.
> 
>> You write "move", but would it not be more correct to say something
>> like "duplicate"?
> 
> It's a little of both, it duplicates the VMA but moves the page table
> entries. So the behavior feels more like a move followed by a new
> mapping created that had the same properties as the previous. Does
> that make sense?
> 
>>> +Possible applications for this behavior might be garbage collection or
>>
>> Can you elaborate the garbage collection use case a little, please?
> 
> Lokesh, who is CCed, can probably expand better than I can, Lokesh
> would you mind elaborating on how the JVM plans to use this.
> 
>>> +non-cooperative
>>> +.BR userfaultfd (2) .
>>
>> What is noncooperative userfaultfd(2)?
> 
> Non-cooperative userfaultfd is the term that people tend to use when
> the threads accessing the memory are not cooperating with the fault
> handling, MREMAP_DONTUNMAP is interesting for this as you can yank out
> the page tables from a running process and immediately start handling
> faults for the registered range without having to stop the process.
> 
> I hope that answers your questions, feel free to ask if you need more
> clarification.

Thanks, Brian. See my reply to Lokesh in just a moment.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [PATCH v2] mremap.2: Add information for MREMAP_DONTUNMAP.
  2020-04-22  0:15     ` Lokesh Gidra
@ 2020-04-22 12:08       ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-04-22 12:08 UTC (permalink / raw)
  To: Lokesh Gidra, Brian Geffon
  Cc: mtk.manpages, linux-man, Sonny Rao, Jesse Barnes,
	Vlastimil Babka, Minchan Kim, Kirill A. Shutemov

Hello Brian and Lokesh,

>>>> +Possible applications for this behavior might be garbage collection or
>>>
>>> Can you elaborate the garbage collection use case a little, please?
>>
>> Lokesh, who is CCed, can probably expand better than I can, Lokesh
>> would you mind elaborating on how the JVM plans to use this.
>>
> There are many GC algorithms in literature which use PROT_NONE+SIGSEGV
> trick to implement concurrent compaction of java heap. In Android
> Runtime we plan to use userfaultfd instead. But this requires a
> stop-the-world, wherein Java threads are paused, right before starting
> the compaction phase. Within this pause, the physical pages in the
> Java heap will be moved to another area, so that the Java heap, which
> is already registered with userfaultfd, can start 'userfaulting' (as
> Java heap pages are missing) once application threads are resumed.
> 
> In the absence of MREMAP_DONTUNMAP, I'd have to do it by first doing
> mremap, and then mmaping Java heap, as its virtual mapping would be
> removed by the preceding mremap. This not only causes performance
> issues as two system calls need to be made instead of one, but it also
> leaves a window open for a native thread, which is not paused, to
> create a virtual mapping for its own usage right where Java heap is
> supposed to be.

Thank you both for your explanations.

I added some text to the page. Does the following look okay?

   MREMAP_DONTUNMAP use cases
       Possible applications for MREMAP_DONTUNMAP include:

       *  Non-cooperative userfaultfd(2): an application can yank  out  a
          virtual  address range using MREMAP_DONTUNMAP and then employ a
          userfaultfd(2) handler to handle the page  faults  that  subse‐
          quently  occur  as  other threads in the process touch pages in
          the yanked range.

       *  Garbage collection: MREMAP_DONTUNMAP can be used in conjunction
          with  userfaultfd(2) to implement garbage collection algorithms
          (e.g., in a Java virtual machine).  Such an implementation  can
          be  cheaper  (and simpler) than conventional garbage collection
          techniques that involve marking pages with protection PROT_NONE
          in  conjunction with the use of a SIGSEGV handler to catch accesses
          to those pages.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
