linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] mmap.2: MAP_FIXED updated documentation
@ 2017-12-04  2:14 john.hubbard
       [not found] ` <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: john.hubbard @ 2017-12-04  2:14 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-man, linux-api, Michael Ellerman, linux-mm, LKML,
	linux-arch, Jann Horn, Matthew Wilcox, Michal Hocko,
	John Hubbard

From: John Hubbard <jhubbard@nvidia.com>

Previously, MAP_FIXED was "discouraged", due to portability
issues with the fixed address. In fact, there are other, more
serious issues. Also, in some limited cases, this option can
be used safely.

Expand the documentation to discuss both the hazards, and how
to use it safely.

The "Portability issues" wording is lifted directly from
Matthew Wilcox's review. The notes about other libraries
creating mappings is also from Matthew (lightly edited).

The suggestion to explain how to use MAP_FIXED safely is
from Jann Horn.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---

Changed from v1:

    -- Covered topics recommended by Matthew Wilcox
       and Jann Horn, in their recent review: the hazards
       of overwriting pre-exising mappings, and some notes
       about how to use MAP_FIXED safely.

    -- Rewrote the commit description accordingly.

 man2/mmap.2 | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/man2/mmap.2 b/man2/mmap.2
index 385f3bfd5..9038256d4 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -222,8 +222,42 @@ part of the existing mapping(s) will be discarded.
 If the specified address cannot be used,
 .BR mmap ()
 will fail.
-Because requiring a fixed address for a mapping is less portable,
-the use of this option is discouraged.
+.IP
+This option is extremely hazardous (when used on its own) and moderately
+non-portable.
+.IP
+Portability issues: a process's memory map may change significantly from one
+run to the next, depending on library versions, kernel versions and random
+numbers.
+.IP
+Hazards: this option forcibly removes pre-existing mappings, making it easy
+for a multi-threaded process to corrupt its own address space.
+.IP
+For example, thread A looks through /proc/<pid>/maps and locates an available
+address range, while thread B simultaneously acquires part or all of that same
+address range. Thread A then calls mmap(MAP_FIXED), effectively overwriting
+thread B's mapping.
+.IP
+Thread B need not create a mapping directly; simply making a library call
+that, internally, uses dlopen(3) to load some other shared library, will
+suffice. The dlopen(3) call will map the library into the process's address
+space. Furthermore, almost any library call may be implemented using this
+technique.
+Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries
+(http://www.linux-pam.org).
+.IP
+Given the above limitations, one of the very few ways to use this option
+safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
+region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
+portability problem (because the first mmap call lets the kernel pick the
+address), and the address space corruption problem (because the region being
+overwritten is already owned by the calling thread).
+.IP
+Newer kernels
+(Linux 4.16 and later) have a
+.B MAP_FIXED_SAFE
+option that avoids the corruption problem; if available, MAP_FIXED_SAFE
+should be preferred over MAP_FIXED.
 .TP
 .B MAP_GROWSDOWN
 This flag is used for stacks.
-- 
2.15.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
       [not found] ` <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2017-12-04 10:55   ` Cyril Hrubis
  2017-12-05  2:14     ` John Hubbard
  2017-12-04 11:31   ` Mike Rapoport
  1 sibling, 1 reply; 16+ messages in thread
From: Cyril Hrubis @ 2017-12-04 10:55 UTC (permalink / raw)
  To: john.hubbard-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Michael Kerrisk, linux-man, linux-api-u79uwXL29TY76Z2rM5mHXA,
	Michael Ellerman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML,
	linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox,
	Michal Hocko, John Hubbard

Hi!
I know that we are not touching the rest of the existing description for
MAP_FIXED however the second sentence in the manual page says that "addr
must be a multiple of the page size." Which however is misleading as
this is not enough on some architectures. Code in the wild seems to
(mis)use SHMLBA for aligment purposes but I'm not sure that we should
advise something like that in the manpages.

So what about something as:

"addr must be suitably aligned, for most architectures multiple of page
size is sufficient, however some may impose additional restrictions for
page mapping addresses."

Which should at least hint the reader that this is architecture specific.

-- 
Cyril Hrubis
chrubis-AlSwsSmVLrQ@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
       [not found] ` <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2017-12-04 10:55   ` Cyril Hrubis
@ 2017-12-04 11:31   ` Mike Rapoport
  2017-12-05  2:52     ` John Hubbard
  1 sibling, 1 reply; 16+ messages in thread
From: Mike Rapoport @ 2017-12-04 11:31 UTC (permalink / raw)
  To: john.hubbard-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Michael Kerrisk, linux-man, linux-api-u79uwXL29TY76Z2rM5mHXA,
	Michael Ellerman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML,
	linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox,
	Michal Hocko, John Hubbard

On Sun, Dec 03, 2017 at 06:14:11PM -0800, john.hubbard-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> From: John Hubbard <jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> 
> Previously, MAP_FIXED was "discouraged", due to portability
> issues with the fixed address. In fact, there are other, more
> serious issues. Also, in some limited cases, this option can
> be used safely.
> 
> Expand the documentation to discuss both the hazards, and how
> to use it safely.
> 
> The "Portability issues" wording is lifted directly from
> Matthew Wilcox's review. The notes about other libraries
> creating mappings is also from Matthew (lightly edited).
> 
> The suggestion to explain how to use MAP_FIXED safely is
> from Jann Horn.
> 
> Suggested-by: Matthew Wilcox <willy-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> Suggested-by: Jann Horn <jannh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: John Hubbard <jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
> 
> Changed from v1:
> 
>     -- Covered topics recommended by Matthew Wilcox
>        and Jann Horn, in their recent review: the hazards
>        of overwriting pre-exising mappings, and some notes
>        about how to use MAP_FIXED safely.
> 
>     -- Rewrote the commit description accordingly.
> 
>  man2/mmap.2 | 38 ++++++++++++++++++++++++++++++++++++--
>  1 file changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index 385f3bfd5..9038256d4 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -222,8 +222,42 @@ part of the existing mapping(s) will be discarded.
>  If the specified address cannot be used,
>  .BR mmap ()
>  will fail.
> -Because requiring a fixed address for a mapping is less portable,
> -the use of this option is discouraged.
> +.IP
> +This option is extremely hazardous (when used on its own) and moderately
> +non-portable.
> +.IP
> +Portability issues: a process's memory map may change significantly from one
> +run to the next, depending on library versions, kernel versions and random
> +numbers.
> +.IP
> +Hazards: this option forcibly removes pre-existing mappings, making it easy
> +for a multi-threaded process to corrupt its own address space.
> +.IP
> +For example, thread A looks through /proc/<pid>/maps and locates an available
> +address range, while thread B simultaneously acquires part or all of that same
> +address range. Thread A then calls mmap(MAP_FIXED), effectively overwriting
> +thread B's mapping.
> +.IP
> +Thread B need not create a mapping directly; simply making a library call
> +that, internally, uses dlopen(3) to load some other shared library, will
> +suffice. The dlopen(3) call will map the library into the process's address
> +space. Furthermore, almost any library call may be implemented using this
> +technique.
> +Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries
> +(http://www.linux-pam.org).
> +.IP
> +Given the above limitations, one of the very few ways to use this option
> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
> +portability problem (because the first mmap call lets the kernel pick the
> +address), and the address space corruption problem (because the region being
> +overwritten is already owned by the calling thread).

Maybe "address space corruption problem caused by implicit calls to mmap"?
The region allocated with the first mmap is not exactly owned by the
thread and a multi-thread application can still corrupt its memory if
different threads use mmap(MAP_FIXED) for overlapping regions.

My 2 cents.

> +.IP
> +Newer kernels
> +(Linux 4.16 and later) have a
> +.B MAP_FIXED_SAFE
> +option that avoids the corruption problem; if available, MAP_FIXED_SAFE
> +should be preferred over MAP_FIXED.
>  .TP
>  .B MAP_GROWSDOWN
>  This flag is used for stacks.
> -- 
> 2.15.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo-Bw31MaZKKs0EbZ0PF+XxCw@public.gmane.org  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org"> email-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org </a>
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-04 10:55   ` Cyril Hrubis
@ 2017-12-05  2:14     ` John Hubbard
       [not found]       ` <efb6eae4-7f30-42c3-0efe-0ab5fbf0fdb4-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: John Hubbard @ 2017-12-05  2:14 UTC (permalink / raw)
  To: Cyril Hrubis
  Cc: Michael Kerrisk, linux-man, linux-api, Michael Ellerman,
	linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox,
	Michal Hocko

On 12/04/2017 02:55 AM, Cyril Hrubis wrote:
> Hi!
> I know that we are not touching the rest of the existing description for
> MAP_FIXED however the second sentence in the manual page says that "addr
> must be a multiple of the page size." Which however is misleading as
> this is not enough on some architectures. Code in the wild seems to
> (mis)use SHMLBA for aligment purposes but I'm not sure that we should
> advise something like that in the manpages.
> 
> So what about something as:
> 
> "addr must be suitably aligned, for most architectures multiple of page
> size is sufficient, however some may impose additional restrictions for
> page mapping addresses."
> 

Hi Cyril,

Right, so I've been looking into this today, and I think we can go a bit
further than that, even. The kernel, as far back as the *original* git
commit in 2005, implements mmap on ARM by requiring that the address is
aligned to SHMLBA:

arch/arm/mm/mmap.c:50:

	if (flags & MAP_FIXED) {
		if (aliasing && flags & MAP_SHARED &&
		    (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
			return -EINVAL;
		return addr;
	}

So, given that this has been the implementation for the last 12+ years (and
probably the whole time, in fact), I think we can be bold enough to use this
wording for the second sentence of MAP_FIXED:

"addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either
the system page size (on many architectures) or a multiple of the system
page size (on some architectures)."

What do you think?

thanks,
John Hubbard
NVIDIA

> Which should at least hint the reader that this is architecture specific.
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-04 11:31   ` Mike Rapoport
@ 2017-12-05  2:52     ` John Hubbard
       [not found]       ` <6777116d-ad9e-48c9-0009-01d10274135e-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: John Hubbard @ 2017-12-05  2:52 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Michael Kerrisk, linux-man, linux-api, Michael Ellerman,
	linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox,
	Michal Hocko

On 12/04/2017 03:31 AM, Mike Rapoport wrote:
> On Sun, Dec 03, 2017 at 06:14:11PM -0800, john.hubbard@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>>
[...]
>> +.IP
>> +Given the above limitations, one of the very few ways to use this option
>> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
>> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
>> +portability problem (because the first mmap call lets the kernel pick the
>> +address), and the address space corruption problem (because the region being
>> +overwritten is already owned by the calling thread).
> 
> Maybe "address space corruption problem caused by implicit calls to mmap"?
> The region allocated with the first mmap is not exactly owned by the
> thread and a multi-thread application can still corrupt its memory if
> different threads use mmap(MAP_FIXED) for overlapping regions.
> 
> My 2 cents.
> 

Hi Mike,

Yes, thanks for picking through this, and I agree that the above is misleading.
It should definitely not use the word "owned" at all. Re-doing the whole 
paragraph in order to make it all fit together nicely, I get this:

"Given the above limitations, one of the very few ways to use this option
safely is: mmap() an enclosing region, without specifying MAP_FIXED.
Then, within that region, call mmap(MAP_FIXED) to suballocate regions
within the enclosing region. This avoids both the portability problem 
(because the first mmap call lets the kernel pick the address), and the 
address space corruption problem (because implicit calls to mmap will 
not affect the already-mapped enclosing region)."

...how's that sound to you? I'll post a v3 soon with this.


thanks,
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
       [not found]       ` <efb6eae4-7f30-42c3-0efe-0ab5fbf0fdb4-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2017-12-05  7:05         ` Michal Hocko
  2017-12-05  7:42           ` John Hubbard
  0 siblings, 1 reply; 16+ messages in thread
From: Michal Hocko @ 2017-12-05  7:05 UTC (permalink / raw)
  To: John Hubbard
  Cc: Cyril Hrubis, Michael Kerrisk, linux-man,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Ellerman,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML,
	linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox

On Mon 04-12-17 18:14:18, John Hubbard wrote:
> On 12/04/2017 02:55 AM, Cyril Hrubis wrote:
> > Hi!
> > I know that we are not touching the rest of the existing description for
> > MAP_FIXED however the second sentence in the manual page says that "addr
> > must be a multiple of the page size." Which however is misleading as
> > this is not enough on some architectures. Code in the wild seems to
> > (mis)use SHMLBA for aligment purposes but I'm not sure that we should
> > advise something like that in the manpages.
> > 
> > So what about something as:
> > 
> > "addr must be suitably aligned, for most architectures multiple of page
> > size is sufficient, however some may impose additional restrictions for
> > page mapping addresses."
> > 
> 
> Hi Cyril,
> 
> Right, so I've been looking into this today, and I think we can go a bit
> further than that, even. The kernel, as far back as the *original* git
> commit in 2005, implements mmap on ARM by requiring that the address is
> aligned to SHMLBA:
> 
> arch/arm/mm/mmap.c:50:
> 
> 	if (flags & MAP_FIXED) {
> 		if (aliasing && flags & MAP_SHARED &&
> 		    (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
> 			return -EINVAL;
> 		return addr;
> 	}
> 
> So, given that this has been the implementation for the last 12+ years (and
> probably the whole time, in fact), I think we can be bold enough to use this
> wording for the second sentence of MAP_FIXED:
> 
> "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either
> the system page size (on many architectures) or a multiple of the system
> page size (on some architectures)."
> 
> What do you think?

I am not sure this is a good idea. This is pulling way too many
implementation details into the man page IMHO. Note that your wording is
even incorrect because this applies only to shared mappings and on some
architectures it even requires special memory regions. We do not want
all that in the man page...

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
       [not found]       ` <6777116d-ad9e-48c9-0009-01d10274135e-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2017-12-05  7:08         ` Michal Hocko
  2017-12-05  7:43           ` John Hubbard
  0 siblings, 1 reply; 16+ messages in thread
From: Michal Hocko @ 2017-12-05  7:08 UTC (permalink / raw)
  To: John Hubbard
  Cc: Mike Rapoport, Michael Kerrisk, linux-man,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Ellerman,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML,
	linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox

On Mon 04-12-17 18:52:27, John Hubbard wrote:
> On 12/04/2017 03:31 AM, Mike Rapoport wrote:
> > On Sun, Dec 03, 2017 at 06:14:11PM -0800, john.hubbard-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> >> From: John Hubbard <jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> >>
> [...]
> >> +.IP
> >> +Given the above limitations, one of the very few ways to use this option
> >> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
> >> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
> >> +portability problem (because the first mmap call lets the kernel pick the
> >> +address), and the address space corruption problem (because the region being
> >> +overwritten is already owned by the calling thread).
> > 
> > Maybe "address space corruption problem caused by implicit calls to mmap"?
> > The region allocated with the first mmap is not exactly owned by the
> > thread and a multi-thread application can still corrupt its memory if
> > different threads use mmap(MAP_FIXED) for overlapping regions.
> > 
> > My 2 cents.
> > 
> 
> Hi Mike,
> 
> Yes, thanks for picking through this, and I agree that the above is misleading.
> It should definitely not use the word "owned" at all. Re-doing the whole 
> paragraph in order to make it all fit together nicely, I get this:
> 
> "Given the above limitations, one of the very few ways to use this option
> safely is: mmap() an enclosing region, without specifying MAP_FIXED.
> Then, within that region, call mmap(MAP_FIXED) to suballocate regions
> within the enclosing region. This avoids both the portability problem 
> (because the first mmap call lets the kernel pick the address), and the 
> address space corruption problem (because implicit calls to mmap will 
> not affect the already-mapped enclosing region)."
> 
> ...how's that sound to you? I'll post a v3 soon with this.

It sounds to me you are trying to tell way to much while actually being
a bit misleading. Even sub-range MAP_FIXED is not multi-thread safe.

Really the more corner cases you will try to cover the worse the end
result will end up. I would just try to be simple here and mention the
address space corruption issues you've had earlier and be done with it.
Maybe add a note that some architectures might need a special alignement
and fail if it is not the case but nothing really specific.
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-05  7:05         ` Michal Hocko
@ 2017-12-05  7:42           ` John Hubbard
  2017-12-05  8:52             ` Michal Hocko
       [not found]             ` <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 2 replies; 16+ messages in thread
From: John Hubbard @ 2017-12-05  7:42 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Cyril Hrubis, Michael Kerrisk, linux-man, linux-api,
	Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn,
	Matthew Wilcox

On 12/04/2017 11:05 PM, Michal Hocko wrote:
> On Mon 04-12-17 18:14:18, John Hubbard wrote:
>> On 12/04/2017 02:55 AM, Cyril Hrubis wrote:
>>> Hi!
>>> I know that we are not touching the rest of the existing description for
>>> MAP_FIXED however the second sentence in the manual page says that "addr
>>> must be a multiple of the page size." Which however is misleading as
>>> this is not enough on some architectures. Code in the wild seems to
>>> (mis)use SHMLBA for aligment purposes but I'm not sure that we should
>>> advise something like that in the manpages.
>>>
>>> So what about something as:
>>>
>>> "addr must be suitably aligned, for most architectures multiple of page
>>> size is sufficient, however some may impose additional restrictions for
>>> page mapping addresses."
>>>
>>
>> Hi Cyril,
>>
>> Right, so I've been looking into this today, and I think we can go a bit
>> further than that, even. The kernel, as far back as the *original* git
>> commit in 2005, implements mmap on ARM by requiring that the address is
>> aligned to SHMLBA:
>>
>> arch/arm/mm/mmap.c:50:
>>
>> 	if (flags & MAP_FIXED) {
>> 		if (aliasing && flags & MAP_SHARED &&
>> 		    (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
>> 			return -EINVAL;
>> 		return addr;
>> 	}
>>
>> So, given that this has been the implementation for the last 12+ years (and
>> probably the whole time, in fact), I think we can be bold enough to use this
>> wording for the second sentence of MAP_FIXED:
>>
>> "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either
>> the system page size (on many architectures) or a multiple of the system
>> page size (on some architectures)."
>>
>> What do you think?
> 
> I am not sure this is a good idea. This is pulling way too many
> implementation details into the man page IMHO. Note that your wording is
> even incorrect because this applies only to shared mappings and on some
> architectures it even requires special memory regions. We do not want
> all that in the man page...
> 

Hi Michal,

OK, so it sounds like Cyril's original wording would be just about right,
after all, like this?

"addr must be suitably aligned. For most architectures multiple of page
size is sufficient; however, some may impose additional restrictions."

(It does seem unfortunate that the man page cannot help the programmer
actually write correct code here. He or she is forced to read the kernel
implementation, in order to figure out the true alignment rules. I was
hoping we could avoid that.)

thanks,
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-05  7:08         ` Michal Hocko
@ 2017-12-05  7:43           ` John Hubbard
  0 siblings, 0 replies; 16+ messages in thread
From: John Hubbard @ 2017-12-05  7:43 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Rapoport, Michael Kerrisk, linux-man, linux-api,
	Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn,
	Matthew Wilcox

On 12/04/2017 11:08 PM, Michal Hocko wrote:
> On Mon 04-12-17 18:52:27, John Hubbard wrote:
>> On 12/04/2017 03:31 AM, Mike Rapoport wrote:
>>> On Sun, Dec 03, 2017 at 06:14:11PM -0800, john.hubbard@gmail.com wrote:
>>>> From: John Hubbard <jhubbard@nvidia.com>
>>>>
>> [...]
>>>> +.IP
>>>> +Given the above limitations, one of the very few ways to use this option
>>>> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
>>>> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
>>>> +portability problem (because the first mmap call lets the kernel pick the
>>>> +address), and the address space corruption problem (because the region being
>>>> +overwritten is already owned by the calling thread).
>>>
>>> Maybe "address space corruption problem caused by implicit calls to mmap"?
>>> The region allocated with the first mmap is not exactly owned by the
>>> thread and a multi-thread application can still corrupt its memory if
>>> different threads use mmap(MAP_FIXED) for overlapping regions.
>>>
>>> My 2 cents.
>>>
>>
>> Hi Mike,
>>
>> Yes, thanks for picking through this, and I agree that the above is misleading.
>> It should definitely not use the word "owned" at all. Re-doing the whole 
>> paragraph in order to make it all fit together nicely, I get this:
>>
>> "Given the above limitations, one of the very few ways to use this option
>> safely is: mmap() an enclosing region, without specifying MAP_FIXED.
>> Then, within that region, call mmap(MAP_FIXED) to suballocate regions
>> within the enclosing region. This avoids both the portability problem 
>> (because the first mmap call lets the kernel pick the address), and the 
>> address space corruption problem (because implicit calls to mmap will 
>> not affect the already-mapped enclosing region)."
>>
>> ...how's that sound to you? I'll post a v3 soon with this.
> 
> It sounds to me you are trying to tell way to much while actually being
> a bit misleading. Even sub-range MAP_FIXED is not multi-thread safe.
> 
> Really the more corner cases you will try to cover the worse the end
> result will end up. I would just try to be simple here and mention the
> address space corruption issues you've had earlier and be done with it.
> Maybe add a note that some architectures might need a special alignement
> and fail if it is not the case but nothing really specific.
> 

Sure, I can drop the "how to use this safely" section.  It seemed like a good
idea at the time... :)

thanks,
John Hubbard
NVIDIA

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-05  7:42           ` John Hubbard
@ 2017-12-05  8:52             ` Michal Hocko
       [not found]             ` <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  1 sibling, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2017-12-05  8:52 UTC (permalink / raw)
  To: John Hubbard
  Cc: Cyril Hrubis, Michael Kerrisk, linux-man, linux-api,
	Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn,
	Matthew Wilcox

On Mon 04-12-17 23:42:00, John Hubbard wrote:
> On 12/04/2017 11:05 PM, Michal Hocko wrote:
> > On Mon 04-12-17 18:14:18, John Hubbard wrote:
> >> On 12/04/2017 02:55 AM, Cyril Hrubis wrote:
> >>> Hi!
> >>> I know that we are not touching the rest of the existing description for
> >>> MAP_FIXED however the second sentence in the manual page says that "addr
> >>> must be a multiple of the page size." Which however is misleading as
> >>> this is not enough on some architectures. Code in the wild seems to
> >>> (mis)use SHMLBA for aligment purposes but I'm not sure that we should
> >>> advise something like that in the manpages.
> >>>
> >>> So what about something as:
> >>>
> >>> "addr must be suitably aligned, for most architectures multiple of page
> >>> size is sufficient, however some may impose additional restrictions for
> >>> page mapping addresses."
> >>>
> >>
> >> Hi Cyril,
> >>
> >> Right, so I've been looking into this today, and I think we can go a bit
> >> further than that, even. The kernel, as far back as the *original* git
> >> commit in 2005, implements mmap on ARM by requiring that the address is
> >> aligned to SHMLBA:
> >>
> >> arch/arm/mm/mmap.c:50:
> >>
> >> 	if (flags & MAP_FIXED) {
> >> 		if (aliasing && flags & MAP_SHARED &&
> >> 		    (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
> >> 			return -EINVAL;
> >> 		return addr;
> >> 	}
> >>
> >> So, given that this has been the implementation for the last 12+ years (and
> >> probably the whole time, in fact), I think we can be bold enough to use this
> >> wording for the second sentence of MAP_FIXED:
> >>
> >> "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either
> >> the system page size (on many architectures) or a multiple of the system
> >> page size (on some architectures)."
> >>
> >> What do you think?
> > 
> > I am not sure this is a good idea. This is pulling way too many
> > implementation details into the man page IMHO. Note that your wording is
> > even incorrect because this applies only to shared mappings and on some
> > architectures it even requires special memory regions. We do not want
> > all that in the man page...
> > 
> 
> Hi Michal,
> 
> OK, so it sounds like Cyril's original wording would be just about right,
> after all, like this?
> 
> "addr must be suitably aligned. For most architectures multiple of page
> size is sufficient; however, some may impose additional restrictions."
> 
> (It does seem unfortunate that the man page cannot help the programmer
> actually write correct code here. He or she is forced to read the kernel
> implementation, in order to figure out the true alignment rules. I was
> hoping we could avoid that.)

I strongly suspect that this is more the architecture than the kernel
implementation imposed restriction.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
       [not found]             ` <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2017-12-06 10:01               ` Cyril Hrubis
  2017-12-06 21:21                 ` John Hubbard
  0 siblings, 1 reply; 16+ messages in thread
From: Cyril Hrubis @ 2017-12-06 10:01 UTC (permalink / raw)
  To: John Hubbard
  Cc: Michal Hocko, Michael Kerrisk, linux-man,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Ellerman,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML,
	linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox

Hi!
> (It does seem unfortunate that the man page cannot help the programmer
> actually write correct code here. He or she is forced to read the kernel
> implementation, in order to figure out the true alignment rules. I was
> hoping we could avoid that.)

It would be nice if we had this information exported somehere so that we
do not have to rely on per-architecture ifdefs.

What about adding MapAligment or something similar to the /proc/meminfo?

-- 
Cyril Hrubis
chrubis-AlSwsSmVLrQ@public.gmane.org

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-06 10:01               ` Cyril Hrubis
@ 2017-12-06 21:21                 ` John Hubbard
  2017-12-07 12:58                   ` Cyril Hrubis
  0 siblings, 1 reply; 16+ messages in thread
From: John Hubbard @ 2017-12-06 21:21 UTC (permalink / raw)
  To: Cyril Hrubis
  Cc: Michal Hocko, Michael Kerrisk, linux-man, linux-api,
	Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn,
	Matthew Wilcox

On 12/06/2017 02:01 AM, Cyril Hrubis wrote:
> Hi!
>> (It does seem unfortunate that the man page cannot help the programmer
>> actually write correct code here. He or she is forced to read the kernel
>> implementation, in order to figure out the true alignment rules. I was
>> hoping we could avoid that.)
> 
> It would be nice if we had this information exported somehere so that we
> do not have to rely on per-architecture ifdefs.
> 
> What about adding MapAligment or something similar to the /proc/meminfo?
> 

What's the use case you envision for that? I don't see how that would be
better than using SHMLBA, which is available at compiler time. Because 
unless someone expects to be able to run an app that was compiled for 
Arch X, on Arch Y (surely that's not requirement here?), I don't see how
the run-time check is any better.

Or maybe you're thinking that since the SHMLBA cannot be put in the man
pages, we could instead provide MapAlignment as sort of a different
way to document the requirement?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-06 21:21                 ` John Hubbard
@ 2017-12-07 12:58                   ` Cyril Hrubis
  2017-12-07 14:02                     ` Michal Hocko
  0 siblings, 1 reply; 16+ messages in thread
From: Cyril Hrubis @ 2017-12-07 12:58 UTC (permalink / raw)
  To: John Hubbard
  Cc: Michal Hocko, Michael Kerrisk, linux-man, linux-api,
	Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn,
	Matthew Wilcox

Hi!
> >> (It does seem unfortunate that the man page cannot help the programmer
> >> actually write correct code here. He or she is forced to read the kernel
> >> implementation, in order to figure out the true alignment rules. I was
> >> hoping we could avoid that.)
> > 
> > It would be nice if we had this information exported somehere so that we
> > do not have to rely on per-architecture ifdefs.
> > 
> > What about adding MapAligment or something similar to the /proc/meminfo?
> > 
> 
> What's the use case you envision for that? I don't see how that would be
> better than using SHMLBA, which is available at compiler time. Because 
> unless someone expects to be able to run an app that was compiled for 
> Arch X, on Arch Y (surely that's not requirement here?), I don't see how
> the run-time check is any better.

I guess that some kind of compile time constant in uapi headers will do
as well, I'm really open to any solution that would expose this constant
as some kind of official API.

There is one problem with using SHMLBA, there are more than one libc
implementations and at least two of them (bionic and klibc) does not
implement the SysV IPC at all. I know that the chances that you are
writing a program that requires MAP_FIXED, is compiled against
alternative libc, and expected to run on less common architectures are
pretty slim. On the other hand I do not see a reason why this cannot be
done properly, this is just about exporting one simple constant to
userspace after all.

> Or maybe you're thinking that since the SHMLBA cannot be put in the man
> pages, we could instead provide MapAlignment as sort of a different
> way to document the requirement?

This is my intention as well.

-- 
Cyril Hrubis
chrubis@suse.cz

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-07 12:58                   ` Cyril Hrubis
@ 2017-12-07 14:02                     ` Michal Hocko
  2017-12-09 17:19                       ` Pavel Machek
  0 siblings, 1 reply; 16+ messages in thread
From: Michal Hocko @ 2017-12-07 14:02 UTC (permalink / raw)
  To: Cyril Hrubis
  Cc: John Hubbard, Michael Kerrisk, linux-man, linux-api,
	Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn,
	Matthew Wilcox

On Thu 07-12-17 13:58:05, Cyril Hrubis wrote:
> Hi!
> > >> (It does seem unfortunate that the man page cannot help the programmer
> > >> actually write correct code here. He or she is forced to read the kernel
> > >> implementation, in order to figure out the true alignment rules. I was
> > >> hoping we could avoid that.)
> > > 
> > > It would be nice if we had this information exported somehere so that we
> > > do not have to rely on per-architecture ifdefs.
> > > 
> > > What about adding MapAligment or something similar to the /proc/meminfo?
> > > 
> > 
> > What's the use case you envision for that? I don't see how that would be
> > better than using SHMLBA, which is available at compiler time. Because 
> > unless someone expects to be able to run an app that was compiled for 
> > Arch X, on Arch Y (surely that's not requirement here?), I don't see how
> > the run-time check is any better.
> 
> I guess that some kind of compile time constant in uapi headers will do
> as well, I'm really open to any solution that would expose this constant
> as some kind of official API.

I am not sure this is really feasible. It is not only a simple alignment
thing. Look at ppc for example (slice_get_unmapped_area). Other
architectures might have even more complicated rules e.g. arm and its
cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file
backed mappings.

I would really leave dogs sleeping... Trying to document all this in the
man page has chances to confuse more people than it has chances to help
those who already know all these nasty details.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-07 14:02                     ` Michal Hocko
@ 2017-12-09 17:19                       ` Pavel Machek
  2017-12-10  7:44                         ` John Hubbard
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Machek @ 2017-12-09 17:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Cyril Hrubis, John Hubbard, Michael Kerrisk, linux-man,
	linux-api, Michael Ellerman, linux-mm, LKML, linux-arch,
	Jann Horn, Matthew Wilcox

On Thu 2017-12-07 15:02:21, Michal Hocko wrote:
> On Thu 07-12-17 13:58:05, Cyril Hrubis wrote:
> > Hi!
> > > >> (It does seem unfortunate that the man page cannot help the programmer
> > > >> actually write correct code here. He or she is forced to read the kernel
> > > >> implementation, in order to figure out the true alignment rules. I was
> > > >> hoping we could avoid that.)
> > > > 
> > > > It would be nice if we had this information exported somehere so that we
> > > > do not have to rely on per-architecture ifdefs.
> > > > 
> > > > What about adding MapAligment or something similar to the /proc/meminfo?
> > > > 
> > > 
> > > What's the use case you envision for that? I don't see how that would be
> > > better than using SHMLBA, which is available at compiler time. Because 
> > > unless someone expects to be able to run an app that was compiled for 
> > > Arch X, on Arch Y (surely that's not requirement here?), I don't see how
> > > the run-time check is any better.
> > 
> > I guess that some kind of compile time constant in uapi headers will do
> > as well, I'm really open to any solution that would expose this constant
> > as some kind of official API.
> 
> I am not sure this is really feasible. It is not only a simple alignment
> thing. Look at ppc for example (slice_get_unmapped_area). Other
> architectures might have even more complicated rules e.g. arm and its
> cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file
> backed mappings.
> 
> I would really leave dogs sleeping... Trying to document all this in the
> man page has chances to confuse more people than it has chances to help
> those who already know all these nasty details.

You don't have to provide all the details, but warning that there's arch-
specific magic would be nice...
								Pavel

(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation
  2017-12-09 17:19                       ` Pavel Machek
@ 2017-12-10  7:44                         ` John Hubbard
  0 siblings, 0 replies; 16+ messages in thread
From: John Hubbard @ 2017-12-10  7:44 UTC (permalink / raw)
  To: Pavel Machek, Michal Hocko
  Cc: Cyril Hrubis, Michael Kerrisk, linux-man, linux-api,
	Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn,
	Matthew Wilcox

On 12/09/2017 09:19 AM, Pavel Machek wrote:
> On Thu 2017-12-07 15:02:21, Michal Hocko wrote:
>> On Thu 07-12-17 13:58:05, Cyril Hrubis wrote:
>>> Hi!
>>>>>> (It does seem unfortunate that the man page cannot help the programmer
>>>>>> actually write correct code here. He or she is forced to read the kernel
>>>>>> implementation, in order to figure out the true alignment rules. I was
>>>>>> hoping we could avoid that.)
>>>>>
>>>>> It would be nice if we had this information exported somehere so that we
>>>>> do not have to rely on per-architecture ifdefs.
>>>>>
>>>>> What about adding MapAligment or something similar to the /proc/meminfo?
>>>>>
>>>>
>>>> What's the use case you envision for that? I don't see how that would be
>>>> better than using SHMLBA, which is available at compiler time. Because 
>>>> unless someone expects to be able to run an app that was compiled for 
>>>> Arch X, on Arch Y (surely that's not requirement here?), I don't see how
>>>> the run-time check is any better.
>>>
>>> I guess that some kind of compile time constant in uapi headers will do
>>> as well, I'm really open to any solution that would expose this constant
>>> as some kind of official API.
>>
>> I am not sure this is really feasible. It is not only a simple alignment
>> thing. Look at ppc for example (slice_get_unmapped_area). Other
>> architectures might have even more complicated rules e.g. arm and its
>> cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file
>> backed mappings.
>>
>> I would really leave dogs sleeping... Trying to document all this in the
>> man page has chances to confuse more people than it has chances to help
>> those who already know all these nasty details.
> 
> You don't have to provide all the details, but warning that there's arch-
> specific magic would be nice...

Hi Pavel,

In version 4 of this patch (which oddly enough, I have trouble finding via
google, it only seems to show up in patchwork.kernel.org [1]), I phrased it 
like this:

    Don't interpret addr as a hint: place the mapping at  exactly  that
    address.   addr  must be suitably aligned: for most architectures a
    multiple of page size is sufficient;  however,  some  architectures
    may  impose additional restrictions. 

...which is basically what Cyril was asking for, in his early feedback.
Does that work for you?

(Maybe I need to repost that patch. In any case the CC's need updating,
at least.)

[1] https://patchwork.kernel.org/patch/10094905/

thanks,
-- 
John Hubbard
NVIDIA

> 								Pavel
> 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-12-10  7:44 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-04  2:14 [PATCH v2] mmap.2: MAP_FIXED updated documentation john.hubbard
     [not found] ` <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2017-12-04 10:55   ` Cyril Hrubis
2017-12-05  2:14     ` John Hubbard
     [not found]       ` <efb6eae4-7f30-42c3-0efe-0ab5fbf0fdb4-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2017-12-05  7:05         ` Michal Hocko
2017-12-05  7:42           ` John Hubbard
2017-12-05  8:52             ` Michal Hocko
     [not found]             ` <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2017-12-06 10:01               ` Cyril Hrubis
2017-12-06 21:21                 ` John Hubbard
2017-12-07 12:58                   ` Cyril Hrubis
2017-12-07 14:02                     ` Michal Hocko
2017-12-09 17:19                       ` Pavel Machek
2017-12-10  7:44                         ` John Hubbard
2017-12-04 11:31   ` Mike Rapoport
2017-12-05  2:52     ` John Hubbard
     [not found]       ` <6777116d-ad9e-48c9-0009-01d10274135e-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2017-12-05  7:08         ` Michal Hocko
2017-12-05  7:43           ` John Hubbard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).