* [PATCH v2] mmap.2: MAP_FIXED updated documentation @ 2017-12-04 2:14 john.hubbard [not found] ` <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 16+ messages in thread From: john.hubbard @ 2017-12-04 2:14 UTC (permalink / raw) To: Michael Kerrisk Cc: linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox, Michal Hocko, John Hubbard From: John Hubbard <jhubbard@nvidia.com> Previously, MAP_FIXED was "discouraged", due to portability issues with the fixed address. In fact, there are other, more serious issues. Also, in some limited cases, this option can be used safely. Expand the documentation to discuss both the hazards, and how to use it safely. The "Portability issues" wording is lifted directly from Matthew Wilcox's review. The notes about other libraries creating mappings is also from Matthew (lightly edited). The suggestion to explain how to use MAP_FIXED safely is from Jann Horn. Suggested-by: Matthew Wilcox <willy@infradead.org> Suggested-by: Jann Horn <jannh@google.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> --- Changed from v1: -- Covered topics recommended by Matthew Wilcox and Jann Horn, in their recent review: the hazards of overwriting pre-exising mappings, and some notes about how to use MAP_FIXED safely. -- Rewrote the commit description accordingly. man2/mmap.2 | 38 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-) diff --git a/man2/mmap.2 b/man2/mmap.2 index 385f3bfd5..9038256d4 100644 --- a/man2/mmap.2 +++ b/man2/mmap.2 @@ -222,8 +222,42 @@ part of the existing mapping(s) will be discarded. If the specified address cannot be used, .BR mmap () will fail. -Because requiring a fixed address for a mapping is less portable, -the use of this option is discouraged. +.IP +This option is extremely hazardous (when used on its own) and moderately +non-portable. +.IP +Portability issues: a process's memory map may change significantly from one +run to the next, depending on library versions, kernel versions and random +numbers. +.IP +Hazards: this option forcibly removes pre-existing mappings, making it easy +for a multi-threaded process to corrupt its own address space. +.IP +For example, thread A looks through /proc/<pid>/maps and locates an available +address range, while thread B simultaneously acquires part or all of that same +address range. Thread A then calls mmap(MAP_FIXED), effectively overwriting +thread B's mapping. +.IP +Thread B need not create a mapping directly; simply making a library call +that, internally, uses dlopen(3) to load some other shared library, will +suffice. The dlopen(3) call will map the library into the process's address +space. Furthermore, almost any library call may be implemented using this +technique. +Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries +(http://www.linux-pam.org). +.IP +Given the above limitations, one of the very few ways to use this option +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the +portability problem (because the first mmap call lets the kernel pick the +address), and the address space corruption problem (because the region being +overwritten is already owned by the calling thread). +.IP +Newer kernels +(Linux 4.16 and later) have a +.B MAP_FIXED_SAFE +option that avoids the corruption problem; if available, MAP_FIXED_SAFE +should be preferred over MAP_FIXED. .TP .B MAP_GROWSDOWN This flag is used for stacks. -- 2.15.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 16+ messages in thread
[parent not found: <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation [not found] ` <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> @ 2017-12-04 10:55 ` Cyril Hrubis 2017-12-05 2:14 ` John Hubbard 2017-12-04 11:31 ` Mike Rapoport 1 sibling, 1 reply; 16+ messages in thread From: Cyril Hrubis @ 2017-12-04 10:55 UTC (permalink / raw) To: john.hubbard-Re5JQEeQqe8AvxtiuMwx3w Cc: Michael Kerrisk, linux-man, linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Ellerman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML, linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox, Michal Hocko, John Hubbard Hi! I know that we are not touching the rest of the existing description for MAP_FIXED however the second sentence in the manual page says that "addr must be a multiple of the page size." Which however is misleading as this is not enough on some architectures. Code in the wild seems to (mis)use SHMLBA for aligment purposes but I'm not sure that we should advise something like that in the manpages. So what about something as: "addr must be suitably aligned, for most architectures multiple of page size is sufficient, however some may impose additional restrictions for page mapping addresses." Which should at least hint the reader that this is architecture specific. -- Cyril Hrubis chrubis-AlSwsSmVLrQ@public.gmane.org -- To unsubscribe from this list: send the line "unsubscribe linux-man" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-04 10:55 ` Cyril Hrubis @ 2017-12-05 2:14 ` John Hubbard [not found] ` <efb6eae4-7f30-42c3-0efe-0ab5fbf0fdb4-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 16+ messages in thread From: John Hubbard @ 2017-12-05 2:14 UTC (permalink / raw) To: Cyril Hrubis Cc: Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox, Michal Hocko On 12/04/2017 02:55 AM, Cyril Hrubis wrote: > Hi! > I know that we are not touching the rest of the existing description for > MAP_FIXED however the second sentence in the manual page says that "addr > must be a multiple of the page size." Which however is misleading as > this is not enough on some architectures. Code in the wild seems to > (mis)use SHMLBA for aligment purposes but I'm not sure that we should > advise something like that in the manpages. > > So what about something as: > > "addr must be suitably aligned, for most architectures multiple of page > size is sufficient, however some may impose additional restrictions for > page mapping addresses." > Hi Cyril, Right, so I've been looking into this today, and I think we can go a bit further than that, even. The kernel, as far back as the *original* git commit in 2005, implements mmap on ARM by requiring that the address is aligned to SHMLBA: arch/arm/mm/mmap.c:50: if (flags & MAP_FIXED) { if (aliasing && flags & MAP_SHARED && (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1)) return -EINVAL; return addr; } So, given that this has been the implementation for the last 12+ years (and probably the whole time, in fact), I think we can be bold enough to use this wording for the second sentence of MAP_FIXED: "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either the system page size (on many architectures) or a multiple of the system page size (on some architectures)." What do you think? thanks, John Hubbard NVIDIA > Which should at least hint the reader that this is architecture specific. > ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <efb6eae4-7f30-42c3-0efe-0ab5fbf0fdb4-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation [not found] ` <efb6eae4-7f30-42c3-0efe-0ab5fbf0fdb4-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> @ 2017-12-05 7:05 ` Michal Hocko 2017-12-05 7:42 ` John Hubbard 0 siblings, 1 reply; 16+ messages in thread From: Michal Hocko @ 2017-12-05 7:05 UTC (permalink / raw) To: John Hubbard Cc: Cyril Hrubis, Michael Kerrisk, linux-man, linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Ellerman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML, linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox On Mon 04-12-17 18:14:18, John Hubbard wrote: > On 12/04/2017 02:55 AM, Cyril Hrubis wrote: > > Hi! > > I know that we are not touching the rest of the existing description for > > MAP_FIXED however the second sentence in the manual page says that "addr > > must be a multiple of the page size." Which however is misleading as > > this is not enough on some architectures. Code in the wild seems to > > (mis)use SHMLBA for aligment purposes but I'm not sure that we should > > advise something like that in the manpages. > > > > So what about something as: > > > > "addr must be suitably aligned, for most architectures multiple of page > > size is sufficient, however some may impose additional restrictions for > > page mapping addresses." > > > > Hi Cyril, > > Right, so I've been looking into this today, and I think we can go a bit > further than that, even. The kernel, as far back as the *original* git > commit in 2005, implements mmap on ARM by requiring that the address is > aligned to SHMLBA: > > arch/arm/mm/mmap.c:50: > > if (flags & MAP_FIXED) { > if (aliasing && flags & MAP_SHARED && > (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1)) > return -EINVAL; > return addr; > } > > So, given that this has been the implementation for the last 12+ years (and > probably the whole time, in fact), I think we can be bold enough to use this > wording for the second sentence of MAP_FIXED: > > "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either > the system page size (on many architectures) or a multiple of the system > page size (on some architectures)." > > What do you think? I am not sure this is a good idea. This is pulling way too many implementation details into the man page IMHO. Note that your wording is even incorrect because this applies only to shared mappings and on some architectures it even requires special memory regions. We do not want all that in the man page... -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-05 7:05 ` Michal Hocko @ 2017-12-05 7:42 ` John Hubbard 2017-12-05 8:52 ` Michal Hocko [not found] ` <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 0 siblings, 2 replies; 16+ messages in thread From: John Hubbard @ 2017-12-05 7:42 UTC (permalink / raw) To: Michal Hocko Cc: Cyril Hrubis, Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox On 12/04/2017 11:05 PM, Michal Hocko wrote: > On Mon 04-12-17 18:14:18, John Hubbard wrote: >> On 12/04/2017 02:55 AM, Cyril Hrubis wrote: >>> Hi! >>> I know that we are not touching the rest of the existing description for >>> MAP_FIXED however the second sentence in the manual page says that "addr >>> must be a multiple of the page size." Which however is misleading as >>> this is not enough on some architectures. Code in the wild seems to >>> (mis)use SHMLBA for aligment purposes but I'm not sure that we should >>> advise something like that in the manpages. >>> >>> So what about something as: >>> >>> "addr must be suitably aligned, for most architectures multiple of page >>> size is sufficient, however some may impose additional restrictions for >>> page mapping addresses." >>> >> >> Hi Cyril, >> >> Right, so I've been looking into this today, and I think we can go a bit >> further than that, even. The kernel, as far back as the *original* git >> commit in 2005, implements mmap on ARM by requiring that the address is >> aligned to SHMLBA: >> >> arch/arm/mm/mmap.c:50: >> >> if (flags & MAP_FIXED) { >> if (aliasing && flags & MAP_SHARED && >> (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1)) >> return -EINVAL; >> return addr; >> } >> >> So, given that this has been the implementation for the last 12+ years (and >> probably the whole time, in fact), I think we can be bold enough to use this >> wording for the second sentence of MAP_FIXED: >> >> "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either >> the system page size (on many architectures) or a multiple of the system >> page size (on some architectures)." >> >> What do you think? > > I am not sure this is a good idea. This is pulling way too many > implementation details into the man page IMHO. Note that your wording is > even incorrect because this applies only to shared mappings and on some > architectures it even requires special memory regions. We do not want > all that in the man page... > Hi Michal, OK, so it sounds like Cyril's original wording would be just about right, after all, like this? "addr must be suitably aligned. For most architectures multiple of page size is sufficient; however, some may impose additional restrictions." (It does seem unfortunate that the man page cannot help the programmer actually write correct code here. He or she is forced to read the kernel implementation, in order to figure out the true alignment rules. I was hoping we could avoid that.) thanks, John Hubbard NVIDIA ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-05 7:42 ` John Hubbard @ 2017-12-05 8:52 ` Michal Hocko [not found] ` <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 1 sibling, 0 replies; 16+ messages in thread From: Michal Hocko @ 2017-12-05 8:52 UTC (permalink / raw) To: John Hubbard Cc: Cyril Hrubis, Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox On Mon 04-12-17 23:42:00, John Hubbard wrote: > On 12/04/2017 11:05 PM, Michal Hocko wrote: > > On Mon 04-12-17 18:14:18, John Hubbard wrote: > >> On 12/04/2017 02:55 AM, Cyril Hrubis wrote: > >>> Hi! > >>> I know that we are not touching the rest of the existing description for > >>> MAP_FIXED however the second sentence in the manual page says that "addr > >>> must be a multiple of the page size." Which however is misleading as > >>> this is not enough on some architectures. Code in the wild seems to > >>> (mis)use SHMLBA for aligment purposes but I'm not sure that we should > >>> advise something like that in the manpages. > >>> > >>> So what about something as: > >>> > >>> "addr must be suitably aligned, for most architectures multiple of page > >>> size is sufficient, however some may impose additional restrictions for > >>> page mapping addresses." > >>> > >> > >> Hi Cyril, > >> > >> Right, so I've been looking into this today, and I think we can go a bit > >> further than that, even. The kernel, as far back as the *original* git > >> commit in 2005, implements mmap on ARM by requiring that the address is > >> aligned to SHMLBA: > >> > >> arch/arm/mm/mmap.c:50: > >> > >> if (flags & MAP_FIXED) { > >> if (aliasing && flags & MAP_SHARED && > >> (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1)) > >> return -EINVAL; > >> return addr; > >> } > >> > >> So, given that this has been the implementation for the last 12+ years (and > >> probably the whole time, in fact), I think we can be bold enough to use this > >> wording for the second sentence of MAP_FIXED: > >> > >> "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either > >> the system page size (on many architectures) or a multiple of the system > >> page size (on some architectures)." > >> > >> What do you think? > > > > I am not sure this is a good idea. This is pulling way too many > > implementation details into the man page IMHO. Note that your wording is > > even incorrect because this applies only to shared mappings and on some > > architectures it even requires special memory regions. We do not want > > all that in the man page... > > > > Hi Michal, > > OK, so it sounds like Cyril's original wording would be just about right, > after all, like this? > > "addr must be suitably aligned. For most architectures multiple of page > size is sufficient; however, some may impose additional restrictions." > > (It does seem unfortunate that the man page cannot help the programmer > actually write correct code here. He or she is forced to read the kernel > implementation, in order to figure out the true alignment rules. I was > hoping we could avoid that.) I strongly suspect that this is more the architecture than the kernel implementation imposed restriction. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation [not found] ` <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> @ 2017-12-06 10:01 ` Cyril Hrubis 2017-12-06 21:21 ` John Hubbard 0 siblings, 1 reply; 16+ messages in thread From: Cyril Hrubis @ 2017-12-06 10:01 UTC (permalink / raw) To: John Hubbard Cc: Michal Hocko, Michael Kerrisk, linux-man, linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Ellerman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML, linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox Hi! > (It does seem unfortunate that the man page cannot help the programmer > actually write correct code here. He or she is forced to read the kernel > implementation, in order to figure out the true alignment rules. I was > hoping we could avoid that.) It would be nice if we had this information exported somehere so that we do not have to rely on per-architecture ifdefs. What about adding MapAligment or something similar to the /proc/meminfo? -- Cyril Hrubis chrubis-AlSwsSmVLrQ@public.gmane.org ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-06 10:01 ` Cyril Hrubis @ 2017-12-06 21:21 ` John Hubbard 2017-12-07 12:58 ` Cyril Hrubis 0 siblings, 1 reply; 16+ messages in thread From: John Hubbard @ 2017-12-06 21:21 UTC (permalink / raw) To: Cyril Hrubis Cc: Michal Hocko, Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox On 12/06/2017 02:01 AM, Cyril Hrubis wrote: > Hi! >> (It does seem unfortunate that the man page cannot help the programmer >> actually write correct code here. He or she is forced to read the kernel >> implementation, in order to figure out the true alignment rules. I was >> hoping we could avoid that.) > > It would be nice if we had this information exported somehere so that we > do not have to rely on per-architecture ifdefs. > > What about adding MapAligment or something similar to the /proc/meminfo? > What's the use case you envision for that? I don't see how that would be better than using SHMLBA, which is available at compiler time. Because unless someone expects to be able to run an app that was compiled for Arch X, on Arch Y (surely that's not requirement here?), I don't see how the run-time check is any better. Or maybe you're thinking that since the SHMLBA cannot be put in the man pages, we could instead provide MapAlignment as sort of a different way to document the requirement? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-06 21:21 ` John Hubbard @ 2017-12-07 12:58 ` Cyril Hrubis 2017-12-07 14:02 ` Michal Hocko 0 siblings, 1 reply; 16+ messages in thread From: Cyril Hrubis @ 2017-12-07 12:58 UTC (permalink / raw) To: John Hubbard Cc: Michal Hocko, Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox Hi! > >> (It does seem unfortunate that the man page cannot help the programmer > >> actually write correct code here. He or she is forced to read the kernel > >> implementation, in order to figure out the true alignment rules. I was > >> hoping we could avoid that.) > > > > It would be nice if we had this information exported somehere so that we > > do not have to rely on per-architecture ifdefs. > > > > What about adding MapAligment or something similar to the /proc/meminfo? > > > > What's the use case you envision for that? I don't see how that would be > better than using SHMLBA, which is available at compiler time. Because > unless someone expects to be able to run an app that was compiled for > Arch X, on Arch Y (surely that's not requirement here?), I don't see how > the run-time check is any better. I guess that some kind of compile time constant in uapi headers will do as well, I'm really open to any solution that would expose this constant as some kind of official API. There is one problem with using SHMLBA, there are more than one libc implementations and at least two of them (bionic and klibc) does not implement the SysV IPC at all. I know that the chances that you are writing a program that requires MAP_FIXED, is compiled against alternative libc, and expected to run on less common architectures are pretty slim. On the other hand I do not see a reason why this cannot be done properly, this is just about exporting one simple constant to userspace after all. > Or maybe you're thinking that since the SHMLBA cannot be put in the man > pages, we could instead provide MapAlignment as sort of a different > way to document the requirement? This is my intention as well. -- Cyril Hrubis chrubis@suse.cz -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-07 12:58 ` Cyril Hrubis @ 2017-12-07 14:02 ` Michal Hocko 2017-12-09 17:19 ` Pavel Machek 0 siblings, 1 reply; 16+ messages in thread From: Michal Hocko @ 2017-12-07 14:02 UTC (permalink / raw) To: Cyril Hrubis Cc: John Hubbard, Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox On Thu 07-12-17 13:58:05, Cyril Hrubis wrote: > Hi! > > >> (It does seem unfortunate that the man page cannot help the programmer > > >> actually write correct code here. He or she is forced to read the kernel > > >> implementation, in order to figure out the true alignment rules. I was > > >> hoping we could avoid that.) > > > > > > It would be nice if we had this information exported somehere so that we > > > do not have to rely on per-architecture ifdefs. > > > > > > What about adding MapAligment or something similar to the /proc/meminfo? > > > > > > > What's the use case you envision for that? I don't see how that would be > > better than using SHMLBA, which is available at compiler time. Because > > unless someone expects to be able to run an app that was compiled for > > Arch X, on Arch Y (surely that's not requirement here?), I don't see how > > the run-time check is any better. > > I guess that some kind of compile time constant in uapi headers will do > as well, I'm really open to any solution that would expose this constant > as some kind of official API. I am not sure this is really feasible. It is not only a simple alignment thing. Look at ppc for example (slice_get_unmapped_area). Other architectures might have even more complicated rules e.g. arm and its cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file backed mappings. I would really leave dogs sleeping... Trying to document all this in the man page has chances to confuse more people than it has chances to help those who already know all these nasty details. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-07 14:02 ` Michal Hocko @ 2017-12-09 17:19 ` Pavel Machek 2017-12-10 7:44 ` John Hubbard 0 siblings, 1 reply; 16+ messages in thread From: Pavel Machek @ 2017-12-09 17:19 UTC (permalink / raw) To: Michal Hocko Cc: Cyril Hrubis, John Hubbard, Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox On Thu 2017-12-07 15:02:21, Michal Hocko wrote: > On Thu 07-12-17 13:58:05, Cyril Hrubis wrote: > > Hi! > > > >> (It does seem unfortunate that the man page cannot help the programmer > > > >> actually write correct code here. He or she is forced to read the kernel > > > >> implementation, in order to figure out the true alignment rules. I was > > > >> hoping we could avoid that.) > > > > > > > > It would be nice if we had this information exported somehere so that we > > > > do not have to rely on per-architecture ifdefs. > > > > > > > > What about adding MapAligment or something similar to the /proc/meminfo? > > > > > > > > > > What's the use case you envision for that? I don't see how that would be > > > better than using SHMLBA, which is available at compiler time. Because > > > unless someone expects to be able to run an app that was compiled for > > > Arch X, on Arch Y (surely that's not requirement here?), I don't see how > > > the run-time check is any better. > > > > I guess that some kind of compile time constant in uapi headers will do > > as well, I'm really open to any solution that would expose this constant > > as some kind of official API. > > I am not sure this is really feasible. It is not only a simple alignment > thing. Look at ppc for example (slice_get_unmapped_area). Other > architectures might have even more complicated rules e.g. arm and its > cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file > backed mappings. > > I would really leave dogs sleeping... Trying to document all this in the > man page has chances to confuse more people than it has chances to help > those who already know all these nasty details. You don't have to provide all the details, but warning that there's arch- specific magic would be nice... Pavel (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-09 17:19 ` Pavel Machek @ 2017-12-10 7:44 ` John Hubbard 0 siblings, 0 replies; 16+ messages in thread From: John Hubbard @ 2017-12-10 7:44 UTC (permalink / raw) To: Pavel Machek, Michal Hocko Cc: Cyril Hrubis, Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox On 12/09/2017 09:19 AM, Pavel Machek wrote: > On Thu 2017-12-07 15:02:21, Michal Hocko wrote: >> On Thu 07-12-17 13:58:05, Cyril Hrubis wrote: >>> Hi! >>>>>> (It does seem unfortunate that the man page cannot help the programmer >>>>>> actually write correct code here. He or she is forced to read the kernel >>>>>> implementation, in order to figure out the true alignment rules. I was >>>>>> hoping we could avoid that.) >>>>> >>>>> It would be nice if we had this information exported somehere so that we >>>>> do not have to rely on per-architecture ifdefs. >>>>> >>>>> What about adding MapAligment or something similar to the /proc/meminfo? >>>>> >>>> >>>> What's the use case you envision for that? I don't see how that would be >>>> better than using SHMLBA, which is available at compiler time. Because >>>> unless someone expects to be able to run an app that was compiled for >>>> Arch X, on Arch Y (surely that's not requirement here?), I don't see how >>>> the run-time check is any better. >>> >>> I guess that some kind of compile time constant in uapi headers will do >>> as well, I'm really open to any solution that would expose this constant >>> as some kind of official API. >> >> I am not sure this is really feasible. It is not only a simple alignment >> thing. Look at ppc for example (slice_get_unmapped_area). Other >> architectures might have even more complicated rules e.g. arm and its >> cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file >> backed mappings. >> >> I would really leave dogs sleeping... Trying to document all this in the >> man page has chances to confuse more people than it has chances to help >> those who already know all these nasty details. > > You don't have to provide all the details, but warning that there's arch- > specific magic would be nice... Hi Pavel, In version 4 of this patch (which oddly enough, I have trouble finding via google, it only seems to show up in patchwork.kernel.org [1]), I phrased it like this: Don't interpret addr as a hint: place the mapping at exactly that address. addr must be suitably aligned: for most architectures a multiple of page size is sufficient; however, some architectures may impose additional restrictions. ...which is basically what Cyril was asking for, in his early feedback. Does that work for you? (Maybe I need to repost that patch. In any case the CC's need updating, at least.) [1] https://patchwork.kernel.org/patch/10094905/ thanks, -- John Hubbard NVIDIA > Pavel > > (english) http://www.livejournal.com/~pavelmachek > (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation [not found] ` <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 2017-12-04 10:55 ` Cyril Hrubis @ 2017-12-04 11:31 ` Mike Rapoport 2017-12-05 2:52 ` John Hubbard 1 sibling, 1 reply; 16+ messages in thread From: Mike Rapoport @ 2017-12-04 11:31 UTC (permalink / raw) To: john.hubbard-Re5JQEeQqe8AvxtiuMwx3w Cc: Michael Kerrisk, linux-man, linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Ellerman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML, linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox, Michal Hocko, John Hubbard On Sun, Dec 03, 2017 at 06:14:11PM -0800, john.hubbard-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote: > From: John Hubbard <jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> > > Previously, MAP_FIXED was "discouraged", due to portability > issues with the fixed address. In fact, there are other, more > serious issues. Also, in some limited cases, this option can > be used safely. > > Expand the documentation to discuss both the hazards, and how > to use it safely. > > The "Portability issues" wording is lifted directly from > Matthew Wilcox's review. The notes about other libraries > creating mappings is also from Matthew (lightly edited). > > The suggestion to explain how to use MAP_FIXED safely is > from Jann Horn. > > Suggested-by: Matthew Wilcox <willy-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> > Suggested-by: Jann Horn <jannh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > Signed-off-by: John Hubbard <jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> > --- > > Changed from v1: > > -- Covered topics recommended by Matthew Wilcox > and Jann Horn, in their recent review: the hazards > of overwriting pre-exising mappings, and some notes > about how to use MAP_FIXED safely. > > -- Rewrote the commit description accordingly. > > man2/mmap.2 | 38 ++++++++++++++++++++++++++++++++++++-- > 1 file changed, 36 insertions(+), 2 deletions(-) > > diff --git a/man2/mmap.2 b/man2/mmap.2 > index 385f3bfd5..9038256d4 100644 > --- a/man2/mmap.2 > +++ b/man2/mmap.2 > @@ -222,8 +222,42 @@ part of the existing mapping(s) will be discarded. > If the specified address cannot be used, > .BR mmap () > will fail. > -Because requiring a fixed address for a mapping is less portable, > -the use of this option is discouraged. > +.IP > +This option is extremely hazardous (when used on its own) and moderately > +non-portable. > +.IP > +Portability issues: a process's memory map may change significantly from one > +run to the next, depending on library versions, kernel versions and random > +numbers. > +.IP > +Hazards: this option forcibly removes pre-existing mappings, making it easy > +for a multi-threaded process to corrupt its own address space. > +.IP > +For example, thread A looks through /proc/<pid>/maps and locates an available > +address range, while thread B simultaneously acquires part or all of that same > +address range. Thread A then calls mmap(MAP_FIXED), effectively overwriting > +thread B's mapping. > +.IP > +Thread B need not create a mapping directly; simply making a library call > +that, internally, uses dlopen(3) to load some other shared library, will > +suffice. The dlopen(3) call will map the library into the process's address > +space. Furthermore, almost any library call may be implemented using this > +technique. > +Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries > +(http://www.linux-pam.org). > +.IP > +Given the above limitations, one of the very few ways to use this option > +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that > +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the > +portability problem (because the first mmap call lets the kernel pick the > +address), and the address space corruption problem (because the region being > +overwritten is already owned by the calling thread). Maybe "address space corruption problem caused by implicit calls to mmap"? The region allocated with the first mmap is not exactly owned by the thread and a multi-thread application can still corrupt its memory if different threads use mmap(MAP_FIXED) for overlapping regions. My 2 cents. > +.IP > +Newer kernels > +(Linux 4.16 and later) have a > +.B MAP_FIXED_SAFE > +option that avoids the corruption problem; if available, MAP_FIXED_SAFE > +should be preferred over MAP_FIXED. > .TP > .B MAP_GROWSDOWN > This flag is used for stacks. > -- > 2.15.1 > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo-Bw31MaZKKs0EbZ0PF+XxCw@public.gmane.org For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org"> email-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org </a> > -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-04 11:31 ` Mike Rapoport @ 2017-12-05 2:52 ` John Hubbard [not found] ` <6777116d-ad9e-48c9-0009-01d10274135e-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 16+ messages in thread From: John Hubbard @ 2017-12-05 2:52 UTC (permalink / raw) To: Mike Rapoport Cc: Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox, Michal Hocko On 12/04/2017 03:31 AM, Mike Rapoport wrote: > On Sun, Dec 03, 2017 at 06:14:11PM -0800, john.hubbard@gmail.com wrote: >> From: John Hubbard <jhubbard@nvidia.com> >> [...] >> +.IP >> +Given the above limitations, one of the very few ways to use this option >> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that >> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the >> +portability problem (because the first mmap call lets the kernel pick the >> +address), and the address space corruption problem (because the region being >> +overwritten is already owned by the calling thread). > > Maybe "address space corruption problem caused by implicit calls to mmap"? > The region allocated with the first mmap is not exactly owned by the > thread and a multi-thread application can still corrupt its memory if > different threads use mmap(MAP_FIXED) for overlapping regions. > > My 2 cents. > Hi Mike, Yes, thanks for picking through this, and I agree that the above is misleading. It should definitely not use the word "owned" at all. Re-doing the whole paragraph in order to make it all fit together nicely, I get this: "Given the above limitations, one of the very few ways to use this option safely is: mmap() an enclosing region, without specifying MAP_FIXED. Then, within that region, call mmap(MAP_FIXED) to suballocate regions within the enclosing region. This avoids both the portability problem (because the first mmap call lets the kernel pick the address), and the address space corruption problem (because implicit calls to mmap will not affect the already-mapped enclosing region)." ...how's that sound to you? I'll post a v3 soon with this. thanks, John Hubbard NVIDIA ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <6777116d-ad9e-48c9-0009-01d10274135e-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation [not found] ` <6777116d-ad9e-48c9-0009-01d10274135e-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> @ 2017-12-05 7:08 ` Michal Hocko 2017-12-05 7:43 ` John Hubbard 0 siblings, 1 reply; 16+ messages in thread From: Michal Hocko @ 2017-12-05 7:08 UTC (permalink / raw) To: John Hubbard Cc: Mike Rapoport, Michael Kerrisk, linux-man, linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Ellerman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML, linux-arch-u79uwXL29TY76Z2rM5mHXA, Jann Horn, Matthew Wilcox On Mon 04-12-17 18:52:27, John Hubbard wrote: > On 12/04/2017 03:31 AM, Mike Rapoport wrote: > > On Sun, Dec 03, 2017 at 06:14:11PM -0800, john.hubbard-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote: > >> From: John Hubbard <jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> > >> > [...] > >> +.IP > >> +Given the above limitations, one of the very few ways to use this option > >> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that > >> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the > >> +portability problem (because the first mmap call lets the kernel pick the > >> +address), and the address space corruption problem (because the region being > >> +overwritten is already owned by the calling thread). > > > > Maybe "address space corruption problem caused by implicit calls to mmap"? > > The region allocated with the first mmap is not exactly owned by the > > thread and a multi-thread application can still corrupt its memory if > > different threads use mmap(MAP_FIXED) for overlapping regions. > > > > My 2 cents. > > > > Hi Mike, > > Yes, thanks for picking through this, and I agree that the above is misleading. > It should definitely not use the word "owned" at all. Re-doing the whole > paragraph in order to make it all fit together nicely, I get this: > > "Given the above limitations, one of the very few ways to use this option > safely is: mmap() an enclosing region, without specifying MAP_FIXED. > Then, within that region, call mmap(MAP_FIXED) to suballocate regions > within the enclosing region. This avoids both the portability problem > (because the first mmap call lets the kernel pick the address), and the > address space corruption problem (because implicit calls to mmap will > not affect the already-mapped enclosing region)." > > ...how's that sound to you? I'll post a v3 soon with this. It sounds to me you are trying to tell way to much while actually being a bit misleading. Even sub-range MAP_FIXED is not multi-thread safe. Really the more corner cases you will try to cover the worse the end result will end up. I would just try to be simple here and mention the address space corruption issues you've had earlier and be done with it. Maybe add a note that some architectures might need a special alignement and fail if it is not the case but nothing really specific. -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-man" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] mmap.2: MAP_FIXED updated documentation 2017-12-05 7:08 ` Michal Hocko @ 2017-12-05 7:43 ` John Hubbard 0 siblings, 0 replies; 16+ messages in thread From: John Hubbard @ 2017-12-05 7:43 UTC (permalink / raw) To: Michal Hocko Cc: Mike Rapoport, Michael Kerrisk, linux-man, linux-api, Michael Ellerman, linux-mm, LKML, linux-arch, Jann Horn, Matthew Wilcox On 12/04/2017 11:08 PM, Michal Hocko wrote: > On Mon 04-12-17 18:52:27, John Hubbard wrote: >> On 12/04/2017 03:31 AM, Mike Rapoport wrote: >>> On Sun, Dec 03, 2017 at 06:14:11PM -0800, john.hubbard@gmail.com wrote: >>>> From: John Hubbard <jhubbard@nvidia.com> >>>> >> [...] >>>> +.IP >>>> +Given the above limitations, one of the very few ways to use this option >>>> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that >>>> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the >>>> +portability problem (because the first mmap call lets the kernel pick the >>>> +address), and the address space corruption problem (because the region being >>>> +overwritten is already owned by the calling thread). >>> >>> Maybe "address space corruption problem caused by implicit calls to mmap"? >>> The region allocated with the first mmap is not exactly owned by the >>> thread and a multi-thread application can still corrupt its memory if >>> different threads use mmap(MAP_FIXED) for overlapping regions. >>> >>> My 2 cents. >>> >> >> Hi Mike, >> >> Yes, thanks for picking through this, and I agree that the above is misleading. >> It should definitely not use the word "owned" at all. Re-doing the whole >> paragraph in order to make it all fit together nicely, I get this: >> >> "Given the above limitations, one of the very few ways to use this option >> safely is: mmap() an enclosing region, without specifying MAP_FIXED. >> Then, within that region, call mmap(MAP_FIXED) to suballocate regions >> within the enclosing region. This avoids both the portability problem >> (because the first mmap call lets the kernel pick the address), and the >> address space corruption problem (because implicit calls to mmap will >> not affect the already-mapped enclosing region)." >> >> ...how's that sound to you? I'll post a v3 soon with this. > > It sounds to me you are trying to tell way to much while actually being > a bit misleading. Even sub-range MAP_FIXED is not multi-thread safe. > > Really the more corner cases you will try to cover the worse the end > result will end up. I would just try to be simple here and mention the > address space corruption issues you've had earlier and be done with it. > Maybe add a note that some architectures might need a special alignement > and fail if it is not the case but nothing really specific. > Sure, I can drop the "how to use this safely" section. It seemed like a good idea at the time... :) thanks, John Hubbard NVIDIA -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2017-12-10 7:44 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-12-04 2:14 [PATCH v2] mmap.2: MAP_FIXED updated documentation john.hubbard [not found] ` <20171204021411.4786-1-jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 2017-12-04 10:55 ` Cyril Hrubis 2017-12-05 2:14 ` John Hubbard [not found] ` <efb6eae4-7f30-42c3-0efe-0ab5fbf0fdb4-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 2017-12-05 7:05 ` Michal Hocko 2017-12-05 7:42 ` John Hubbard 2017-12-05 8:52 ` Michal Hocko [not found] ` <2cff594a-b481-269d-dd91-ff2cc2f4100a-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 2017-12-06 10:01 ` Cyril Hrubis 2017-12-06 21:21 ` John Hubbard 2017-12-07 12:58 ` Cyril Hrubis 2017-12-07 14:02 ` Michal Hocko 2017-12-09 17:19 ` Pavel Machek 2017-12-10 7:44 ` John Hubbard 2017-12-04 11:31 ` Mike Rapoport 2017-12-05 2:52 ` John Hubbard [not found] ` <6777116d-ad9e-48c9-0009-01d10274135e-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> 2017-12-05 7:08 ` Michal Hocko 2017-12-05 7:43 ` John Hubbard
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).