* [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
From: Michal Hocko @ 2017-11-29 14:42 UTC
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Abdul Haleem,
	Joel Stanley, Kees Cook, Michal Hocko

Hi,
I am resending with the RFC tag dropped and asking for inclusion. There
haven't been any fundamental objections to the RFC [1]. I have also prepared
a man page patch which is 0/3 of this series.

This started as a follow-up to the discussion [2][3] of a runtime
failure caused by the hardening patch [4], which removes MAP_FIXED
from the elf loader because MAP_FIXED is inherently dangerous: it
might silently clobber an existing underlying mapping (e.g. the stack).
The reason for the failure is that some architectures enforce an
alignment on the given address hint when MAP_FIXED is not used (e.g. for
shared or file backed mappings).

One way around this would be excluding those archs which do alignment
tricks from the hardening [5]. The patch is really trivial but it has
been objected, rightfully so, that this screams for a more generic
solution. We basically want a non-destructive MAP_FIXED.

The first patch introduces MAP_FIXED_SAFE, which enforces the given
address but, unlike MAP_FIXED, fails with EEXIST if the given range
conflicts with an existing one. The flag is introduced as a completely
new one rather than a MAP_FIXED extension for backward compatibility
reasons. We really want a never-clobber semantic even on older
kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
flags evaluation because we do not EINVAL on unknown flags. On those
kernels we would simply use the traditional hint based semantic, so the
caller can still get a different address (which sucks) but at least does
not silently corrupt an existing mapping. I do not see a good way around
that, except for not exposing the new semantic to userspace at all.
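
A caller that has to cope with kernels which silently ignore the new
flag could verify the returned address itself. The following is only a
minimal userspace sketch, not code from this series: map_at_or_fail() is
a hypothetical helper, and the flag value is passed in by the caller
because the numeric value of MAP_FIXED_SAFE differs between
architectures in patch 1.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory exactly at addr.
 * safe_flag is the value of the proposed MAP_FIXED_SAFE on this
 * architecture, or 0 when the headers do not provide it.
 *
 * Returns addr on success and NULL otherwise. MAP_FIXED is never passed,
 * so an existing mapping is never replaced.
 */
static void *map_at_or_fail(void *addr, size_t len, int safe_flag)
{
	void *p;

	assert(addr != NULL && len > 0);

	p = mmap(addr, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | safe_flag, -1, 0);
	if (p == MAP_FAILED)
		return NULL;	/* a kernel that knows the flag fails here */

	if (p != addr) {
		/* A kernel that ignored the flag treated addr as a hint. */
		munmap(p, len);
		return NULL;
	}
	return p;
}
```

With safe_flag set to 0 this degrades to the hint-plus-check pattern
that callers have to use today, so the same call site works on both old
and new kernels.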

It seems there are users who would like to have something like that.
Jemalloc has been mentioned by Michael Ellerman [6].

Florian Weimer has mentioned the following:
: glibc ld.so currently maps DSOs without hints.  This means that the kernel
: will map right next to each other, and the offsets between them are completely
: predictable.  We would like to change that and supply a random address in a
: window of the address space.  If there is a conflict, we do not want the
: kernel to pick a non-random address. Instead, we would try again with a
: random address.
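
The retry scheme quoted above can be sketched in userspace as follows.
This is an illustration only, not glibc code: map_random_in_window(),
the window bounds, the retry count and the use of rand() are all
assumptions made for the example. The comparison of the returned
address against the requested one stands in for the failure that
MAP_FIXED_SAFE would report directly.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: try up to `tries` random page-aligned addresses in
 * [lo, hi) and return the first one where len bytes could be mapped
 * exactly. rand() keeps the sketch short; a real loader would want a
 * better entropy source.
 */
static void *map_random_in_window(uintptr_t lo, uintptr_t hi, size_t len,
				  int tries)
{
	uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
	uintptr_t slots;

	assert(len > 0 && lo % page == 0 && hi > lo && hi - lo >= len);
	slots = (hi - lo - len) / page + 1;	/* candidate start addresses */

	while (tries-- > 0) {
		uintptr_t addr = lo + ((uintptr_t)rand() % slots) * page;
		void *p = mmap((void *)addr, len, PROT_NONE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			continue;
		if ((uintptr_t)p == addr)
			return p;	/* got exactly the random address */
		munmap(p, len);		/* kernel picked another spot: retry */
	}
	return NULL;
}
```

The munmap() in the loop is exactly the extra step that a
fail-on-conflict flag would make unnecessary.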

John Hubbard has mentioned a CUDA example:
: a) Searches /proc/<pid>/maps for a "suitable" region of available
: VA space.  "Suitable" generally means it has to have a base address
: within a certain limited range (a particular device model might
: have odd limitations, for example), it has to be large enough, and
: alignment has to be large enough (again, various devices may have
: constraints that lead us to do this).
: 
: This is of course subject to races with other threads in the process.
: 
: Let's say it finds a region starting at va.
: 
: b) Next it does: 
:     p = mmap(va, ...) 
: 
: *without* setting MAP_FIXED, of course (so va is just a hint), to
: attempt to safely reserve that region. If p != va, then in most cases,
: this is a failure (almost certainly due to another thread getting a
: mapping from that region before we did), and so this layer now has to
: call munmap(), before returning a "failure: retry" to upper layers.
: 
:     IMPROVEMENT: --> if instead, we could call this:
: 
:             p = mmap(va, ... MAP_FIXED_SAFE ...)
: 
:         , then we could skip the munmap() call upon failure. This
:         is a small thing, but it is useful here. (Thanks to Piotr
:         Jaroszynski and Mark Hairgrove for helping me get that detail
:         exactly right, btw.)
: 
: c) After that, CUDA suballocates from p, via: 
:  
:      q = mmap(sub_region_start, ... MAP_FIXED ...)
: 
: Interestingly enough, "freeing" is also done via MAP_FIXED, and
: setting PROT_NONE to the subregion. Anyway, I just included (c) for
: general interest.
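
Step (c) and the "freeing" trick described above can be sketched like
this. It is a simplified illustration, not CUDA code: the helper names
are hypothetical, the reservation lets the kernel choose the address
instead of doing the search from steps (a) and (b), and MAP_NORESERVE
is an assumption about how such a reservation might be set up.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve address space only: PROT_NONE, address chosen by the kernel. */
static void *reserve_region(size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_NONE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/*
 * Carve usable memory out of the reservation. MAP_FIXED is intended
 * here: the caller owns the range and wants the PROT_NONE placeholder
 * replaced.
 */
static void *suballoc(void *sub, size_t len)
{
	void *q = mmap(sub, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

	return q == MAP_FAILED ? NULL : q;
}

/*
 * "Free" a subregion by mapping PROT_NONE over it again. The pages are
 * dropped but the addresses stay reserved for the owner.
 */
static int subfree(void *sub, size_t len)
{
	void *q = mmap(sub, len, PROT_NONE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
		       -1, 0);

	return q == MAP_FAILED ? -1 : 0;
}
```

This is the kind of MAP_FIXED usage that remains legitimate: the range
being replaced is known to belong to the caller.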

Atomic address range probing in multithreaded programs sounds like an
interesting thing to me in general.

The second patch simply replaces the MAP_FIXED use in the elf loader with
MAP_FIXED_SAFE. I believe other places which rely on MAP_FIXED should
follow. Actually, real MAP_FIXED usages should be documented properly
and they should be more of an exception.

Does anybody see any fundamental reasons why this is a wrong approach?

Diffstat says
 arch/alpha/include/uapi/asm/mman.h   |  2 ++
 arch/metag/kernel/process.c          |  6 +++++-
 arch/mips/include/uapi/asm/mman.h    |  2 ++
 arch/parisc/include/uapi/asm/mman.h  |  2 ++
 arch/powerpc/include/uapi/asm/mman.h |  1 +
 arch/sparc/include/uapi/asm/mman.h   |  1 +
 arch/tile/include/uapi/asm/mman.h    |  1 +
 arch/xtensa/include/uapi/asm/mman.h  |  2 ++
 fs/binfmt_elf.c                      | 12 ++++++++----
 include/uapi/asm-generic/mman.h      |  1 +
 mm/mmap.c                            | 11 +++++++++++
 11 files changed, 36 insertions(+), 5 deletions(-)

[1] http://lkml.kernel.org/r/20171116101900.13621-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20171107162217.382cd754@canb.auug.org.au
[3] http://lkml.kernel.org/r/1510048229.12079.7.camel@abdul.in.ibm.com
[4] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko@kernel.org
[5] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y@dhcp22.suse.cz
[6] http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au

* [PATCH 1/2] mm: introduce MAP_FIXED_SAFE
From: Michal Hocko @ 2017-11-29 14:42 UTC
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

MAP_FIXED is used quite often to enforce mapping at a particular
range. The main problem of this flag is, however, that it is inherently
dangerous because it unmaps existing mappings covered by the requested
range. This can cause silent memory corruptions, some of them even with
serious security implications. While the current semantic might be
really desirable in many cases, there are others which would want to
enforce the given range but would rather see a failure than a silent
memory corruption on a clashing range. Please note that there is no
guarantee that a given range is obeyed by the mmap even when it is
free - e.g. arch specific code is allowed to apply an alignment.

Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
It has the same semantic as MAP_FIXED wrt. the given address request
with a single exception that it fails with EEXIST if the requested
address is already covered by an existing mapping. We still do rely on
get_unmapped_area to handle all the arch specific MAP_FIXED treatment and
check for a conflicting vma after it returns.

[fail on clashing range with EEXIST as per Florian Weimer]
[set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/alpha/include/uapi/asm/mman.h   |  2 ++
 arch/mips/include/uapi/asm/mman.h    |  2 ++
 arch/parisc/include/uapi/asm/mman.h  |  2 ++
 arch/powerpc/include/uapi/asm/mman.h |  1 +
 arch/sparc/include/uapi/asm/mman.h   |  1 +
 arch/tile/include/uapi/asm/mman.h    |  1 +
 arch/xtensa/include/uapi/asm/mman.h  |  2 ++
 include/uapi/asm-generic/mman.h      |  1 +
 mm/mmap.c                            | 11 +++++++++++
 9 files changed, 23 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 6bf730063e3f..ef3770262925 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -32,6 +32,8 @@
 #define MAP_STACK	0x80000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x100000	/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x200000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 #define MS_ASYNC	1		/* sync memory asynchronously */
 #define MS_SYNC		2		/* synchronous memory sync */
 #define MS_INVALIDATE	4		/* invalidate the caches */
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index 20c3df7a8fdd..f1e15890345c 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -50,6 +50,8 @@
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 /*
  * Flags for msync
  */
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index d1af0d74a188..daf0282ac417 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -26,6 +26,8 @@
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 #define MS_SYNC		1		/* synchronous memory sync */
 #define MS_ASYNC	2		/* sync memory asynchronously */
 #define MS_INVALIDATE	4		/* invalidate the caches */
diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
index e63bc37e33af..3ffd284e7160 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -29,5 +29,6 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x800000	/* MAP_FIXED which doesn't unmap underlying mapping */
 
 #endif /* _UAPI_ASM_POWERPC_MMAN_H */
diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h
index 715a2c927e79..0c282c09fae8 100644
--- a/arch/sparc/include/uapi/asm/mman.h
+++ b/arch/sparc/include/uapi/asm/mman.h
@@ -24,6 +24,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 
 #endif /* _UAPI__SPARC_MMAN_H__ */
diff --git a/arch/tile/include/uapi/asm/mman.h b/arch/tile/include/uapi/asm/mman.h
index 9b7add95926b..b212f5fd5345 100644
--- a/arch/tile/include/uapi/asm/mman.h
+++ b/arch/tile/include/uapi/asm/mman.h
@@ -30,6 +30,7 @@
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
 #define MAP_HUGETLB	0x4000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x8000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 
 /*
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index 2bfe590694fc..0daf199caa57 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -56,6 +56,7 @@
 #define MAP_NONBLOCK	0x20000		/* do not block on IO */
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
 #ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
 # define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
 					 * uninitialized */
@@ -63,6 +64,7 @@
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
 
+
 /*
  * Flags for msync
  */
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 2dffcbf705b3..56cde132a80a 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -13,6 +13,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 476e810cf100..e84339842bb8 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1342,6 +1342,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		if (!(file && path_noexec(&file->f_path)))
 			prot |= PROT_EXEC;
 
+	/* force arch specific MAP_FIXED handling in get_unmapped_area */
+	if (flags & MAP_FIXED_SAFE)
+		flags |= MAP_FIXED;
+
 	if (!(flags & MAP_FIXED))
 		addr = round_hint_to_min(addr);
 
@@ -1365,6 +1369,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (offset_in_page(addr))
 		return addr;
 
+	if (flags & MAP_FIXED_SAFE) {
+		struct vm_area_struct *vma = find_vma(mm, addr);
+
+		if (vma && vma->vm_start <= addr)
+			return -EEXIST;
+	}
+
 	if (prot == PROT_EXEC) {
 		pkey = execute_only_pkey(mm);
 		if (pkey < 0)
-- 
2.15.0
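
The conflict check added to do_mmap() above can be modelled in plain
userspace C to see which requests it rejects. This is only a sketch
under stated assumptions: struct range and both predicates are
hypothetical names, and find_range() merely imitates find_vma(), which
returns the first vma whose end lies above the given address. The
hunk as posted compares against addr alone; the second predicate, which
also looks at addr + len, is an assumption about what a whole-range
check could look like and is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open address range [start, end), standing in for a vma. */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Model of find_vma(): first range with end > addr, or NULL.
 * maps[] must be sorted by start and must not overlap.
 */
static const struct range *find_range(const struct range *maps, size_t n,
				      unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (maps[i].end > addr)
			return &maps[i];
	return NULL;
}

/* The check as posted: does addr itself fall inside a mapping? */
static bool clashes_at_addr(const struct range *maps, size_t n,
			    unsigned long addr)
{
	const struct range *r = find_range(maps, n, addr);

	return r && r->start <= addr;
}

/* Assumed variant: does any part of [addr, addr + len) overlap? */
static bool clashes_in_range(const struct range *maps, size_t n,
			     unsigned long addr, unsigned long len)
{
	const struct range *r;

	assert(len > 0);
	r = find_range(maps, n, addr);
	return r && r->start < addr + len;
}
```

With mappings at [0x1000, 0x2000) and [0x5000, 0x6000), a request for
0x2000 bytes at 0x4000 passes the first predicate but is caught by the
second, because only its tail overlaps the second mapping.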

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 1/2] mm: introduce MAP_FIXED_SAFE
@ 2017-11-29 14:42   ` Michal Hocko
  0 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-29 14:42 UTC (permalink / raw)
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

MAP_FIXED is used quite often to enforce mapping at the particular
range. The main problem of this flag is, however, that it is inherently
dangerous because it unmaps existing mappings covered by the requested
range. This can cause silent memory corruptions. Some of them even with
serious security implications. While the current semantic might be
really desiderable in many cases there are others which would want to
enforce the given range but rather see a failure than a silent memory
corruption on a clashing range. Please note that there is no guarantee
that a given range is obeyed by the mmap even when it is free - e.g.
arch specific code is allowed to apply an alignment.

Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
It has the same semantic as MAP_FIXED wrt. the given address request
with a single exception that it fails with EEXIST if the requested
address is already covered by an existing mapping. We still do rely on
get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
check for a conflicting vma after it returns.

[fail on clashing range with EEXIST as per Florian Weimer]
[set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/alpha/include/uapi/asm/mman.h   |  2 ++
 arch/mips/include/uapi/asm/mman.h    |  2 ++
 arch/parisc/include/uapi/asm/mman.h  |  2 ++
 arch/powerpc/include/uapi/asm/mman.h |  1 +
 arch/sparc/include/uapi/asm/mman.h   |  1 +
 arch/tile/include/uapi/asm/mman.h    |  1 +
 arch/xtensa/include/uapi/asm/mman.h  |  2 ++
 include/uapi/asm-generic/mman.h      |  1 +
 mm/mmap.c                            | 11 +++++++++++
 9 files changed, 23 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 6bf730063e3f..ef3770262925 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -32,6 +32,8 @@
 #define MAP_STACK	0x80000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x100000	/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x200000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 #define MS_ASYNC	1		/* sync memory asynchronously */
 #define MS_SYNC		2		/* synchronous memory sync */
 #define MS_INVALIDATE	4		/* invalidate the caches */
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index 20c3df7a8fdd..f1e15890345c 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -50,6 +50,8 @@
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 /*
  * Flags for msync
  */
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index d1af0d74a188..daf0282ac417 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -26,6 +26,8 @@
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 #define MS_SYNC		1		/* synchronous memory sync */
 #define MS_ASYNC	2		/* sync memory asynchronously */
 #define MS_INVALIDATE	4		/* invalidate the caches */
diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
index e63bc37e33af..3ffd284e7160 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -29,5 +29,6 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x800000	/* MAP_FIXED which doesn't unmap underlying mapping */
 
 #endif /* _UAPI_ASM_POWERPC_MMAN_H */
diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h
index 715a2c927e79..0c282c09fae8 100644
--- a/arch/sparc/include/uapi/asm/mman.h
+++ b/arch/sparc/include/uapi/asm/mman.h
@@ -24,6 +24,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 
 #endif /* _UAPI__SPARC_MMAN_H__ */
diff --git a/arch/tile/include/uapi/asm/mman.h b/arch/tile/include/uapi/asm/mman.h
index 9b7add95926b..b212f5fd5345 100644
--- a/arch/tile/include/uapi/asm/mman.h
+++ b/arch/tile/include/uapi/asm/mman.h
@@ -30,6 +30,7 @@
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
 #define MAP_HUGETLB	0x4000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x8000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 
 /*
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index 2bfe590694fc..0daf199caa57 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -56,6 +56,7 @@
 #define MAP_NONBLOCK	0x20000		/* do not block on IO */
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
 #ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
 # define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
 					 * uninitialized */
@@ -63,6 +64,7 @@
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
 
+
 /*
  * Flags for msync
  */
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 2dffcbf705b3..56cde132a80a 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -13,6 +13,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 476e810cf100..e84339842bb8 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1342,6 +1342,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		if (!(file && path_noexec(&file->f_path)))
 			prot |= PROT_EXEC;
 
+	/* force arch specific MAP_FIXED handling in get_unmapped_area */
+	if (flags & MAP_FIXED_SAFE)
+		flags |= MAP_FIXED;
+
 	if (!(flags & MAP_FIXED))
 		addr = round_hint_to_min(addr);
 
@@ -1365,6 +1369,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (offset_in_page(addr))
 		return addr;
 
+	if (flags & MAP_FIXED_SAFE) {
+		struct vm_area_struct *vma = find_vma(mm, addr);
+
+		if (vma && vma->vm_start <= addr)
+			return -EEXIST;
+	}
+
 	if (prot == PROT_EXEC) {
 		pkey = execute_only_pkey(mm);
 		if (pkey < 0)
-- 
2.15.0
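
Note (illustrative, not part of the patch): the conflict check added to
do_mmap() above compares only the start address against the VMA returned
by find_vma(). find_vma() returns the first VMA whose end lies above addr,
so a mapping that begins above addr but still inside the requested length
would not be caught by that test alone. Below is a minimal userspace model
of the check as posted next to a whole-range variant. struct range and all
helper names are hypothetical stand-ins for illustration, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: half-open range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Model of find_vma(): first range (array sorted by address, entries
 * non-overlapping) whose end lies above addr, or NULL if there is none.
 */
static const struct range *first_range_above(const struct range *r, size_t n,
					     unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].end > addr)
			return &r[i];
	return NULL;
}

/* The check as posted: only the start address is tested. */
static bool clashes_start_only(const struct range *r, size_t n,
			       unsigned long addr)
{
	const struct range *hit = first_range_above(r, n, addr);

	return hit && hit->start <= addr;
}

/* Whole-range variant: does anything intersect [addr, addr + len)? */
static bool clashes_whole_range(const struct range *r, size_t n,
				unsigned long addr, unsigned long len)
{
	const struct range *hit = first_range_above(r, n, addr);

	assert(len > 0);
	return hit && hit->start < addr + len;
}
```

With ranges [0x1000, 0x2000) and [0x5000, 0x6000), a request for 0x2000
bytes at 0x4000 passes the start-only test but is flagged by the
whole-range one, because the second range begins inside the request.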

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-11-29 14:42 ` Michal Hocko
  (?)
@ 2017-11-29 14:42   ` Michal Hocko
  -1 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-29 14:42 UTC (permalink / raw)
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Michal Hocko,
	Abdul Haleem, Joel Stanley, Kees Cook

From: Michal Hocko <mhocko@suse.com>

Both load_elf_interp and load_elf_binary rely on elf_map to map segments
at a controlled address and they use MAP_FIXED to enforce that. This is,
however, a dangerous thing prone to silent data corruption which can even
be exploitable. Let's take CVE-2017-1000253 as an example. At the time
(before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"))
ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far away from
the stack top on 32b (legacy) memory layout (only 1GB away). Therefore
we could end up mapping over the existing stack with some luck.

The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE moved much
further from the stack (eab09532d400 and later by c715b72c1ba4 ("mm:
revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and excessive
stack consumption early during execve fully stopped by da029c11e6b1
("exec: Limit arg stack to at most 75% of _STK_LIM"). So we should be
safe and any attack should be impractical. On the other hand, this is
just too subtle an assumption, so it can break quite easily and be hard
to spot.

I believe that the MAP_FIXED usage in load_elf_binary (et al.) is still
fundamentally dangerous. Moreover it shouldn't even be needed. We are
at the early process stage and so there shouldn't be unrelated mappings
(except for stack and loader) existing so mmap for a given address
should succeed even without MAP_FIXED. Something is terribly wrong if
this is not the case and we should rather fail than silently corrupt the
underlying mapping.
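
To illustrate the silent clobbering described above, here is a minimal
userspace sketch in C (not part of the patch). It assumes Linux with
anonymous private mappings; map_fixed_clobbers() is a hypothetical name.
It maps a page, writes to it, then maps the same address again with
MAP_FIXED and looks at what is left of the old contents.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if a MAP_FIXED request over a live mapping silently replaced
 * its contents, 0 if the old data survived, -1 on setup failure.
 */
static int map_fixed_clobbers(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	char *old, *fresh;
	int clobbered;

	if (page <= 0)
		return -1;
	len = (size_t)page;

	old = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (old == MAP_FAILED)
		return -1;
	old[0] = 'A';

	/* Same address with MAP_FIXED: whatever was mapped there is dropped. */
	fresh = mmap(old, len, PROT_READ | PROT_WRITE,
		     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (fresh == MAP_FAILED) {
		munmap(old, len);
		return -1;
	}

	/* On success MAP_FIXED returns exactly the requested address. */
	assert(fresh == old);

	/* Fresh anonymous memory reads as zero, so the 'A' is gone. */
	clobbered = (*(volatile char *)old == 0);
	munmap(fresh, len);
	return clobbered;
}
```

No error is reported at any point; the caller only notices if it goes
back and inspects the old data, which is the failure mode the loader
change is meant to rule out.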

Address this issue by changing MAP_FIXED to the newly added
MAP_FIXED_SAFE. This means that mmap will fail, without clobbering
anything, if an existing mapping clashes with the requested one.

Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/metag/kernel/process.c |  6 +++++-
 fs/binfmt_elf.c             | 12 ++++++++----
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/metag/kernel/process.c b/arch/metag/kernel/process.c
index 0909834c83a7..867c8d0a5fb4 100644
--- a/arch/metag/kernel/process.c
+++ b/arch/metag/kernel/process.c
@@ -399,7 +399,7 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
 	tcm_tag = tcm_lookup_tag(addr);
 
 	if (tcm_tag != TCM_INVALID_TAG)
-		type &= ~MAP_FIXED;
+		type &= ~(MAP_FIXED | MAP_FIXED_SAFE);
 
 	/*
 	* total_size is the size of the ELF (interpreter) image.
@@ -417,6 +417,10 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
+	if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
+				task_pid_nr(current), tsk->comm, (void*)addr);
+
 	if (!BAD_ADDR(map_addr) && tcm_tag != TCM_INVALID_TAG) {
 		struct tcm_allocation *tcm;
 		unsigned long tcm_addr;
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 73b01e474fdc..5916d45f64a7 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -372,6 +372,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
+	if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
+				task_pid_nr(current), current->comm, (void*)addr);
+
 	return(map_addr);
 }
 
@@ -569,7 +573,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
 				elf_prot |= PROT_EXEC;
 			vaddr = eppnt->p_vaddr;
 			if (interp_elf_ex->e_type == ET_EXEC || load_addr_set)
-				elf_type |= MAP_FIXED;
+				elf_type |= MAP_FIXED_SAFE;
 			else if (no_base && interp_elf_ex->e_type == ET_DYN)
 				load_addr = -vaddr;
 
@@ -930,7 +934,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 		 * the ET_DYN load_addr calculations, proceed normally.
 		 */
 		if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
-			elf_flags |= MAP_FIXED;
+			elf_flags |= MAP_FIXED_SAFE;
 		} else if (loc->elf_ex.e_type == ET_DYN) {
 			/*
 			 * This logic is run once for the first LOAD Program
@@ -966,7 +970,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 				load_bias = ELF_ET_DYN_BASE;
 				if (current->flags & PF_RANDOMIZE)
 					load_bias += arch_mmap_rnd();
-				elf_flags |= MAP_FIXED;
+				elf_flags |= MAP_FIXED_SAFE;
 			} else
 				load_bias = 0;
 
@@ -1223,7 +1227,7 @@ static int load_elf_library(struct file *file)
 			(eppnt->p_filesz +
 			 ELF_PAGEOFFSET(eppnt->p_vaddr)),
 			PROT_READ | PROT_WRITE | PROT_EXEC,
-			MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
+			MAP_FIXED_SAFE | MAP_PRIVATE | MAP_DENYWRITE,
 			(eppnt->p_offset -
 			 ELF_PAGEOFFSET(eppnt->p_vaddr)));
 	if (error != ELF_PAGESTART(eppnt->p_vaddr))
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH] mmap.2: document new MAP_FIXED_SAFE flag
  2017-11-29 14:42 ` Michal Hocko
  (?)
@ 2017-11-29 14:45   ` Michal Hocko
  -1 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-29 14:45 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

4.16+ kernels offer a new MAP_FIXED_SAFE flag which allows one to atomically
probe for a given address range.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 man2/mmap.2 | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/man2/mmap.2 b/man2/mmap.2
index 385f3bfd5393..622a7000de83 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -225,6 +225,18 @@ will fail.
 Because requiring a fixed address for a mapping is less portable,
 the use of this option is discouraged.
 .TP
+.B MAP_FIXED_SAFE (since 4.16)
+Similar to MAP_FIXED wrt. the
+.I
+addr
+enforcement, except that it never clobbers a colliding mapped range and instead fails with
+.B EEXIST
+in such a case. This flag can therefore be used as a safe and atomic probe for
+the specific address range. Please note that older kernels which do not recognize
+this flag can fall back to the hint based implementation and map to a different
+location. Any backward compatible software should therefore compare the returned
+address with the requested one.
+.TP
 .B MAP_GROWSDOWN
 This flag is used for stacks.
 It indicates to the kernel virtual memory system that the mapping
@@ -449,6 +461,12 @@ is not a valid file descriptor (and
 .B MAP_ANONYMOUS
 was not set).
 .TP
+.B EEXIST
+The range covered by
+.IR addr ,
+.I length
+clashes with an existing mapping.
+.TP
 .B EINVAL
 We don't like
 .IR addr ,
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-29 14:42 ` Michal Hocko
  (?)
@ 2017-11-29 15:13   ` Rasmus Villemoes
  -1 siblings, 0 replies; 130+ messages in thread
From: Rasmus Villemoes @ 2017-11-29 15:13 UTC (permalink / raw)
  To: Michal Hocko, linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Abdul Haleem,
	Joel Stanley, Kees Cook, Michal Hocko

On 2017-11-29 15:42, Michal Hocko wrote:
> 
> The first patch introduced MAP_FIXED_SAFE which enforces the given
> address but unlike MAP_FIXED it fails with ENOMEM if the given range
> conflicts with an existing one.

[s/ENOMEM/EEXIST/, as it seems you also did in the actual patch and
changelog]

>The flag is introduced as a completely
> new one rather than a MAP_FIXED extension because of the backward
> compatibility. We really want a never-clobber semantic even on older
> kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
> flags evaluation because we do not EINVAL on unknown flags. On those
> kernels we would simply use the traditional hint based semantic so the
> caller can still get a different address (which sucks) but at least not
> silently corrupt an existing mapping. I do not see a good way around
> that.

I think it would be nice if this rationale was in the 1/2 changelog,
along with the hint about what userspace that wants to be compatible
with old kernels will have to do (namely, check that it got what it
requested) - which I see you did put in the man page.

-- 
Rasmus Villemoes
Software Developer
Prevas A/S
Hedeager 3
DK-8200 Aarhus N
+45 51210274
rasmus.villemoes@prevas.dk
www.prevas.dk

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-11-29 15:50     ` Michal Hocko
  0 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-29 15:50 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Abdul Haleem,
	Joel Stanley, Kees Cook

On Wed 29-11-17 16:13:53, Rasmus Villemoes wrote:
> On 2017-11-29 15:42, Michal Hocko wrote:
[...]
> >The flag is introduced as a completely
> > new one rather than a MAP_FIXED extension because of the backward
> > compatibility. We really want a never-clobber semantic even on older
> > kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
> > flags evaluation because we do not EINVAL on unknown flags. On those
> > kernels we would simply use the traditional hint based semantic so the
> > caller can still get a different address (which sucks) but at least not
> > silently corrupt an existing mapping. I do not see a good way around
> > that.
> 
> I think it would be nice if this rationale was in the 1/2 changelog,
> along with the hint about what userspace that wants to be compatible
> with old kernels will have to do (namely, check that it got what it
> requested) - which I see you did put in the man page.

OK, I've added it there.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-11-29 14:42   ` Michal Hocko
@ 2017-11-29 17:45     ` Khalid Aziz
  -1 siblings, 0 replies; 130+ messages in thread
From: Khalid Aziz @ 2017-11-29 17:45 UTC (permalink / raw)
  To: Michal Hocko, linux-api
  Cc: Michael Ellerman, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, linux-mm, LKML, linux-arch, Florian Weimer,
	John Hubbard, Michal Hocko, Abdul Haleem, Joel Stanley,
	Kees Cook

On 11/29/2017 07:42 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Both load_elf_interp and load_elf_binary rely on elf_map to map segments
> on a controlled address and they use MAP_FIXED to enforce that. This is
> however dangerous thing prone to silent data corruption which can be
> even exploitable. Let's take CVE-2017-1000253 as an example. At the time
> (before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"))
> ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far away from
> the stack top on 32b (legacy) memory layout (only 1GB away). Therefore
> we could end up mapping over the existing stack with some luck.
> 
> The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
> fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE moved moved much
> further from the stack (eab09532d400 and later by c715b72c1ba4 ("mm:
> revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and excessive
> stack consumption early during execve fully stopped by da029c11e6b1
> ("exec: Limit arg stack to at most 75% of _STK_LIM"). So we should be
> safe and any attack should be impractical. On the other hand this is
> just too subtle assumption so it can break quite easily and hard to
> spot.
> 
> I believe that the MAP_FIXED usage in load_elf_binary (et. al) is still
> fundamentally dangerous. Moreover it shouldn't be even needed. We are
> at the early process stage and so there shouldn't be unrelated mappings
> (except for stack and loader) existing so mmap for a given address
> should succeed even without MAP_FIXED. Something is terribly wrong if
> this is not the case and we should rather fail than silently corrupt the
> underlying mapping.
> 
> Address this issue by changing MAP_FIXED to the newly added
> MAP_FIXED_SAFE. This will mean that mmap will fail if there is an
> existing mapping clashing with the requested one without clobbering it.
> 
> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Acked-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---

Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-29 14:42 ` Michal Hocko
@ 2017-11-29 22:12   ` Kees Cook
  -1 siblings, 0 replies; 130+ messages in thread
From: Kees Cook @ 2017-11-29 22:12 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Linux API, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, Linux-MM, LKML,
	linux-arch, Florian Weimer, John Hubbard, Abdul Haleem,
	Joel Stanley, Michal Hocko

On Wed, Nov 29, 2017 at 6:42 AM, Michal Hocko <mhocko@kernel.org> wrote:
> Except we won't export expose the new semantic to the userspace at all.

I'm confused: the changes in patch 1 are explicitly adding
MAP_FIXED_SAFE to the uapi. If it's not supposed to be exposed,
shouldn't it go somewhere else?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-29 15:13   ` Rasmus Villemoes
@ 2017-11-29 22:15     ` Kees Cook
  -1 siblings, 0 replies; 130+ messages in thread
From: Kees Cook @ 2017-11-29 22:15 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Michal Hocko, Linux API, Khalid Aziz, Michael Ellerman,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley, Michal Hocko

On Wed, Nov 29, 2017 at 7:13 AM, Rasmus Villemoes
<rasmus.villemoes@prevas.dk> wrote:
> On 2017-11-29 15:42, Michal Hocko wrote:
>>
>> The first patch introduced MAP_FIXED_SAFE which enforces the given
>> address but unlike MAP_FIXED it fails with ENOMEM if the given range
>> conflicts with an existing one.
>
> [s/ENOMEM/EEXIST/, as it seems you also did in the actual patch and
> changelog]
>
>>The flag is introduced as a completely
>> new one rather than a MAP_FIXED extension because of the backward
>> compatibility. We really want a never-clobber semantic even on older
>> kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
>> flags evaluation because we do not EINVAL on unknown flags. On those
>> kernels we would simply use the traditional hint based semantic so the
>> caller can still get a different address (which sucks) but at least not
>> silently corrupt an existing mapping. I do not see a good way around
>> that.
>
> I think it would be nice if this rationale was in the 1/2 changelog,
> along with the hint about what userspace that wants to be compatible
> with old kernels will have to do (namely, check that it got what it
> requested) - which I see you did put in the man page.

Okay, so ignore my other email, I must have misunderstood. It _is_,
quite intentionally, being exposed to userspace. Cool by me. :)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-29 14:42 ` Michal Hocko
@ 2017-11-29 22:25   ` Kees Cook
  -1 siblings, 0 replies; 130+ messages in thread
From: Kees Cook @ 2017-11-29 22:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Linux API, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, Linux-MM, LKML,
	linux-arch, Florian Weimer, John Hubbard, Abdul Haleem,
	Joel Stanley, Michal Hocko

On Wed, Nov 29, 2017 at 6:42 AM, Michal Hocko <mhocko@kernel.org> wrote:
> The first patch introduced MAP_FIXED_SAFE which enforces the given
> address but unlike MAP_FIXED it fails with ENOMEM if the given range
> conflicts with an existing one. The flag is introduced as a completely

I still think this name should be better. "SAFE" doesn't say what it's
safe from...

MAP_FIXED_UNIQUE
MAP_FIXED_ONCE
MAP_FIXED_FRESH

?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH] mmap.2: document new MAP_FIXED_SAFE flag
  2017-11-29 14:45   ` Michal Hocko
  (?)
@ 2017-11-30  3:16     ` John Hubbard
  -1 siblings, 0 replies; 130+ messages in thread
From: John Hubbard @ 2017-11-30  3:16 UTC (permalink / raw)
  To: Michal Hocko, Michael Kerrisk
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, Michal Hocko

On 11/29/2017 06:45 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> 4.16+ kernels offer a new MAP_FIXED_SAFE flag which allows to atomicaly

"allows the caller to atomically"

, if you care about polishing the commit message...see the real review,
below. :)

> probe for a given address range.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  man2/mmap.2 | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index 385f3bfd5393..622a7000de83 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -225,6 +225,18 @@ will fail.
>  Because requiring a fixed address for a mapping is less portable,
>  the use of this option is discouraged.
>  .TP
> +.B MAP_FIXED_SAFE (since 4.16)
> +Similar to MAP_FIXED wrt. to the
> +.I
> +addr
> +enforcement except it never clobbers a colliding mapped range and rather fail with
> +.B EEXIST
> +in such a case. This flag can therefore be used as a safe and atomic probe for the
> +the specific address range. Please note that older kernels which do not recognize
> +this flag can fallback to the hint based implementation and map to a different
> +location. Any backward compatible software should therefore check the returning
> +address with the given one.
> +.TP
>  .B MAP_GROWSDOWN
>  This flag is used for stacks.
>  It indicates to the kernel virtual memory system that the mapping

Hi Michal,

I've taken the liberty of mostly rewriting this part, in order to more closely 
match the existing paragraphs; to fix minor typos; and to attempt to slightly
clarify the paragraph.

+.BR MAP_FIXED_SAFE " (since Linux 4.16)"
+Similar to MAP_FIXED with respect to the
+.I
+addr
+enforcement, but different in that MAP_FIXED_SAFE never clobbers a pre-existing
+mapped range. If the requested range would collide with an existing
+mapping, then this call fails with
+.B EEXIST.
+This flag can therefore be used as a way to atomically (with respect to other
+threads) attempt to map an address range: one thread will succeed; all others
+will report failure. Please note that older kernels which do not recognize this
+flag will typically (upon detecting a collision with a pre-existing mapping)
+fall back to a "non-MAP_FIXED" type of behavior: they will return an address that
+is different than the requested one. Therefore, backward-compatible software
+should check the returned address against the requested address.
+.TP

(I'm ignoring the naming, because there is another thread about that,
so please just read the above as "MAP_FIXED_whatever-is-chosen".)
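
For illustration, the "check the returned address" advice in the paragraph
above could look roughly like this in userspace. This is only a sketch under
stated assumptions: map_exactly_at() and PROBE_FLAG are made-up names, and
because MAP_FIXED_SAFE is not defined by any released header, the code uses
MAP_FIXED_NOREPLACE (the name under which this flag was later merged, in
Linux 4.17) when the headers provide it, and otherwise passes no extra flag,
which gives the plain hint behavior that an older kernel would apply anyway.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Assumption: the flag from this series is spelled MAP_FIXED_NOREPLACE in
 * the installed headers. If it is missing, fall back to hint-only mmap().
 */
#ifdef MAP_FIXED_NOREPLACE
#define PROBE_FLAG MAP_FIXED_NOREPLACE
#else
#define PROBE_FLAG 0
#endif

/*
 * Hypothetical helper: map len bytes of anonymous memory exactly at addr,
 * or fail without touching whatever is already mapped there.
 */
static void *map_exactly_at(void *addr, size_t len)
{
	void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | PROBE_FLAG, -1, 0);

	if (p == MAP_FAILED)
		return MAP_FAILED;	/* e.g. EEXIST when the kernel knows the flag */

	if (p != addr) {
		/*
		 * The flag was not honored and addr was treated as a hint.
		 * Nothing was clobbered, but the mapping is in the wrong
		 * place, so undo it and report the collision ourselves.
		 */
		munmap(p, len);
		errno = EEXIST;
		return MAP_FAILED;
	}
	return p;
}
```

A caller would treat MAP_FAILED from map_exactly_at() the same way on old and
new kernels; the only difference is whether the kernel or the helper detected
the collision.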

> @@ -449,6 +461,12 @@ is not a valid file descriptor (and
>  .B MAP_ANONYMOUS
>  was not set).
>  .TP
> +.B EEXIST
> +range covered by
> +.IR addr , 

nit: trailing space on the above line.

> +.IR length
> +is clashing with an existing mapping.
> +.TP
>  .B EINVAL
>  We don't like
>  .IR addr ,
> 

One other thing: reading through mmap.2, I now want to add this as well:

diff --git a/man2/mmap.2 b/man2/mmap.2
index 622a7000d..780cad6d9 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -222,20 +222,25 @@ part of the existing mapping(s) will be discarded.
 If the specified address cannot be used,
 .BR mmap ()
 will fail.
-Because requiring a fixed address for a mapping is less portable,
-the use of this option is discouraged.
+Software that aspires to be as portable as possible should use this option with
+care, keeping in mind that different kernels and C libraries may set up quite
+different mapping ranges.


...because that advice is just wrong (it presumes that "less portable" ==
"must be discouraged").

Should I send out a separate patch for that, or is it better to glom it together 
with this one?

thanks,
John Hubbard
NVIDIA

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* Re: [PATCH] mmap.2: document new MAP_FIXED_SAFE flag
@ 2017-11-30  3:16     ` John Hubbard
  0 siblings, 0 replies; 130+ messages in thread
From: John Hubbard @ 2017-11-30  3:16 UTC (permalink / raw)
  To: Michal Hocko, Michael Kerrisk
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, Michal Hocko

On 11/29/2017 06:45 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> 4.16+ kernels offer a new MAP_FIXED_SAFE flag which allows to atomicaly

"allows the caller to atomically"

, if you care about polishing the commit message...see the real review,
below. :)

> probe for a given address range.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  man2/mmap.2 | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index 385f3bfd5393..622a7000de83 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -225,6 +225,18 @@ will fail.
>  Because requiring a fixed address for a mapping is less portable,
>  the use of this option is discouraged.
>  .TP
> +.B MAP_FIXED_SAFE (since 4.16)
> +Similar to MAP_FIXED wrt. to the
> +.I
> +addr
> +enforcement except it never clobbers a colliding mapped range and rather fail with
> +.B EEXIST
> +in such a case. This flag can therefore be used as a safe and atomic probe for the
> +the specific address range. Please note that older kernels which do not recognize
> +this flag can fallback to the hint based implementation and map to a different
> +location. Any backward compatible software should therefore check the returning
> +address with the given one.
> +.TP
>  .B MAP_GROWSDOWN
>  This flag is used for stacks.
>  It indicates to the kernel virtual memory system that the mapping

Hi Michal,

I've taken the liberty of mostly rewriting this part, in order to more closely 
match the existing paragraphs; to fix minor typos; and to attempt to slightly
clarify the paragraph.

+.BR MAP_FIXED_SAFE " (since Linux 4.16)"
+Similar to MAP_FIXED with respect to the
+.I
+addr
+enforcement, but different in that MAP_FIXED_SAFE never clobbers a pre-existing
+mapped range. If the requested range would collide with an existing
+mapping, then this call fails with
+.B EEXIST.
+This flag can therefore be used as a way to atomically (with respect to other
+threads) attempt to map an address range: one thread will succeed; all others
+will report failure. Please note that older kernels which do not recognize this
+flag will typically (upon detecting a collision with a pre-existing mapping)
+fall back a "non-MAP_FIXED" type of behavior: they will return an address that
+is different than the requested one. Therefore, backward-compatible software
+should check the returned address against the requested address.
+.TP

(I'm ignoring the naming, because there is another thread about that,
so please just the above as "MAP_FIXED_whatever-is-chosen".)

> @@ -449,6 +461,12 @@ is not a valid file descriptor (and
>  .B MAP_ANONYMOUS
>  was not set).
>  .TP
> +.B EEXIST
> +range covered by
> +.IR addr , 

nit: trailing space on the above line.

> +.IR length
> +is clashing with an existing mapping.
> +.TP
>  .B EINVAL
>  We don't like
>  .IR addr ,
> 

One other thing: reading through mmap.2, I now want to add this as well:

diff --git a/man2/mmap.2 b/man2/mmap.2
index 622a7000d..780cad6d9 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -222,20 +222,25 @@ part of the existing mapping(s) will be discarded.
 If the specified address cannot be used,
 .BR mmap ()
 will fail.
-Because requiring a fixed address for a mapping is less portable,
-the use of this option is discouraged.
+Software that aspires to be as portable as possible should use this option with
+care, keeping in mind that different kernels and C libraries may set up quite
+different mapping ranges.


...because that advice is just wrong (it presumes that "less portable" ==
"must be discouraged").
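
For reference, the behavior named in the hunk's context line ("part of the
existing mapping(s) will be discarded") can be seen with a small C sketch.
map_fixed_clobbers() is a hypothetical demo function, not an existing API; it
only illustrates plain MAP_FIXED on Linux with anonymous memory.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 1 if a MAP_FIXED mapping placed on top of an
 * existing mapping silently replaced it, 0 if it did not, and -1 if the
 * setup itself failed.
 */
static int map_fixed_clobbers(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    /* First mapping: let the kernel choose the address, then dirty it. */
    char *first = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (first == MAP_FAILED)
        return -1;
    first[0] = 'A';

    /* Second mapping: force the same address with MAP_FIXED. */
    char *second = mmap(first, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (second == MAP_FAILED) {
        munmap(first, len);
        return -1;
    }

    /* The old page is gone: same address, but fresh zero-filled memory. */
    int clobbered = (second == first) && (first[0] == 0);

    munmap(second, len);
    return clobbered;
}
```

No error is reported to the caller in this case, which is the silent clobbering
that a non-destructive variant is meant to avoid.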

Should I send out a separate patch for that, or is it better to glom it together 
with this one?

thanks,
John Hubbard
NVIDIA

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-11-30  6:58     ` Michal Hocko
  0 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-30  6:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linux API, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, Linux-MM, LKML,
	linux-arch, Florian Weimer, John Hubbard, Abdul Haleem,
	Joel Stanley

On Wed 29-11-17 14:25:36, Kees Cook wrote:
> On Wed, Nov 29, 2017 at 6:42 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > The first patch introduced MAP_FIXED_SAFE which enforces the given
> > address but unlike MAP_FIXED it fails with ENOMEM if the given range
> > conflicts with an existing one. The flag is introduced as a completely
> 
> I still think this name should be better. "SAFE" doesn't say what it's
> safe from...

It is safe in the sense that it doesn't perform any dangerous address space
operations. mmap is _inherently_ about the address space, so the context
should be kind of clear.

> MAP_FIXED_UNIQUE
> MAP_FIXED_ONCE
> MAP_FIXED_FRESH

Well, I can open a poll for the best name, but none of those you are
proposing sound much better to me. Yeah, naming sucks...
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH] mmap.2: document new MAP_FIXED_SAFE flag
  2017-11-30  3:16     ` John Hubbard
@ 2017-11-30  8:23       ` Michal Hocko
  -1 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-30  8:23 UTC (permalink / raw)
  To: John Hubbard
  Cc: Michael Kerrisk, linux-api, Khalid Aziz, Michael Ellerman,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	linux-mm, LKML, linux-arch, Florian Weimer

On Wed 29-11-17 19:16:39, John Hubbard wrote:
[...]
> Hi Michal,
> 
> I've taken the liberty of mostly rewriting this part, in order to more closely 
> match the existing paragraphs; to fix minor typos; and to attempt to slightly
> clarify the paragraph.
> 
> +.BR MAP_FIXED_SAFE " (since Linux 4.16)"
> +Similar to MAP_FIXED with respect to the
> +.I
> +addr
> +enforcement, but different in that MAP_FIXED_SAFE never clobbers a pre-existing
> +mapped range. If the requested range would collide with an existing
> +mapping, then this call fails with
> +.B EEXIST.
> +This flag can therefore be used as a way to atomically (with respect to other
> +threads) attempt to map an address range: one thread will succeed; all others
> +will report failure. Please note that older kernels which do not recognize this
> +flag will typically (upon detecting a collision with a pre-existing mapping)
> +fall back a "non-MAP_FIXED" type of behavior: they will return an address that
> +is different than the requested one. Therefore, backward-compatible software
> +should check the returned address against the requested address.
> +.TP

I have taken yours. Thanks a lot!

> (I'm ignoring the naming, because there is another thread about that,
> so please just read the above as "MAP_FIXED_whatever-is-chosen".)
> 
> > @@ -449,6 +461,12 @@ is not a valid file descriptor (and
> >  .B MAP_ANONYMOUS
> >  was not set).
> >  .TP
> > +.B EEXIST
> > +range covered by
> > +.IR addr , 
> 
> nit: trailing space on the above line.

fixed

> > +.IR length
> > +is clashing with an existing mapping.
> > +.TP
> >  .B EINVAL
> >  We don't like
> >  .IR addr ,
> > 
> 
> One other thing: reading through mmap.2, I now want to add this as well:
> 
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index 622a7000d..780cad6d9 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -222,20 +222,25 @@ part of the existing mapping(s) will be discarded.
>  If the specified address cannot be used,
>  .BR mmap ()
>  will fail.
> -Because requiring a fixed address for a mapping is less portable,
> -the use of this option is discouraged.
> +Software that aspires to be as portable as possible should use this option with
> +care, keeping in mind that different kernels and C libraries may set up quite
> +different mapping ranges.
> 
> 
> ...because that advice is just wrong (it presumes that "less portable" ==
> "must be discouraged").
> 
> Should I send out a separate patch for that, or is it better to glom it together 
> with this one?

yes please
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* [PATCH v2] mmap.2: document new MAP_FIXED_SAFE flag
  2017-11-29 14:45   ` Michal Hocko
  (?)
  (?)
@ 2017-11-30  8:24     ` Michal Hocko
  -1 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-30  8:24 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard

Updated version based on feedback from John.
---
From ade1eba229b558431581448e7d7838f0e1fe2c49 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Wed, 29 Nov 2017 15:32:08 +0100
Subject: [PATCH] mmap.2: document new MAP_FIXED_SAFE flag

4.16+ kernels offer a new MAP_FIXED_SAFE flag which allows the caller to
atomicaly probe for a given address range.

[wording heavily updated by John Hubbard <jhubbard@nvidia.com>]
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 man2/mmap.2 | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/man2/mmap.2 b/man2/mmap.2
index 385f3bfd5393..923bbb290875 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -225,6 +225,22 @@ will fail.
 Because requiring a fixed address for a mapping is less portable,
 the use of this option is discouraged.
 .TP
+.BR MAP_FIXED_SAFE " (since Linux 4.16)"
+Similar to MAP_FIXED with respect to the
+.I
+addr
+enforcement, but different in that MAP_FIXED_SAFE never clobbers a pre-existing
+mapped range. If the requested range would collide with an existing
+mapping, then this call fails with
+.B EEXIST.
+This flag can therefore be used as a way to atomically (with respect to other
+threads) attempt to map an address range: one thread will succeed; all others
+will report failure. Please note that older kernels which do not recognize this
+flag will typically (upon detecting a collision with a pre-existing mapping)
+fall back a "non-MAP_FIXED" type of behavior: they will return an address that
+is different than the requested one. Therefore, backward-compatible software
+should check the returned address against the requested address.
+.TP
 .B MAP_GROWSDOWN
 This flag is used for stacks.
 It indicates to the kernel virtual memory system that the mapping
@@ -449,6 +465,12 @@ is not a valid file descriptor (and
 .B MAP_ANONYMOUS
 was not set).
 .TP
+.B EEXIST
+range covered by
+.IR addr ,
+.IR length
+is clashing with an existing mapping.
+.TP
 .B EINVAL
 We don't like
 .IR addr ,
-- 
2.15.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* Re: [PATCH v2] mmap.2: document new MAP_FIXED_SAFE flag
  2017-11-30  8:24     ` Michal Hocko
  (?)
@ 2017-11-30 18:31       ` John Hubbard
  -1 siblings, 0 replies; 130+ messages in thread
From: John Hubbard @ 2017-11-30 18:31 UTC (permalink / raw)
  To: Michal Hocko, Michael Kerrisk
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer

On 11/30/2017 12:24 AM, Michal Hocko wrote:
> Updated version based on feedback from John.
> ---
> From ade1eba229b558431581448e7d7838f0e1fe2c49 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Wed, 29 Nov 2017 15:32:08 +0100
> Subject: [PATCH] mmap.2: document new MAP_FIXED_SAFE flag
> 
> 4.16+ kernels offer a new MAP_FIXED_SAFE flag which allows the caller to
> atomicaly probe for a given address range.
> 
> [wording heavily updated by John Hubbard <jhubbard@nvidia.com>]
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  man2/mmap.2 | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index 385f3bfd5393..923bbb290875 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -225,6 +225,22 @@ will fail.
>  Because requiring a fixed address for a mapping is less portable,
>  the use of this option is discouraged.
>  .TP
> +.BR MAP_FIXED_SAFE " (since Linux 4.16)"
> +Similar to MAP_FIXED with respect to the
> +.I
> +addr
> +enforcement, but different in that MAP_FIXED_SAFE never clobbers a pre-existing
> +mapped range. If the requested range would collide with an existing
> +mapping, then this call fails with
> +.B EEXIST.
> +This flag can therefore be used as a way to atomically (with respect to other
> +threads) attempt to map an address range: one thread will succeed; all others
> +will report failure. Please note that older kernels which do not recognize this
> +flag will typically (upon detecting a collision with a pre-existing mapping)
> +fall back a "non-MAP_FIXED" type of behavior: they will return an address that

...and now I've created my own typo: please make that "fall back to a"  (the 
"to" was missing).

Sorry about the churn. It turns out that the compiler doesn't catch these. :)

thanks,
John Hubbard


> +is different than the requested one. Therefore, backward-compatible software
> +should check the returned address against the requested address.
> +.TP
>  .B MAP_GROWSDOWN
>  This flag is used for stacks.
>  It indicates to the kernel virtual memory system that the mapping
> @@ -449,6 +465,12 @@ is not a valid file descriptor (and
>  .B MAP_ANONYMOUS
>  was not set).
>  .TP
> +.B EEXIST
> +range covered by
> +.IR addr ,
> +.IR length
> +is clashing with an existing mapping.
> +.TP
>  .B EINVAL
>  We don't like
>  .IR addr ,
> 

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v2] mmap.2: document new MAP_FIXED_SAFE flag
  2017-11-30 18:31       ` John Hubbard
@ 2017-11-30 18:39         ` Michal Hocko
  -1 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-30 18:39 UTC (permalink / raw)
  To: John Hubbard
  Cc: Michael Kerrisk, linux-api, Khalid Aziz, Michael Ellerman,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	linux-mm, LKML, linux-arch, Florian Weimer

On Thu 30-11-17 10:31:12, John Hubbard wrote:
> On 11/30/2017 12:24 AM, Michal Hocko wrote:
> > Updated version based on feedback from John.
> > ---
> > From ade1eba229b558431581448e7d7838f0e1fe2c49 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.com>
> > Date: Wed, 29 Nov 2017 15:32:08 +0100
> > Subject: [PATCH] mmap.2: document new MAP_FIXED_SAFE flag
> > 
> > 4.16+ kernels offer a new MAP_FIXED_SAFE flag which allows the caller to
> > atomically probe for a given address range.
> > 
> > [wording heavily updated by John Hubbard <jhubbard@nvidia.com>]
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  man2/mmap.2 | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/man2/mmap.2 b/man2/mmap.2
> > index 385f3bfd5393..923bbb290875 100644
> > --- a/man2/mmap.2
> > +++ b/man2/mmap.2
> > @@ -225,6 +225,22 @@ will fail.
> >  Because requiring a fixed address for a mapping is less portable,
> >  the use of this option is discouraged.
> >  .TP
> > +.BR MAP_FIXED_SAFE " (since Linux 4.16)"
> > +Similar to MAP_FIXED with respect to the
> > +.I
> > +addr
> > +enforcement, but different in that MAP_FIXED_SAFE never clobbers a pre-existing
> > +mapped range. If the requested range would collide with an existing
> > +mapping, then this call fails with
> > +.B EEXIST.
> > +This flag can therefore be used as a way to atomically (with respect to other
> > +threads) attempt to map an address range: one thread will succeed; all others
> > +will report failure. Please note that older kernels which do not recognize this
> > +flag will typically (upon detecting a collision with a pre-existing mapping)
> > +fall back a "non-MAP_FIXED" type of behavior: they will return an address that
> 
> ...and now I've created my own typo: please make that "fall back to a"  (the 
> "to" was missing).
> 
> Sorry about the churn. It turns out that the compiler doesn't catch these. :)

Fixed. I will resubmit once there has been more review feedback.

Thanks
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-30  6:58     ` Michal Hocko
@ 2017-12-01 15:26       ` Cyril Hrubis
  -1 siblings, 0 replies; 130+ messages in thread
From: Cyril Hrubis @ 2017-12-01 15:26 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Kees Cook, Linux API, Khalid Aziz, Michael Ellerman,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley

Hi!
> > MAP_FIXED_UNIQUE
> > MAP_FIXED_ONCE
> > MAP_FIXED_FRESH
> 
> Well, I can open a poll for the best name, but none of those you are
> proposing sound much better to me. Yeah, naming sucks...

Given that MAP_FIXED replaces the previous mapping MAP_FIXED_NOREPLACE
would probably be a best fit.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-30  6:58     ` Michal Hocko
@ 2017-12-06  4:50       ` Michael Ellerman
  -1 siblings, 0 replies; 130+ messages in thread
From: Michael Ellerman @ 2017-12-06  4:50 UTC (permalink / raw)
  To: Michal Hocko, Kees Cook
  Cc: Linux API, Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley

Michal Hocko <mhocko@kernel.org> writes:

> On Wed 29-11-17 14:25:36, Kees Cook wrote:
>> On Wed, Nov 29, 2017 at 6:42 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> > The first patch introduced MAP_FIXED_SAFE which enforces the given
>> > address but unlike MAP_FIXED it fails with ENOMEM if the given range
>> > conflicts with an existing one. The flag is introduced as a completely
>> 
>> I still think this name should be better. "SAFE" doesn't say what it's
>> safe from...

Yes exactly.

> It is safe in a sense it doesn't perform any address space dangerous
> operations. mmap is _inherently_ about the address space so the context
> should be kind of clear.

So now you have to define what "dangerous" means.

>> MAP_FIXED_UNIQUE
>> MAP_FIXED_ONCE
>> MAP_FIXED_FRESH
>
> Well, I can open a poll for the best name, but none of those you are
> proposing sound much better to me. Yeah, naming sucks...

I think Kees and I both previously suggested MAP_NO_CLOBBER for the
modifier.

So the obvious option for this would be MAP_FIXED_NO_CLOBBER.

Which is a bit longer sure, but says more or less exactly what it does.

cheers

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-01 15:26       ` Cyril Hrubis
@ 2017-12-06  4:51         ` Michael Ellerman
  -1 siblings, 0 replies; 130+ messages in thread
From: Michael Ellerman @ 2017-12-06  4:51 UTC (permalink / raw)
  To: Cyril Hrubis, Michal Hocko
  Cc: Kees Cook, Linux API, Khalid Aziz, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, Linux-MM, LKML,
	linux-arch, Florian Weimer, John Hubbard, Abdul Haleem,
	Joel Stanley

Cyril Hrubis <chrubis@suse.cz> writes:

> Hi!
>> > MAP_FIXED_UNIQUE
>> > MAP_FIXED_ONCE
>> > MAP_FIXED_FRESH
>> 
>> Well, I can open a poll for the best name, but none of those you are
>> proposing sound much better to me. Yeah, naming sucks...
>
> Given that MAP_FIXED replaces the previous mapping MAP_FIXED_NOREPLACE
> would probably be a best fit.

Yeah that could work.

I prefer "no clobber" as I just suggested, because the existing
MAP_FIXED doesn't politely "replace" a mapping, it destroys the current
one - which you or another thread may be using - and clobbers it with
the new one.
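
A minimal userspace sketch of that clobbering, assuming Linux and a private
anonymous mapping. map_fixed_clobbers() is a hypothetical helper name used
only for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Returns 1 if a MAP_FIXED mapping placed over an existing anonymous
 * mapping silently discarded the old contents, 0 if not, -1 on error.
 */
static int map_fixed_clobbers(size_t len)
{
	assert(len > 0);

	char *old = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (old == MAP_FAILED)
		return -1;
	memset(old, 0x5a, len);

	/* Same address, MAP_FIXED: the old range is unmapped without warning. */
	char *fresh = mmap(old, len, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (fresh == MAP_FAILED) {
		munmap(old, len);
		return -1;
	}

	/* Same address as before, but the data written above is gone. */
	int clobbered = (fresh == old) && (old[0] == 0);
	munmap(fresh, len);
	return clobbered;
}
```

No error is reported at any point, which is why another thread still using
the old mapping would only notice through corrupted data.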

cheers

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-06  4:51         ` Michael Ellerman
@ 2017-12-06  4:54           ` Matthew Wilcox
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Wilcox @ 2017-12-06  4:54 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Cyril Hrubis, Michal Hocko, Kees Cook, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley

On Wed, Dec 06, 2017 at 03:51:44PM +1100, Michael Ellerman wrote:
> Cyril Hrubis <chrubis@suse.cz> writes:
> 
> > Hi!
> >> > MAP_FIXED_UNIQUE
> >> > MAP_FIXED_ONCE
> >> > MAP_FIXED_FRESH
> >> 
> >> Well, I can open a poll for the best name, but none of those you are
> >> proposing sound much better to me. Yeah, naming sucks...
> >
> > Given that MAP_FIXED replaces the previous mapping MAP_FIXED_NOREPLACE
> > would probably be a best fit.
> 
> Yeah that could work.
> 
> I prefer "no clobber" as I just suggested, because the existing
> MAP_FIXED doesn't politely "replace" a mapping, it destroys the current
> one - which you or another thread may be using - and clobbers it with
> the new one.

It's longer than MAP_FIXED_WEAK :-P

You'd have to be pretty darn strong to clobber an existing mapping.

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-29 14:42   ` Michal Hocko
@ 2017-12-06  5:15     ` Michael Ellerman
  -1 siblings, 0 replies; 130+ messages in thread
From: Michael Ellerman @ 2017-12-06  5:15 UTC (permalink / raw)
  To: Michal Hocko, linux-api
  Cc: Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, linux-mm, LKML, linux-arch, Florian Weimer,
	John Hubbard, Michal Hocko

Hi Michal,

Some comments below.

Michal Hocko <mhocko@kernel.org> writes:

> From: Michal Hocko <mhocko@suse.com>
>
> MAP_FIXED is used quite often to enforce mapping at the particular
> range. The main problem of this flag is, however, that it is inherently
> dangerous because it unmaps existing mappings covered by the requested
> range. This can cause silent memory corruptions. Some of them even with
> serious security implications. While the current semantic might be
> really desirable in many cases there are others which would want to
> enforce the given range but rather see a failure than a silent memory
> corruption on a clashing range. Please note that there is no guarantee
> that a given range is obeyed by the mmap even when it is free - e.g.
> arch specific code is allowed to apply an alignment.

I don't think this last sentence is correct. Or maybe I don't understand
what you're referring to.

If you specify MAP_FIXED on a page boundary then the mapping must be
made at that address, I don't think arch code is allowed to add any
extra alignment.

> Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
> It has the same semantic as MAP_FIXED wrt. the given address request
> with a single exception that it fails with EEXIST if the requested
> address is already covered by an existing mapping. We still do rely on
> get_unmapped_area to handle all the arch specific MAP_FIXED treatment and
> check for a conflicting vma after it returns.
>
> [fail on clashing range with EEXIST as per Florian Weimer]
> [set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  arch/alpha/include/uapi/asm/mman.h   |  2 ++
>  arch/mips/include/uapi/asm/mman.h    |  2 ++
>  arch/parisc/include/uapi/asm/mman.h  |  2 ++
>  arch/powerpc/include/uapi/asm/mman.h |  1 +
>  arch/sparc/include/uapi/asm/mman.h   |  1 +
>  arch/tile/include/uapi/asm/mman.h    |  1 +
>  arch/xtensa/include/uapi/asm/mman.h  |  2 ++
>  include/uapi/asm-generic/mman.h      |  1 +
>  mm/mmap.c                            | 11 +++++++++++
>  9 files changed, 23 insertions(+)
>
> diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
> index 6bf730063e3f..ef3770262925 100644
> --- a/arch/alpha/include/uapi/asm/mman.h
> +++ b/arch/alpha/include/uapi/asm/mman.h
> @@ -32,6 +32,8 @@
>  #define MAP_STACK	0x80000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x100000	/* create a huge page mapping */
>  
> +#define MAP_FIXED_SAFE	0x200000	/* MAP_FIXED which doesn't unmap underlying mapping */
> +

Why the new line before MAP_FIXED_SAFE? It should sit with the others.

You're using a different value to other arches here, but that's OK, and
alpha doesn't use asm-generic/mman.h or mman-common.h

> diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> index e63bc37e33af..3ffd284e7160 100644
> --- a/arch/powerpc/include/uapi/asm/mman.h
> +++ b/arch/powerpc/include/uapi/asm/mman.h
> @@ -29,5 +29,6 @@
>  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
>  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x800000	/* MAP_FIXED which doesn't unmap underlying mapping */

Why did you pick 0x800000?

I don't see any reason you can't use 0x8000 on powerpc.

 
> diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h
> index 715a2c927e79..0c282c09fae8 100644
> --- a/arch/sparc/include/uapi/asm/mman.h
> +++ b/arch/sparc/include/uapi/asm/mman.h
> @@ -24,6 +24,7 @@
>  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
>  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */

Using 0x80000 on sparc, sparc uses mman-common.h.

> diff --git a/arch/tile/include/uapi/asm/mman.h b/arch/tile/include/uapi/asm/mman.h
> index 9b7add95926b..b212f5fd5345 100644
> --- a/arch/tile/include/uapi/asm/mman.h
> +++ b/arch/tile/include/uapi/asm/mman.h
> @@ -30,6 +30,7 @@
>  #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
>  #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
>  #define MAP_HUGETLB	0x4000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x8000		/* MAP_FIXED which doesn't unmap underlying mapping */
  
That is the next free flag, but you could also use 0x80000 on tile.

tile uses mman-common.h.

> diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
> index 2bfe590694fc..0daf199caa57 100644
> --- a/arch/xtensa/include/uapi/asm/mman.h
> +++ b/arch/xtensa/include/uapi/asm/mman.h
> @@ -56,6 +56,7 @@
>  #define MAP_NONBLOCK	0x20000		/* do not block on IO */
>  #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */

xtensa doesn't use asm-generic/mman.h or mman-common.h

>  #ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
>  # define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
>  					 * uninitialized */
> @@ -63,6 +64,7 @@
>  # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
>  #endif
>  
> +

Stray new line.

>  /*
>   * Flags for msync
>   */
> diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
> index 2dffcbf705b3..56cde132a80a 100644
> --- a/include/uapi/asm-generic/mman.h
> +++ b/include/uapi/asm-generic/mman.h
> @@ -13,6 +13,7 @@
>  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
>  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */

So I think I proved above that all the arches that are using 0x80000 are
also using mman-common.h, and vice-versa.

So you can put this in mman-common.h can't you?
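
As a rough illustration of the kind of check behind that argument, the
sketch below tests whether a candidate flag value is a single bit that does
not collide with its neighbours. The DEMO_ values are copied from the
asm-generic hunk quoted above, so the mask covers only those three flags
and not the full set an architecture defines. flag_is_free() is a
hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the asm-generic hunk quoted above. */
#define DEMO_MAP_NONBLOCK	0x10000u
#define DEMO_MAP_STACK		0x20000u
#define DEMO_MAP_HUGETLB	0x40000u
#define DEMO_MAP_FIXED_SAFE	0x80000u

/* True when 'flag' is exactly one bit and is not already part of 'used'. */
static bool flag_is_free(unsigned int flag, unsigned int used)
{
	bool single_bit = flag != 0 && (flag & (flag - 1)) == 0;

	return single_bit && (flag & used) == 0;
}
```

A real audit would need every MAP_* value from the architecture's own
header, not just the context lines of the diff.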

> diff --git a/mm/mmap.c b/mm/mmap.c
> index 476e810cf100..e84339842bb8 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1342,6 +1342,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>  		if (!(file && path_noexec(&file->f_path)))
>  			prot |= PROT_EXEC;
>  
> +	/* force arch specific MAP_FIXED handling in get_unmapped_area */
> +	if (flags & MAP_FIXED_SAFE)
> +		flags |= MAP_FIXED;
> +

The comment is misleading, because literally on the next line below we
check MAP_FIXED and change the behaviour, but not in the arch code.

>  	if (!(flags & MAP_FIXED))
>  		addr = round_hint_to_min(addr);

So it would be more accurate to say something like:

	/*
	 * Internal to the kernel MAP_FIXED_SAFE is a superset of
	 * MAP_FIXED, so set MAP_FIXED in flags if MAP_FIXED_SAFE was
	 * set by the caller. This avoids all the arch code having to
	 * check for MAP_FIXED and MAP_FIXED_SAFE.
	 */
	if (flags & MAP_FIXED_SAFE)
		flags |= MAP_FIXED;
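
To show how that normalization combines with the conflict check described
in the commit message, here is a toy userspace model. It is not kernel
code: struct toy_vma, toy_do_mmap() and the TOY_ constants are made up for
illustration, and hint-based placement is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAP_FIXED		0x10u
#define TOY_MAP_FIXED_SAFE	0x80000u

struct toy_vma {
	unsigned long start, end;	/* half-open range [start, end) */
};

/* Returns the mapped address, or a negative errno value. */
static long toy_do_mmap(const struct toy_vma *vmas, size_t n,
			unsigned long addr, unsigned long len,
			unsigned int flags)
{
	/* MAP_FIXED_SAFE implies MAP_FIXED for everything downstream. */
	if (flags & TOY_MAP_FIXED_SAFE)
		flags |= TOY_MAP_FIXED;

	if (!(flags & TOY_MAP_FIXED))
		return -EINVAL;		/* hint handling not modelled here */

	/* Only the SAFE variant refuses to overlap an existing range. */
	if (flags & TOY_MAP_FIXED_SAFE) {
		for (size_t i = 0; i < n; i++)
			if (vmas[i].start < addr + len && addr < vmas[i].end)
				return -EEXIST;
	}
	return (long)addr;
}
```

With one existing range at [0x1000, 0x3000), a SAFE request at 0x2000 gets
-EEXIST while a plain FIXED request at the same address succeeds.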


cheers

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-06  4:54           ` Matthew Wilcox
@ 2017-12-06  7:03             ` Matthew Wilcox
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Wilcox @ 2017-12-06  7:03 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Cyril Hrubis, Michal Hocko, Kees Cook, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley

On Tue, Dec 05, 2017 at 08:54:35PM -0800, Matthew Wilcox wrote:
> On Wed, Dec 06, 2017 at 03:51:44PM +1100, Michael Ellerman wrote:
> > Cyril Hrubis <chrubis@suse.cz> writes:
> > 
> > > Hi!
> > >> > MAP_FIXED_UNIQUE
> > >> > MAP_FIXED_ONCE
> > >> > MAP_FIXED_FRESH
> > >> 
> > >> Well, I can open a poll for the best name, but none of those you are
> > >> proposing sound much better to me. Yeah, naming sucks...
> > >
> > > Given that MAP_FIXED replaces the previous mapping MAP_FIXED_NOREPLACE
> > > would probably be a best fit.
> > 
> > Yeah that could work.
> > 
> > I prefer "no clobber" as I just suggested, because the existing
> > MAP_FIXED doesn't politely "replace" a mapping, it destroys the current
> > one - which you or another thread may be using - and clobbers it with
> > the new one.
> 
> It's longer than MAP_FIXED_WEAK :-P
> 
> You'd have to be pretty darn strong to clobber an existing mapping.

I think we're thinking about this all wrong.  We shouldn't document it as
"This is a variant of MAP_FIXED".  We should document it as "Here's an
alternative to MAP_FIXED".

So, just like we currently say "exactly one of MAP_SHARED or MAP_PRIVATE",
we could add a new paragraph saying "at most one of MAP_FIXED or
MAP_REQUIRED" and "any of the following values".

Now, we should implement MAP_REQUIRED as having each architecture
define _MAP_NOT_A_HINT, and then #define MAP_REQUIRED (MAP_FIXED |
_MAP_NOT_A_HINT), but that's not information to confuse users with.

Also, that lets us add a third option at some point that is Yet Another
Way to interpret the 'addr' argument, by having MAP_FIXED clear and
_MAP_NOT_A_HINT set.
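
To make the encoding concrete, here is a rough sketch of how the two bits
could be decoded. Everything in it is an assumption for illustration: the
SK_ names, the enum, and the choice of 0x80000 for _MAP_NOT_A_HINT (simply
the bit the patch uses) are not from any posted code.

```c
/* Illustrative values: asm-generic MAP_FIXED plus a hypothetical second bit. */
#define SK_MAP_FIXED		0x10UL
#define SK_MAP_NOT_A_HINT	0x80000UL
#define SK_MAP_REQUIRED		(SK_MAP_FIXED | SK_MAP_NOT_A_HINT)

enum sk_addr_mode {
	SK_ADDR_HINT,		/* neither bit: addr is only a hint */
	SK_ADDR_FIXED,		/* MAP_FIXED alone: place at addr, replacing what is there */
	SK_ADDR_REQUIRED,	/* both bits: place at addr or fail */
	SK_ADDR_RESERVED,	/* _MAP_NOT_A_HINT alone: room for a third option */
};

/* Hypothetical decoder for the two address-interpretation bits. */
static enum sk_addr_mode sk_decode_addr_mode(unsigned long flags)
{
	switch (flags & SK_MAP_REQUIRED) {
	case 0:
		return SK_ADDR_HINT;
	case SK_MAP_FIXED:
		return SK_ADDR_FIXED;
	case SK_MAP_REQUIRED:
		return SK_ADDR_REQUIRED;
	default:
		return SK_ADDR_RESERVED;
	}
}
```

The user-visible documentation would only ever mention MAP_FIXED and
MAP_REQUIRED; the fourth combination stays unused until someone needs it.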

I'm not set on MAP_REQUIRED.  I came up with some awful names
(MAP_TODDLER, MAP_TANTRUM, MAP_ULTIMATUM, MAP_BOSS, MAP_PROGRAM_MANAGER,
etc).  But I think we should drop FIXED from the middle of the name.

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-12-06  7:33         ` Rasmus Villemoes
  0 siblings, 0 replies; 130+ messages in thread
From: Rasmus Villemoes @ 2017-12-06  7:33 UTC (permalink / raw)
  To: Michael Ellerman, Michal Hocko, Kees Cook
  Cc: Linux API, Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley

On 2017-12-06 05:50, Michael Ellerman wrote:
> Michal Hocko <mhocko@kernel.org> writes:
> 
>> On Wed 29-11-17 14:25:36, Kees Cook wrote:
>> It is safe in a sense it doesn't perform any address space dangerous
>> operations. mmap is _inherently_ about the address space so the context
>> should be kind of clear.
> 
> So now you have to define what "dangerous" means.
> 
>>> MAP_FIXED_UNIQUE
>>> MAP_FIXED_ONCE
>>> MAP_FIXED_FRESH
>>
>> Well, I can open a poll for the best name, but none of those you are
>> proposing sound much better to me. Yeah, naming sucks...

I also don't like the _SAFE name - MAP_FIXED in itself isn't unsafe [1],
but I do agree that having a way to avoid clobbering (parts of) an
existing mapping is quite useful. Since we're bikeshedding names, how
about MAP_FIXED_EXCL, in analogy with the O_ flag.

[1] I like the analogy between MAP_FIXED and dup2 made in
<stackoverflow.com/questions/28575893>.
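
For anyone who has not seen the clobbering first hand, a small user-space
sketch (Linux assumed, helper name made up for illustration): it maps an
anonymous page, dirties it, then maps over it with MAP_FIXED. The second
call succeeds and the old contents are gone, much as dup2() silently closes
the target descriptor.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo helper.
 * Returns 1 if MAP_FIXED replaced the first mapping in place,
 * 0 if it did not, -1 if either mmap() call failed.
 */
static int sk_map_fixed_clobbers(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *first, *second;
	int clobbered;

	first = mmap(NULL, len, PROT_READ | PROT_WRITE,
		     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (first == MAP_FAILED)
		return -1;
	first[0] = 0x5a;	/* make the old mapping distinguishable */

	/* Same address, MAP_FIXED: the kernel unmaps the old page first. */
	second = mmap(first, len, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (second == MAP_FAILED) {
		munmap(first, len);
		return -1;
	}

	/* A fresh anonymous page reads back as zero. */
	clobbered = (second == first) && (second[0] == 0);
	munmap(second, len);
	return clobbered;
}
```

An exclusive variant would instead make the second call fail, which is the
O_EXCL half of the analogy.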

Rasmus

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-06  7:03             ` Matthew Wilcox
@ 2017-12-06  7:33               ` John Hubbard
  -1 siblings, 0 replies; 130+ messages in thread
From: John Hubbard @ 2017-12-06  7:33 UTC (permalink / raw)
  To: Matthew Wilcox, Michael Ellerman
  Cc: Cyril Hrubis, Michal Hocko, Kees Cook, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, Abdul Haleem,
	Joel Stanley

On 12/05/2017 11:03 PM, Matthew Wilcox wrote:
> On Tue, Dec 05, 2017 at 08:54:35PM -0800, Matthew Wilcox wrote:
>> On Wed, Dec 06, 2017 at 03:51:44PM +1100, Michael Ellerman wrote:
>>> Cyril Hrubis <chrubis@suse.cz> writes:
>>>
>>>> Hi!
>>>>>> MAP_FIXED_UNIQUE
>>>>>> MAP_FIXED_ONCE
>>>>>> MAP_FIXED_FRESH
>>>>>
>>>>> Well, I can open a poll for the best name, but none of those you are
>>>>> proposing sound much better to me. Yeah, naming sucks...
>>>>
>>>> Given that MAP_FIXED replaces the previous mapping MAP_FIXED_NOREPLACE
>>>> would probably be a best fit.
>>>
>>> Yeah that could work.
>>>
>>> I prefer "no clobber" as I just suggested, because the existing
>>> MAP_FIXED doesn't politely "replace" a mapping, it destroys the current
>>> one - which you or another thread may be using - and clobbers it with
>>> the new one.
>>
>> It's longer than MAP_FIXED_WEAK :-P
>>
>> You'd have to be pretty darn strong to clobber an existing mapping.
> 
> I think we're thinking about this all wrong.  We shouldn't document it as
> "This is a variant of MAP_FIXED".  We should document it as "Here's an
> alternative to MAP_FIXED".
> 
> So, just like we currently say "exactly one of MAP_SHARED or MAP_PRIVATE",
> we could add a new paragraph saying "at most one of MAP_FIXED or
> MAP_REQUIRED" and "any of the following values".
> 
> Now, we should implement MAP_REQUIRED as having each architecture
> define _MAP_NOT_A_HINT, and then #define MAP_REQUIRED (MAP_FIXED |
> _MAP_NOT_A_HINT), but that's not information to confuse users with.
> 
> Also, that lets us add a third option at some point that is Yet Another
> Way to interpret the 'addr' argument, by having MAP_FIXED clear and
> _MAP_NOT_A_HINT set.
> 
> I'm not set on MAP_REQUIRED.  I came up with some awful names
> (MAP_TODDLER, MAP_TANTRUM, MAP_ULTIMATUM, MAP_BOSS, MAP_PROGRAM_MANAGER,
> etc).  But I think we should drop FIXED from the middle of the name.
> 

In that case, maybe:

    MAP_EXACT

? ...because that's the characteristic behavior. It doesn't clobber, but
you don't need to say that in the name, now that we're not including
_FIXED_ in the middle.

thanks,
John Hubbard
NVIDIA    

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-12-06  7:35                 ` Florian Weimer
  0 siblings, 0 replies; 130+ messages in thread
From: Florian Weimer @ 2017-12-06  7:35 UTC (permalink / raw)
  To: John Hubbard, Matthew Wilcox, Michael Ellerman
  Cc: Cyril Hrubis, Michal Hocko, Kees Cook, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Abdul Haleem, Joel Stanley

On 12/06/2017 08:33 AM, John Hubbard wrote:
> In that case, maybe:
> 
>      MAP_EXACT
> 
> ? ...because that's the characteristic behavior.

Is that true?  mmap still silently rounding up the length to the page 
size, I assume, so even that name is misleading.
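
A quick way to see the length rounding from user space, as a sketch
(Linux assumed, helper name hypothetical): request a 1-byte anonymous
mapping and touch the last byte of the page. If only one byte were really
mapped, the access would fault.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo helper.
 * Returns 1 if the last byte of the page is usable after asking
 * for a 1-byte mapping, -1 if setup failed.
 */
static int sk_length_is_rounded_up(void)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	int ok;

	if (page <= 1)
		return -1;

	p = mmap(NULL, 1, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* volatile keeps the compiler from eliding the access */
	p[page - 1] = 0x7f;
	ok = (p[page - 1] == 0x7f);

	munmap((void *)p, (size_t)page);
	return ok;
}
```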

Thanks,
Florian

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-06  7:35                 ` Florian Weimer
  (?)
  (?)
@ 2017-12-06  8:06                   ` John Hubbard
  -1 siblings, 0 replies; 130+ messages in thread
From: John Hubbard @ 2017-12-06  8:06 UTC (permalink / raw)
  To: Florian Weimer, Matthew Wilcox, Michael Ellerman
  Cc: Cyril Hrubis, Michal Hocko, Kees Cook, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Abdul Haleem, Joel Stanley

On 12/05/2017 11:35 PM, Florian Weimer wrote:
> On 12/06/2017 08:33 AM, John Hubbard wrote:
>> In that case, maybe:
>>
>>      MAP_EXACT
>>
>> ? ...because that's the characteristic behavior.
> 
> Is that true?  mmap still silently rounding up the length to the page size, I assume, so even that name is misleading.

Hi Florian,

Not as far as I can tell, it's not doing that.

For both MAP_FIXED, and this new flag, the documented (and actual)
behavior is *not* to do any such rounding. Instead, the requested
input address is required to be page-aligned itself, and mmap()
should be honoring the exact addr.

From the mmap(2) man page:

   MAP_FIXED
          Don't  interpret  addr  as  a  hint: place the mapping at
          exactly that address.  addr must be  a  multiple  of  the
          page  size. 
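
The alignment requirement is easy to check from user space. A minimal
sketch, assuming Linux (the helper name is made up): reserve two pages,
then ask for MAP_FIXED one byte into that reservation. The call is
expected to fail with EINVAL rather than round the address.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo helper.
 * Returns the errno from an unaligned MAP_FIXED request,
 * 0 if the request unexpectedly succeeded, -1 if setup failed.
 */
static int sk_unaligned_fixed_errno(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *base;
	void *res;
	int err = 0;

	/* Reserve our own range so nothing foreign is ever at risk. */
	base = mmap(NULL, 2 * len, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	/* base + 1 is not a multiple of the page size. */
	res = mmap(base + 1, len, PROT_NONE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (res == MAP_FAILED)
		err = errno;	/* save before munmap() can change it */

	munmap(base, 2 * len);
	return err;
}
```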


And from what I can see, the do_mmap() implementation leaves addr
unchanged, in the MAP_FIXED case:

do_mmap(...)
{
        /* ... */
	if (!(flags & MAP_FIXED))
		addr = round_hint_to_min(addr);

...although it does look like device drivers have the opportunity
to break that:

mmap_region(...)
{
		/* Can addr have changed??
		 *
		 * Answer: Yes, several device drivers can do it in their
		 *         f_op->mmap method. -DaveM
		 * Bug: If addr is changed, prev, rb_link, rb_parent should
		 *      be updated for vma_link()
		 */
		WARN_ON_ONCE(addr != vma->vm_start);

		addr = vma->vm_start;
   

--
thanks,
John Hubbard
NVIDIA

> 
> Thanks,
> Florian

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-12-06  8:54                     ` Florian Weimer
  0 siblings, 0 replies; 130+ messages in thread
From: Florian Weimer @ 2017-12-06  8:54 UTC (permalink / raw)
  To: John Hubbard, Matthew Wilcox, Michael Ellerman
  Cc: Cyril Hrubis, Michal Hocko, Kees Cook, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Abdul Haleem, Joel Stanley

On 12/06/2017 09:06 AM, John Hubbard wrote:
> On 12/05/2017 11:35 PM, Florian Weimer wrote:
>> On 12/06/2017 08:33 AM, John Hubbard wrote:
>>> In that case, maybe:
>>>
>>>       MAP_EXACT
>>>
>>> ? ...because that's the characteristic behavior.
>>
>> Is that true?  mmap still silently rounding up the length to the page size, I assume, so even that name is misleading.
> 
> Hi Florian,
> 
> Not as far as I can tell, it's not doing that.
> 
> For both MAP_FIXED, and this new flag, the documented (and actual)
> behavior is *not* to do any such rounding. Instead, the requested
> input address is required to be page-aligned itself, and mmap()
> should be honoring the exact addr.

I meant the length, not the address.

Florian

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-12-06  9:08           ` Michal Hocko
  0 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-12-06  9:08 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Michael Ellerman, Kees Cook, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley

On Wed 06-12-17 08:33:37, Rasmus Villemoes wrote:
> On 2017-12-06 05:50, Michael Ellerman wrote:
> > Michal Hocko <mhocko@kernel.org> writes:
> > 
> >> On Wed 29-11-17 14:25:36, Kees Cook wrote:
> >> It is safe in a sense it doesn't perform any address space dangerous
> >> operations. mmap is _inherently_ about the address space so the context
> >> should be kind of clear.
> > 
> > So now you have to define what "dangerous" means.
> > 
> >>> MAP_FIXED_UNIQUE
> >>> MAP_FIXED_ONCE
> >>> MAP_FIXED_FRESH
> >>
> >> Well, I can open a poll for the best name, but none of those you are
> >> proposing sound much better to me. Yeah, naming sucks...
> 
> I also don't like the _SAFE name - MAP_FIXED in itself isn't unsafe [1],
> but I do agree that having a way to avoid clobbering (parts of) an
> existing mapping is quite useful. Since we're bikeshedding names, how
> about MAP_FIXED_EXCL, in analogy with the O_ flag.

I really give up on the name discussion. I will take whatever the
majority comes up with. I just do not want this (useful) functionality
to get bikeshedded to death.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-12-06  5:15     ` Michael Ellerman
@ 2017-12-06  9:27       ` Michal Hocko
  -1 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-12-06  9:27 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linux-api, Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, linux-mm, LKML, linux-arch, Florian Weimer,
	John Hubbard

On Wed 06-12-17 16:15:24, Michael Ellerman wrote:
> Hi Michal,
> 
> Some comments below.
> 
> Michal Hocko <mhocko@kernel.org> writes:
> 
> > From: Michal Hocko <mhocko@suse.com>
> >
> > MAP_FIXED is used quite often to enforce mapping at the particular
> > range. The main problem of this flag is, however, that it is inherently
> > dangerous because it unmaps existing mappings covered by the requested
> > range. This can cause silent memory corruptions. Some of them even with
> > serious security implications. While the current semantic might be
> > really desirable in many cases there are others which would want to
> > enforce the given range but rather see a failure than a silent memory
> > corruption on a clashing range. Please note that there is no guarantee
> > that a given range is obeyed by the mmap even when it is free - e.g.
> > arch specific code is allowed to apply an alignment.
> 
> I don't think this last sentence is correct. Or maybe I don't understand
> what you're referring to.
> 
> If you specify MAP_FIXED on a page boundary then the mapping must be
> made at that address, I don't think arch code is allowed to add any
> extra alignment.

The last sentence doesn't talk about MAP_FIXED. It talks about a hint
based mmap without MAP_FIXED ("there are others which would want to
enforce the given range but rather see a failure").
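
[Editorial sketch, not part of the original mail.] The hint-based
behaviour described above can be observed from userspace. The following
is a minimal sketch, assuming a Linux system where anonymous private
mappings are available; the helper name hint_lands_elsewhere() is made
up for illustration and does not come from the patch. It maps one page
and then requests a second mapping at the same address without
MAP_FIXED, so the kernel is free to treat the address only as a hint.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): map one page, then ask for a
 * second mapping at the same address as a plain hint, without MAP_FIXED.
 * Returns 1 if the second mapping landed elsewhere, 0 if it reused the
 * address, and -1 if sysconf() or either mmap() call failed.
 */
static int hint_lands_elsewhere(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	void *first, *second;
	int moved;

	if (page <= 0)
		return -1;

	first = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, flags, -1, 0);
	if (first == MAP_FAILED)
		return -1;

	/* The address is only a hint here, so the occupied page is left alone. */
	second = mmap(first, (size_t)page, PROT_READ | PROT_WRITE, flags, -1, 0);
	if (second == MAP_FAILED) {
		munmap(first, (size_t)page);
		return -1;
	}

	moved = (second != first);
	munmap(second, (size_t)page);
	munmap(first, (size_t)page);
	return moved;
}
```

The caller gets a valid mapping either way, but has to compare the
returned address with the requested one to learn whether the hint was
honoured. That extra check is what a non-destructive fixed mapping is
meant to make unnecessary.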
 
[...]
> > diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
> > index 6bf730063e3f..ef3770262925 100644
> > --- a/arch/alpha/include/uapi/asm/mman.h
> > +++ b/arch/alpha/include/uapi/asm/mman.h
> > @@ -32,6 +32,8 @@
> >  #define MAP_STACK	0x80000		/* give out an address that is best suited for process/thread stacks */
> >  #define MAP_HUGETLB	0x100000	/* create a huge page mapping */
> >  
> > +#define MAP_FIXED_SAFE	0x200000	/* MAP_FIXED which doesn't unmap underlying mapping */
> > +
> 
> Why the new line before MAP_FIXED_SAFE? It should sit with the others.

will remove the empty line

> You're using a different value to other arches here, but that's OK, and
> alpha doesn't use asm-generic/mman.h or mman-common.h
> 
> > diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> > index e63bc37e33af..3ffd284e7160 100644
> > --- a/arch/powerpc/include/uapi/asm/mman.h
> > +++ b/arch/powerpc/include/uapi/asm/mman.h
> > @@ -29,5 +29,6 @@
> >  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
> >  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
> >  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> > +#define MAP_FIXED_SAFE	0x800000	/* MAP_FIXED which doesn't unmap underlying mapping */
> 
> Why did you pick 0x800000?
> 
> I don't see any reason you can't use 0x8000 on powerpc.

Copy&paste I guess, will update it.

[...]

> >  #ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
> >  # define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
> >  					 * uninitialized */
> > @@ -63,6 +64,7 @@
> >  # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
> >  #endif
> >  
> > +
> 
> Stray new line.

will remove

> >  /*
> >   * Flags for msync
> >   */
> > diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
> > index 2dffcbf705b3..56cde132a80a 100644
> > --- a/include/uapi/asm-generic/mman.h
> > +++ b/include/uapi/asm-generic/mman.h
> > @@ -13,6 +13,7 @@
> >  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
> >  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
> >  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> > +#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
> 
> So I think I proved above that all the arches that are using 0x80000 are
> also using mman-common.h, and vice-versa.
> 
> So you can put this in mman-common.h can't you?

Yes it seems I can. I would have to double check. It is true that
defining the new flag closer to MAP_FIXED makes some sense

[...]
> So it would be more accurate to say something like:
> 
> 	/*
> 	 * Internal to the kernel MAP_FIXED_SAFE is a superset of
> 	 * MAP_FIXED, so set MAP_FIXED in flags if MAP_FIXED_SAFE was
> 	 * set by the caller. This avoids all the arch code having to
> 	 * check for MAP_FIXED and MAP_FIXED_SAFE.
> 	 */
> 	if (flags & MAP_FIXED_SAFE)
> 		flags |= MAP_FIXED;

OK, I will use this wording.
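
[Editorial sketch, not part of the original mail.] The flag folding
quoted above can be modelled outside the kernel. This is a minimal
sketch under stated assumptions: the SKETCH_* constants and both helper
functions are illustrative stand-ins, not kernel definitions. 0x10 is
the asm-generic MAP_FIXED value and 0x80000 is the MAP_FIXED_SAFE value
proposed in the asm-generic hunk of this thread.

```c
/*
 * Illustrative stand-ins, not the kernel's own definitions: 0x10 is the
 * asm-generic MAP_FIXED value and 0x80000 is the MAP_FIXED_SAFE value
 * proposed for asm-generic in this thread.
 */
#define SKETCH_MAP_FIXED	0x10UL
#define SKETCH_MAP_FIXED_SAFE	0x80000UL

/* Fold MAP_FIXED_SAFE into MAP_FIXED so later code tests a single bit. */
static unsigned long sketch_normalize_flags(unsigned long flags)
{
	if (flags & SKETCH_MAP_FIXED_SAFE)
		flags |= SKETCH_MAP_FIXED;
	return flags;
}

/* Non-zero when an overlap must be reported as an error, not unmapped. */
static int sketch_must_not_clobber(unsigned long flags)
{
	return (flags & SKETCH_MAP_FIXED_SAFE) != 0;
}
```

After normalization, address selection only needs to look at the
MAP_FIXED bit, while the overlap check still sees the original
MAP_FIXED_SAFE bit and can refuse to unmap anything.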

Thanks for your review! Finally something that doesn't try to beat the
name to death ;)
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-12-06  9:27       ` Michal Hocko
@ 2017-12-06 10:02         ` Michal Hocko
  -1 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-12-06 10:02 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linux-api, Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, linux-mm, LKML, linux-arch, Florian Weimer,
	John Hubbard

On Wed 06-12-17 10:27:24, Michal Hocko wrote:
> On Wed 06-12-17 16:15:24, Michael Ellerman wrote:
[...]
> > So I think I proved above that all the arches that are using 0x80000 are
> > also using mman-common.h, and vice-versa.
> > 
> > So you can put this in mman-common.h can't you?
> 
> Yes it seems I can. I would have to double check. It is true that
> defining the new flag closer to MAP_FIXED makes some sense

OK, so some recap
those which include generic mman-common.h directly
arch/sparc/include/uapi/asm/mman.h:#include <asm-generic/mman-common.h>
arch/powerpc/include/uapi/asm/mman.h:#include <asm-generic/mman-common.h>
arch/tile/include/uapi/asm/mman.h:#include <asm-generic/mman-common.h>

then we have
arch/metag/include/asm/mman.h
which includes uapi/asm/mman.h and that is a generic one
arch/metag/include/uapi/asm/Kbuild:generic-y += mman.h

others include generic mman.h
arch/arm/include/uapi/asm/mman.h:#include <asm-generic/mman.h>
arch/frv/include/uapi/asm/mman.h:#include <asm-generic/mman.h>
arch/ia64/include/uapi/asm/mman.h:#include <asm-generic/mman.h>
arch/m32r/include/uapi/asm/mman.h:#include <asm-generic/mman.h>
arch/mn10300/include/uapi/asm/mman.h:#include <asm-generic/mman.h>
arch/score/include/uapi/asm/mman.h:#include <asm-generic/mman.h>
arch/x86/include/uapi/asm/mman.h:#include <asm-generic/mman.h>

which includes mman-common.h as well

and then we are left with
arch/alpha/include/uapi/asm/mman.h
arch/mips/include/uapi/asm/mman.h
arch/parisc/include/uapi/asm/mman.h
arch/xtensa/include/uapi/asm/mman.h

which do not include anything. So the patch can be indeed simplified.
I will fold the following into the patch. I hope nothing got left behind,
but my defconfig compile test succeeded on all arches for which I have a
cross compiler.
---
commit 52a9272f419f428cb079d340f64113325516ef9b
Author: Michal Hocko <mhocko@suse.com>
Date:   Wed Dec 6 10:46:16 2017 +0100

    - define MAP_FIXED_SAFE in asm-generic/mman-common.h as per Michael
      Ellerman because all architectures which use this header can share
      the same value. This will leave us with only 4 arches which need
      special handling.

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index ef3770262925..7287dbf1e11b 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -31,7 +31,6 @@
 #define MAP_NONBLOCK	0x40000		/* do not block on IO */
 #define MAP_STACK	0x80000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x100000	/* create a huge page mapping */
-
 #define MAP_FIXED_SAFE	0x200000	/* MAP_FIXED which doesn't unmap underlying mapping */
 
 #define MS_ASYNC	1		/* sync memory asynchronously */
diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
index 3ffd284e7160..e63bc37e33af 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -29,6 +29,5 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
-#define MAP_FIXED_SAFE	0x800000	/* MAP_FIXED which doesn't unmap underlying mapping */
 
 #endif /* _UAPI_ASM_POWERPC_MMAN_H */
diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h
index 0c282c09fae8..d21bffd5d3dc 100644
--- a/arch/sparc/include/uapi/asm/mman.h
+++ b/arch/sparc/include/uapi/asm/mman.h
@@ -24,7 +24,5 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
-#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
-
 
 #endif /* _UAPI__SPARC_MMAN_H__ */
diff --git a/arch/tile/include/uapi/asm/mman.h b/arch/tile/include/uapi/asm/mman.h
index b212f5fd5345..9b7add95926b 100644
--- a/arch/tile/include/uapi/asm/mman.h
+++ b/arch/tile/include/uapi/asm/mman.h
@@ -30,7 +30,6 @@
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
 #define MAP_HUGETLB	0x4000		/* create a huge page mapping */
-#define MAP_FIXED_SAFE	0x8000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 
 /*
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 6d319c46fd90..1eca2cb10d44 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -25,6 +25,7 @@
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 /*
  * Flags for mlock
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 56cde132a80a..2dffcbf705b3 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -13,7 +13,6 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
-#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
 
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-06  9:08           ` Michal Hocko
@ 2017-12-07  0:19             ` Kees Cook
  -1 siblings, 0 replies; 130+ messages in thread
From: Kees Cook @ 2017-12-07  0:19 UTC (permalink / raw)
  To: Michal Hocko, Michael Kerrisk
  Cc: Rasmus Villemoes, Michael Ellerman, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley, Matthew Wilcox

On Wed, Dec 6, 2017 at 1:08 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 06-12-17 08:33:37, Rasmus Villemoes wrote:
>> On 2017-12-06 05:50, Michael Ellerman wrote:
>> > Michal Hocko <mhocko@kernel.org> writes:
>> >
>> >> On Wed 29-11-17 14:25:36, Kees Cook wrote:
>> >> It is safe in a sense it doesn't perform any address space dangerous
>> >> operations. mmap is _inherently_ about the address space so the context
>> >> should be kind of clear.
>> >
>> > So now you have to define what "dangerous" means.
>> >
>> >>> MAP_FIXED_UNIQUE
>> >>> MAP_FIXED_ONCE
>> >>> MAP_FIXED_FRESH
>> >>
>> >> Well, I can open a poll for the best name, but none of those you are
>> >> proposing sound much better to me. Yeah, naming sucks...
>>
>> I also don't like the _SAFE name - MAP_FIXED in itself isn't unsafe [1],
>> but I do agree that having a way to avoid clobbering (parts of) an
>> existing mapping is quite useful. Since we're bikeshedding names, how
>> about MAP_FIXED_EXCL, in analogy with the O_ flag.
>
> I really give up on the name discussion. I will take whatever the
> majority comes up with. I just do not want this (useful) functionality
> to get bikeshedded to death.

Yup, I really want this to land too. What do people think of Matthew
Wilcox's MAP_REQUIRED ? MAP_EXACT isn't exact, and dropping "FIXED"
out of the middle seems sensible to me.

Michael, any suggestions with your API hat on?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-07  0:19             ` Kees Cook
@ 2017-12-07  1:08               ` John Hubbard
  -1 siblings, 0 replies; 130+ messages in thread
From: John Hubbard @ 2017-12-07  1:08 UTC (permalink / raw)
  To: Kees Cook, Michal Hocko, Michael Kerrisk
  Cc: Rasmus Villemoes, Michael Ellerman, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, Abdul Haleem,
	Joel Stanley, Matthew Wilcox

On 12/06/2017 04:19 PM, Kees Cook wrote:
> On Wed, Dec 6, 2017 at 1:08 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> On Wed 06-12-17 08:33:37, Rasmus Villemoes wrote:
>>> On 2017-12-06 05:50, Michael Ellerman wrote:
>>>> Michal Hocko <mhocko@kernel.org> writes:
>>>>
>>>>> On Wed 29-11-17 14:25:36, Kees Cook wrote:
>>>>> It is safe in a sense it doesn't perform any address space dangerous
>>>>> operations. mmap is _inherently_ about the address space so the context
>>>>> should be kind of clear.
>>>>
>>>> So now you have to define what "dangerous" means.
>>>>
>>>>>> MAP_FIXED_UNIQUE
>>>>>> MAP_FIXED_ONCE
>>>>>> MAP_FIXED_FRESH
>>>>>
>>>>> Well, I can open a poll for the best name, but none of those you are
>>>>> proposing sound much better to me. Yeah, naming sucks...
>>>
>>> I also don't like the _SAFE name - MAP_FIXED in itself isn't unsafe [1],
>>> but I do agree that having a way to avoid clobbering (parts of) an
>>> existing mapping is quite useful. Since we're bikeshedding names, how
>>> about MAP_FIXED_EXCL, in analogy with the O_ flag.
>>
>> I really give up on the name discussion. I will take whatever the
>> majority comes up with. I just do not want this (useful) functionality
>> to get bikeshedded to death.
> 
> Yup, I really want this to land too. What do people think of Matthew
> Wilcox's MAP_REQUIRED ? MAP_EXACT isn't exact, and dropping "FIXED"
> out of the middle seems sensible to me.

+1, MAP_REQUIRED does sound like the best one so far, yes. Sorry if I contributed
to any excessive bikeshedding. :)

thanks,
john h

> 
> Michael, any suggestions with your API hat on?
> 
> -Kees
> 

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-12-07  5:46               ` Michael Ellerman
  0 siblings, 0 replies; 130+ messages in thread
From: Michael Ellerman @ 2017-12-07  5:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Cyril Hrubis, Michal Hocko, Kees Cook, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley

Matthew Wilcox <willy@infradead.org> writes:

> On Tue, Dec 05, 2017 at 08:54:35PM -0800, Matthew Wilcox wrote:
>> On Wed, Dec 06, 2017 at 03:51:44PM +1100, Michael Ellerman wrote:
>> > Cyril Hrubis <chrubis@suse.cz> writes:
>> > 
>> > > Hi!
>> > >> > MAP_FIXED_UNIQUE
>> > >> > MAP_FIXED_ONCE
>> > >> > MAP_FIXED_FRESH
>> > >> 
>> > >> Well, I can open a poll for the best name, but none of those you are
>> > >> proposing sound much better to me. Yeah, naming sucks...
>> > >
>> > > Given that MAP_FIXED replaces the previous mapping, MAP_FIXED_NOREPLACE
>> > > would probably be the best fit.
>> > 
>> > Yeah that could work.
>> > 
>> > I prefer "no clobber" as I just suggested, because the existing
>> > MAP_FIXED doesn't politely "replace" a mapping, it destroys the current
>> > one - which you or another thread may be using - and clobbers it with
>> > the new one.
>> 
>> It's longer than MAP_FIXED_WEAK :-P
>> 
>> You'd have to be pretty darn strong to clobber an existing mapping.
>
> I think we're thinking about this all wrong.  We shouldn't document it as
> "This is a variant of MAP_FIXED".  We should document it as "Here's an
> alternative to MAP_FIXED".
>
> So, just like we currently say "exactly one of MAP_SHARED or MAP_PRIVATE",
> we could add a new paragraph saying "at most one of MAP_FIXED or
> MAP_REQUIRED" and "any of the following values".
>
> Now, we should implement MAP_REQUIRED as having each architecture
> define _MAP_NOT_A_HINT, and then #define MAP_REQUIRED (MAP_FIXED |
> _MAP_NOT_A_HINT), but that's not information to confuse users with.
>
> Also, that lets us add a third option at some point that is Yet Another
> Way to interpret the 'addr' argument, by having MAP_FIXED clear and
> _MAP_NOT_A_HINT set.
>
> I'm not set on MAP_REQUIRED.  I came up with some awful names
> (MAP_TODDLER, MAP_TANTRUM, MAP_ULTIMATUM, MAP_BOSS, MAP_PROGRAM_MANAGER,
> etc).  But I think we should drop FIXED from the middle of the name.

MAP_REQUIRED doesn't immediately grab me, but I don't actively dislike
it either :)

What about MAP_AT_ADDR ?

It's short, and says what it does on the tin. The first argument to mmap
is actually called "addr" too.

cheers

^ permalink raw reply	[flat|nested] 130+ messages in thread
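
For readers following the flag-encoding idea quoted in the message above, here is a minimal sketch of how the two bits could be decoded. It is not code from the thread: SKETCH_MAP_NOT_A_HINT stands in for the proposed _MAP_NOT_A_HINT, its bit value is made up for illustration, and enum addr_mode and addr_mode_from_flags() are hypothetical names.

```c
#include <assert.h>
#include <sys/mman.h>

/* Hypothetical bit, chosen only for this sketch; no kernel header defines it. */
#define SKETCH_MAP_NOT_A_HINT 0x40000000
/* The proposed user-visible flag: MAP_FIXED plus the "not a hint" bit. */
#define SKETCH_MAP_REQUIRED   (MAP_FIXED | SKETCH_MAP_NOT_A_HINT)

/* The sketch only makes sense if the two bits are distinct. */
static_assert((SKETCH_MAP_NOT_A_HINT & MAP_FIXED) == 0,
              "sketch bit must not overlap MAP_FIXED");

/* The four ways the 'addr' argument could be interpreted. */
enum addr_mode {
    ADDR_HINT,          /* neither bit: addr is only a hint */
    ADDR_FIXED_REPLACE, /* MAP_FIXED alone: map at addr, replacing what is there */
    ADDR_REQUIRED,      /* both bits: map at addr or fail */
    ADDR_RESERVED       /* "not a hint" alone: left open for a later meaning */
};

enum addr_mode addr_mode_from_flags(int flags)
{
    int fixed = (flags & MAP_FIXED) != 0;
    int not_a_hint = (flags & SKETCH_MAP_NOT_A_HINT) != 0;

    if (fixed && not_a_hint)
        return ADDR_REQUIRED;
    if (fixed)
        return ADDR_FIXED_REPLACE;
    if (not_a_hint)
        return ADDR_RESERVED;
    return ADDR_HINT;
}
```

The last case (MAP_FIXED clear, the new bit set) corresponds to the "third option" that the proposal wants to keep available for a future interpretation of addr.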

* Re: [PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-29 14:42   ` Michal Hocko
@ 2017-12-07 12:07     ` Pavel Machek
  -1 siblings, 0 replies; 130+ messages in thread
From: Pavel Machek @ 2017-12-07 12:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Michal Hocko

Hi!

> MAP_FIXED is used quite often to enforce mapping at a particular
> range. The main problem with this flag is, however, that it is inherently
> dangerous because it unmaps existing mappings covered by the requested
> range. This can cause silent memory corruption, sometimes with
> serious security implications. While the current semantic might be
> really desirable in many cases, there are others which would want to
> enforce the given range but would rather see a failure than silent memory
> corruption on a clashing range. Please note that there is no guarantee
> that a given range is obeyed by mmap even when it is free - e.g.
> arch specific code is allowed to apply an alignment.
> 
> Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
> It has the same semantic as MAP_FIXED wrt. the given address request

Could we get some better name? Functionality seems reasonable, but
_SAFE suffix does not really explain what is going on to the user.

MAP_ADD_FIXED ?

								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 130+ messages in thread
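
The clobbering behaviour described in the quoted patch text can be observed from userspace. The following is a small sketch, assuming a Linux or POSIX system with anonymous mappings; map_fixed_clobbers() is a name invented for this illustration and is not part of the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if a MAP_FIXED mapping silently replaced an existing page,
 * 0 if the old contents survived, -1 if a setup step failed.
 */
int map_fixed_clobbers(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    /* Let the kernel pick a free address for the "victim" page. */
    unsigned char *victim = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (victim == MAP_FAILED)
        return -1;
    victim[0] = 0x42;

    /* Ask for the same range again with MAP_FIXED; no error is reported. */
    unsigned char *again = mmap(victim, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (again == MAP_FAILED) {
        munmap(victim, len);
        return -1;
    }

    /* A fresh anonymous page reads as zero, so the 0x42 is gone if it was replaced. */
    int clobbered = (again == victim && again[0] == 0);
    munmap(again, len);
    return clobbered;
}
```

The second mmap() succeeds and the byte written to the first mapping is lost without any indication to the caller, which is the failure mode the new flag is meant to turn into an explicit error.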

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-07  5:46               ` Michael Ellerman
@ 2017-12-07 19:14                 ` Kees Cook
  -1 siblings, 0 replies; 130+ messages in thread
From: Kees Cook @ 2017-12-07 19:14 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Matthew Wilcox, Cyril Hrubis, Michal Hocko, Linux API,
	Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley, Pavel Machek

On Wed, Dec 6, 2017 at 9:46 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> Matthew Wilcox <willy@infradead.org> writes:
>
>> On Tue, Dec 05, 2017 at 08:54:35PM -0800, Matthew Wilcox wrote:
>>> On Wed, Dec 06, 2017 at 03:51:44PM +1100, Michael Ellerman wrote:
>>> > Cyril Hrubis <chrubis@suse.cz> writes:
>>> >
>>> > > Hi!
>>> > >> > MAP_FIXED_UNIQUE
>>> > >> > MAP_FIXED_ONCE
>>> > >> > MAP_FIXED_FRESH
>>> > >>
>>> > >> Well, I can open a poll for the best name, but none of those you are
>>> > >> proposing sound much better to me. Yeah, naming sucks...
>>> > >
>>> > > Given that MAP_FIXED replaces the previous mapping, MAP_FIXED_NOREPLACE
>>> > > would probably be the best fit.
>>> >
>>> > Yeah that could work.
>>> >
>>> > I prefer "no clobber" as I just suggested, because the existing
>>> > MAP_FIXED doesn't politely "replace" a mapping, it destroys the current
>>> > one - which you or another thread may be using - and clobbers it with
>>> > the new one.
>>>
>>> It's longer than MAP_FIXED_WEAK :-P
>>>
>>> You'd have to be pretty darn strong to clobber an existing mapping.
>>
>> I think we're thinking about this all wrong.  We shouldn't document it as
>> "This is a variant of MAP_FIXED".  We should document it as "Here's an
>> alternative to MAP_FIXED".
>>
>> So, just like we currently say "exactly one of MAP_SHARED or MAP_PRIVATE",
>> we could add a new paragraph saying "at most one of MAP_FIXED or
>> MAP_REQUIRED" and "any of the following values".
>>
>> Now, we should implement MAP_REQUIRED as having each architecture
>> define _MAP_NOT_A_HINT, and then #define MAP_REQUIRED (MAP_FIXED |
>> _MAP_NOT_A_HINT), but that's not information to confuse users with.
>>
>> Also, that lets us add a third option at some point that is Yet Another
>> Way to interpret the 'addr' argument, by having MAP_FIXED clear and
>> _MAP_NOT_A_HINT set.
>>
>> I'm not set on MAP_REQUIRED.  I came up with some awful names
>> (MAP_TODDLER, MAP_TANTRUM, MAP_ULTIMATUM, MAP_BOSS, MAP_PROGRAM_MANAGER,
>> etc).  But I think we should drop FIXED from the middle of the name.
>
> MAP_REQUIRED doesn't immediately grab me, but I don't actively dislike
> it either :)
>
> What about MAP_AT_ADDR ?
>
> It's short, and says what it does on the tin. The first argument to mmap
> is actually called "addr" too.

"FIXED" is supposed to do this too.

Pavel suggested:

MAP_ADD_FIXED

(which is different from "use fixed", and describes why it would fail:
can't add since it already exists.)

Perhaps "MAP_FIXED_NEW"?

There has been a request to drop "FIXED" from the name, so these:

MAP_FIXED_NOCLOBBER
MAP_FIXED_NOREPLACE
MAP_FIXED_ADD
MAP_FIXED_NEW

Could be:

MAP_NOCLOBBER
MAP_NOREPLACE
MAP_ADD
MAP_NEW

and we still have the unloved, but acceptable:

MAP_REQUIRED

My vote is still for "NOREPLACE" or "NOCLOBBER" since it's very
specific, though "NEW" is pretty clear too.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-12-07 19:57                   ` Matthew Wilcox
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Wilcox @ 2017-12-07 19:57 UTC (permalink / raw)
  To: Kees Cook
  Cc: Michael Ellerman, Cyril Hrubis, Michal Hocko, Linux API,
	Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley, Pavel Machek

On Thu, Dec 07, 2017 at 11:14:27AM -0800, Kees Cook wrote:
> On Wed, Dec 6, 2017 at 9:46 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> > Matthew Wilcox <willy@infradead.org> writes:
> >> So, just like we currently say "exactly one of MAP_SHARED or MAP_PRIVATE",
> >> we could add a new paragraph saying "at most one of MAP_FIXED or
> >> MAP_REQUIRED" and "any of the following values".
> >
> > MAP_REQUIRED doesn't immediately grab me, but I don't actively dislike
> > it either :)
> >
> > What about MAP_AT_ADDR ?
> >
> > It's short, and says what it does on the tin. The first argument to mmap
> > is actually called "addr" too.
> 
> "FIXED" is supposed to do this too.
> 
> Pavel suggested:
> 
> MAP_ADD_FIXED
> 
> (which is different from "use fixed", and describes why it would fail:
> can't add since it already exists.)
> 
> Perhaps "MAP_FIXED_NEW"?
> 
> There has been a request to drop "FIXED" from the name, so these:
> 
> MAP_FIXED_NOCLOBBER
> MAP_FIXED_NOREPLACE
> MAP_FIXED_ADD
> MAP_FIXED_NEW
> 
> Could be:
> 
> MAP_NOCLOBBER
> MAP_NOREPLACE
> MAP_ADD
> MAP_NEW
> 
> and we still have the unloved, but acceptable:
> 
> MAP_REQUIRED
> 
> My vote is still for "NOREPLACE" or "NOCLOBBER" since it's very
> specific, though "NEW" is pretty clear too.

How about MAP_NOFORCE?

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-07 19:57                   ` Matthew Wilcox
@ 2017-12-08  8:33                     ` Michal Hocko
  -1 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-12-08  8:33 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kees Cook, Michael Ellerman, Cyril Hrubis, Linux API,
	Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley, Pavel Machek

On Thu 07-12-17 11:57:27, Matthew Wilcox wrote:
> On Thu, Dec 07, 2017 at 11:14:27AM -0800, Kees Cook wrote:
> > On Wed, Dec 6, 2017 at 9:46 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> > > Matthew Wilcox <willy@infradead.org> writes:
> > >> So, just like we currently say "exactly one of MAP_SHARED or MAP_PRIVATE",
> > >> we could add a new paragraph saying "at most one of MAP_FIXED or
> > >> MAP_REQUIRED" and "any of the following values".
> > >
> > > MAP_REQUIRED doesn't immediately grab me, but I don't actively dislike
> > > it either :)
> > >
> > > What about MAP_AT_ADDR ?
> > >
> > > It's short, and says what it does on the tin. The first argument to mmap
> > > is actually called "addr" too.
> > 
> > "FIXED" is supposed to do this too.
> > 
> > Pavel suggested:
> > 
> > MAP_ADD_FIXED
> > 
> > (which is different from "use fixed", and describes why it would fail:
> > can't add since it already exists.)
> > 
> > Perhaps "MAP_FIXED_NEW"?
> > 
> > There has been a request to drop "FIXED" from the name, so these:
> > 
> > MAP_FIXED_NOCLOBBER
> > MAP_FIXED_NOREPLACE
> > MAP_FIXED_ADD
> > MAP_FIXED_NEW
> > 
> > Could be:
> > 
> > MAP_NOCLOBBER
> > MAP_NOREPLACE
> > MAP_ADD
> > MAP_NEW
> > 
> > and we still have the unloved, but acceptable:
> > 
> > MAP_REQUIRED
> > 
> > My vote is still for "NOREPLACE" or "NOCLOBBER" since it's very
> > specific, though "NEW" is pretty clear too.
> 
> How about MAP_NOFORCE?

OK, this doesn't seem to lead anywhere. The more this is discussed
the more names we are getting. So you know what? I will resubmit and
keep my original name. If somebody really hates it then feel free to
nack the patch, push an alternative and gain consensus on it.

I will keep MAP_FIXED_SAFE because it is an alternative to MAP_FIXED so
having that in the name is _useful_ for everybody familiar with
MAP_FIXED already. And _SAFE suffix tells that the operation doesn't
cause any silent memory corruptions or other unexpected side effects.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-07 19:57                   ` Matthew Wilcox
@ 2017-12-08 11:08                     ` Michael Ellerman
  -1 siblings, 0 replies; 130+ messages in thread
From: Michael Ellerman @ 2017-12-08 11:08 UTC (permalink / raw)
  To: Matthew Wilcox, Kees Cook
  Cc: Cyril Hrubis, Michal Hocko, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley, Pavel Machek

Matthew Wilcox <willy@infradead.org> writes:

> On Thu, Dec 07, 2017 at 11:14:27AM -0800, Kees Cook wrote:
>> On Wed, Dec 6, 2017 at 9:46 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> > Matthew Wilcox <willy@infradead.org> writes:
>> >> So, just like we currently say "exactly one of MAP_SHARED or MAP_PRIVATE",
>> >> we could add a new paragraph saying "at most one of MAP_FIXED or
>> >> MAP_REQUIRED" and "any of the following values".
>> >
>> > MAP_REQUIRED doesn't immediately grab me, but I don't actively dislike
>> > it either :)
>> >
>> > What about MAP_AT_ADDR ?
>> >
>> > It's short, and says what it does on the tin. The first argument to mmap
>> > is actually called "addr" too.
>> 
>> "FIXED" is supposed to do this too.
>> 
>> Pavel suggested:
>> 
>> MAP_ADD_FIXED
>> 
>> (which is different from "use fixed", and describes why it would fail:
>> can't add since it already exists.)
>> 
>> Perhaps "MAP_FIXED_NEW"?
>> 
>> There has been a request to drop "FIXED" from the name, so these:
>> 
>> MAP_FIXED_NOCLOBBER
>> MAP_FIXED_NOREPLACE
>> MAP_FIXED_ADD
>> MAP_FIXED_NEW
>> 
>> Could be:
>> 
>> MAP_NOCLOBBER
>> MAP_NOREPLACE
>> MAP_ADD
>> MAP_NEW
>> 
>> and we still have the unloved, but acceptable:
>> 
>> MAP_REQUIRED
>> 
>> My vote is still for "NOREPLACE" or "NOCLOBBER" since it's very
>> specific, though "NEW" is pretty clear too.
>
> How about MAP_NOFORCE?

It doesn't tell me that addr is not a hint. That's a crucial detail.

Without MAP_FIXED mmap never "forces/replaces/clobbers", so why would I
need MAP_NOFORCE if I don't have MAP_FIXED?

So it needs something in there to indicate that the addr is not a hint,
that's the only thing that flag actually *does*.


If we had a time machine, the right set of flags would be:

  - MAP_FIXED:   don't treat addr as a hint, fail if addr is not free
  - MAP_REPLACE: replace an existing mapping (or force or clobber)

But the two were conflated for some reason in the current MAP_FIXED.

Given we can't go back and fix it, the closest we can get is to add a
variant of MAP_FIXED which subtracts the "REPLACE" semantic.

ie: MAP_FIXED_NOREPLACE

cheers
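
For readers following the naming argument, a minimal C sketch of the
semantic being named may help. It assumes the flag is spelled
MAP_FIXED_NOREPLACE and that a conflict is reported as EEXIST, which is
how the flag behaves in Linux 4.17 and later; the cover letter of this
series proposes ENOMEM instead. map_exactly_at() is a hypothetical helper,
and the 0x100000 fallback is the asm-generic value, which is not valid on
every architecture.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallback for older libc headers (assumption: asm-generic value;
 * a few architectures, e.g. alpha, use a different one). */
#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000
#endif

/*
 * Hypothetical helper: map `len` anonymous bytes at exactly `addr`,
 * or fail without touching whatever is already mapped there.
 * Returns `addr` on success, NULL with errno set on failure.
 */
void *map_exactly_at(void *addr, size_t len)
{
    void *got = mmap(addr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
                     -1, 0);
    if (got == MAP_FAILED)
        return NULL;    /* kernels that know the flag: EEXIST on overlap */

    if (got != addr) {
        /* A kernel that does not know the flag treated addr as a hint
         * and placed the mapping elsewhere: undo it, report a conflict. */
        munmap(got, len);
        errno = EEXIST;
        return NULL;
    }
    return got;
}
```

Because mmap() does not reject unknown flags, the got != addr check is
what keeps the helper non-destructive on kernels that predate the flag.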

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-08 11:08                     ` Michael Ellerman
  (?)
@ 2017-12-08 14:27                     ` Pavel Machek
  2017-12-08 20:31                         ` Cyril Hrubis
  2017-12-08 20:47                         ` Florian Weimer
  -1 siblings, 2 replies; 130+ messages in thread
From: Pavel Machek @ 2017-12-08 14:27 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Matthew Wilcox, Kees Cook, Cyril Hrubis, Michal Hocko, Linux API,
	Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley

On Fri 2017-12-08 22:08:07, Michael Ellerman wrote:
> Matthew Wilcox <willy@infradead.org> writes:
> 
> > On Thu, Dec 07, 2017 at 11:14:27AM -0800, Kees Cook wrote:
> >> On Wed, Dec 6, 2017 at 9:46 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> >> > Matthew Wilcox <willy@infradead.org> writes:
> >> >> So, just like we currently say "exactly one of MAP_SHARED or MAP_PRIVATE",
> >> >> we could add a new paragraph saying "at most one of MAP_FIXED or
> >> >> MAP_REQUIRED" and "any of the following values".
> >> >
> >> > MAP_REQUIRED doesn't immediately grab me, but I don't actively dislike
> >> > it either :)
> >> >
> >> > What about MAP_AT_ADDR ?
> >> >
> >> > It's short, and says what it does on the tin. The first argument to mmap
> >> > is actually called "addr" too.
> >> 
> >> "FIXED" is supposed to do this too.
> >> 
> >> Pavel suggested:
> >> 
> >> MAP_ADD_FIXED
> >> 
> >> (which is different from "use fixed", and describes why it would fail:
> >> can't add since it already exists.)
> >> 
> >> Perhaps "MAP_FIXED_NEW"?
> >> 
> >> There has been a request to drop "FIXED" from the name, so these:
> >> 
> >> MAP_FIXED_NOCLOBBER
> >> MAP_FIXED_NOREPLACE
> >> MAP_FIXED_ADD
> >> MAP_FIXED_NEW
> >> 
> >> Could be:
> >> 
> >> MAP_NOCLOBBER
> >> MAP_NOREPLACE
> >> MAP_ADD
> >> MAP_NEW
> >> 
> >> and we still have the unloved, but acceptable:
> >> 
> >> MAP_REQUIRED
> >> 
> >> My vote is still for "NOREPLACE" or "NOCLOBBER" since it's very
> >> specific, though "NEW" is pretty clear too.
> >
> > How about MAP_NOFORCE?
> 
> It doesn't tell me that addr is not a hint. That's a crucial detail.
> 
> Without MAP_FIXED mmap never "forces/replaces/clobbers", so why would I
> need MAP_NOFORCE if I don't have MAP_FIXED?
> 
> So it needs something in there to indicate that the addr is not a hint,
> that's the only thing that flag actually *does*.
> 
> 
> If we had a time machine, the right set of flags would be:
> 
>   - MAP_FIXED:   don't treat addr as a hint, fail if addr is not free
>   - MAP_REPLACE: replace an existing mapping (or force or clobber)

Actually, if we had a time machine... would we even provide
MAP_REPLACE functionality?

> But the two were conflated for some reason in the current MAP_FIXED.
> 
> Given we can't go back and fix it, the closest we can get is to add a
> variant of MAP_FIXED which subtracts the "REPLACE" semantic.
> 
> ie: MAP_FIXED_NOREPLACE

I like MAP_FIXED_NOREPLACE.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 130+ messages in thread

* RE: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-08 11:08                     ` Michael Ellerman
@ 2017-12-08 14:33                       ` David Laight
  -1 siblings, 0 replies; 130+ messages in thread
From: David Laight @ 2017-12-08 14:33 UTC (permalink / raw)
  To: 'Michael Ellerman', Matthew Wilcox, Kees Cook
  Cc: Cyril Hrubis, Michal Hocko, Linux API, Khalid Aziz,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Florian Weimer, John Hubbard,
	Abdul Haleem, Joel Stanley, Pavel Machek

From: Michael Ellerman
> Sent: 08 December 2017 11:08
...
> If we had a time machine, the right set of flags would be:
> 
>   - MAP_FIXED:   don't treat addr as a hint, fail if addr is not free
>   - MAP_REPLACE: replace an existing mapping (or force or clobber)
> 
> But the two were conflated for some reason in the current MAP_FIXED.

Possibly because the original use was loading overlays?

> Given we can't go back and fix it, the closest we can get is to add a
> variant of MAP_FIXED which subtracts the "REPLACE" semantic.
> 
> ie: MAP_FIXED_NOREPLACE

Much better than _SAFE - which is always a bad name because it is usually
only 'safe' for one specific use case.

	David

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-12-08 20:13                       ` Kees Cook
  0 siblings, 0 replies; 130+ messages in thread
From: Kees Cook @ 2017-12-08 20:13 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Matthew Wilcox, Michael Ellerman, Cyril Hrubis, Linux API,
	Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley, Pavel Machek

On Fri, Dec 8, 2017 at 12:33 AM, Michal Hocko <mhocko@kernel.org> wrote:
> OK, this doesn't seem to lead anywhere. The more this is discussed
> the more names we are getting. So you know what? I will resubmit and
> keep my original name. If somebody really hates it then feel free to
> nack the patch, push an alternative and gain consensus on it.
>
> I will keep MAP_FIXED_SAFE because it is an alternative to MAP_FIXED so
> having that in the name is _useful_ for everybody familiar with
> MAP_FIXED already. And _SAFE suffix tells that the operation doesn't
> cause any silent memory corruptions or other unexpected side effects.

Looks like consensus is MAP_FIXED_NOREPLACE.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-08 14:27                     ` Pavel Machek
  2017-12-08 20:31                         ` Cyril Hrubis
@ 2017-12-08 20:31                         ` Cyril Hrubis
  1 sibling, 0 replies; 130+ messages in thread
From: Cyril Hrubis @ 2017-12-08 20:31 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Michael Ellerman, Matthew Wilcox, Kees Cook, Michal Hocko,
	Linux API, Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley

Hi!
> > If we had a time machine, the right set of flags would be:
> > 
> >   - MAP_FIXED:   don't treat addr as a hint, fail if addr is not free
> >   - MAP_REPLACE: replace an existing mapping (or force or clobber)
> 
> Actually, if we had a time machine... would we even provide
> MAP_REPLACE functionality?

I did a bit of archeology, just because we can, since there is a git
repository of the unix history [1].

The first version of mmap() seems to appear in BSD-4_2-Snapshot. There was no
MAP_FIXED flag and the addr was expected to be used for the mapping. At least
that is what the manual seems to say; the kernel code was not written at that
point. This seems to correspond to the time when Berkeley students were busy
rewriting the UNIX kernel to take advantage of the VAX's virtual memory.

MAP_FIXED arrived in the manual shortly after; probably someone figured out
that passing an address to the call does not make much sense in most
cases.

The first actual implementation that supports MAP_FIXED appeared in the
BSD-4_3_Reno-Snapshot and already includes the replace behavior. The original
purpose seems to be replacing mappings in the implementation of the execve()
call.

So the answer would probably be yes, but it would probably have made sense to
keep it as a kernel-internal flag.

And BTW it looks like HPUX got it right before it was changed to follow POSIX.
There seems to be HPUX compatibility code in the early BSD codebase that
contains both HPUXMAP_FIXED and HPUXMAP_REPLACE.

[1] https://github.com/dspinellis/unix-history-repo

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-12-08 14:27                     ` Pavel Machek
  2017-12-08 20:31                         ` Cyril Hrubis
@ 2017-12-08 20:47                         ` Florian Weimer
  1 sibling, 0 replies; 130+ messages in thread
From: Florian Weimer @ 2017-12-08 20:47 UTC (permalink / raw)
  To: Pavel Machek, Michael Ellerman
  Cc: Matthew Wilcox, Kees Cook, Cyril Hrubis, Michal Hocko, Linux API,
	Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, John Hubbard,
	Abdul Haleem, Joel Stanley

On 12/08/2017 03:27 PM, Pavel Machek wrote:
> On Fri 2017-12-08 22:08:07, Michael Ellerman wrote:
>> If we had a time machine, the right set of flags would be:
>>
>>    - MAP_FIXED:   don't treat addr as a hint, fail if addr is not free
>>    - MAP_REPLACE: replace an existing mapping (or force or clobber)

> Actually, if we had a time machine... would we even provide
> MAP_REPLACE functionality?

Probably yes.  ELF loading needs to construct a complex set of mappings 
from a single file.  munmap (to create a hole) followed by mmap would be 
racy because another thread could have reused the gap in the meantime. 
The only alternative to overriding existing mappings would be mremap 
with MREMAP_FIXED, and that doesn't look like an improvement API-wise.

(The glibc dynamic linker uses an mmap call with an increased length to 
reserve address space and then loads additional segments with MAP_FIXED 
at the offsets specified in the program header.)

Thanks,
Florian
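
The reservation scheme described in the parenthesis above can be sketched
roughly as follows. This is an illustration only, not glibc's code:
reserve_region() and place_segment() are hypothetical names, anonymous
memory stands in for the file-backed ELF segments, and offsets are assumed
to be page-aligned.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Reserve `total` bytes of address space with no access rights.
 * The kernel picks the address, so nothing existing is clobbered. */
void *reserve_region(size_t total)
{
    void *base = mmap(NULL, total, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Place a writable "segment" at a page-aligned `offset` inside a region
 * obtained from reserve_region(). MAP_FIXED replaces the reserved pages
 * in a single call, so there is no window in which another thread could
 * take the range, unlike munmap() followed by mmap(). */
void *place_segment(void *base, size_t offset, size_t len)
{
    void *want = (char *)base + offset;
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return got == MAP_FAILED ? NULL : got;
}
```

Here the replace behavior of MAP_FIXED is wanted: the caller owns the
whole reservation, so overwriting part of it cannot corrupt an unrelated
mapping.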

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 0/2] mm: introduce MAP_FIXED_SAFE
@ 2017-12-08 20:57                         ` Matthew Wilcox
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Wilcox @ 2017-12-08 20:57 UTC (permalink / raw)
  To: Kees Cook
  Cc: Michal Hocko, Michael Ellerman, Cyril Hrubis, Linux API,
	Khalid Aziz, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, Linux-MM, LKML, linux-arch, Florian Weimer,
	John Hubbard, Abdul Haleem, Joel Stanley, Pavel Machek

On Fri, Dec 08, 2017 at 12:13:31PM -0800, Kees Cook wrote:
> On Fri, Dec 8, 2017 at 12:33 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > OK, this doesn't seem to lead anywhere. The more this is discussed
> > the more names we are getting. So you know what? I will resubmit and
> > keep my original name. If somebody really hates it then feel free to
> > nack the patch, push an alternative and gain consensus on it.
> >
> > I will keep MAP_FIXED_SAFE because it is an alternative to MAP_FIXED so
> > having that in the name is _useful_ for everybody familiar with
> > MAP_FIXED already. And _SAFE suffix tells that the operation doesn't
> > cause any silent memory corruptions or other unexpected side effects.
> 
> Looks like consensus is MAP_FIXED_NOREPLACE.

I'd rather MAP_AT_ADDR or MAP_REQUIRED, but I prefer FIXED_NOREPLACE to
FIXED_SAFE.

I just had a thought though -- MAP_STATIC?  ie don't move it.

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-11-29 17:45     ` Khalid Aziz
  (?)
@ 2018-05-29 22:21     ` Mike Kravetz
  2018-05-30  8:02       ` Michal Hocko
  -1 siblings, 1 reply; 130+ messages in thread
From: Mike Kravetz @ 2018-05-29 22:21 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, linux-mm, LKML, libhugetlbfs

Just a quick heads up.  I noticed a change in libhugetlbfs testing starting
with v4.17-rc1.

V4.16 libhugetlbfs test results
********** TEST SUMMARY
*                      2M            
*                      32-bit 64-bit 
*     Total testcases:   110    113   
*             Skipped:     0      0   
*                PASS:   105    111   
*                FAIL:     0      0   
*    Killed by signal:     4      1   
*   Bad configuration:     1      1   
*       Expected FAIL:     0      0   
*     Unexpected PASS:     0      0   
*    Test not present:     0      0   
* Strange test result:     0      0   
**********

v4.17-rc1 (and later) libhugetlbfs test results
********** TEST SUMMARY
*                      2M            
*                      32-bit 64-bit 
*     Total testcases:   110    113   
*             Skipped:     0      0   
*                PASS:    98    111   
*                FAIL:     0      0   
*    Killed by signal:    11      1   
*   Bad configuration:     1      1   
*       Expected FAIL:     0      0   
*     Unexpected PASS:     0      0   
*    Test not present:     0      0   
* Strange test result:     0      0   
**********

I traced the 7 additional (32-bit) killed by signal results to commit
4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map").

libhugetlbfs does unusual things and even provides custom linker scripts.
So, in hindsight this change in behavior does not seem too unexpected.  I
JUST discovered this while running libhugetlbfs tests for an unrelated
issue/change, and I will do some analysis to see exactly what is happening.

Also, I will take it upon myself to run the libhugetlbfs test suite on a
regular (at least weekly) basis.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-05-29 22:21     ` Mike Kravetz
@ 2018-05-30  8:02       ` Michal Hocko
  2018-05-30 15:00         ` Mike Kravetz
  0 siblings, 1 reply; 130+ messages in thread
From: Michal Hocko @ 2018-05-30  8:02 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Andrew Morton, linux-mm, LKML, libhugetlbfs

On Tue 29-05-18 15:21:14, Mike Kravetz wrote:
> Just a quick heads up.  I noticed a change in libhugetlbfs testing starting
> with v4.17-rc1.
> 
> [...]
> 
> I traced the 7 additional (32-bit) killed by signal results to this
> commit 4ed28639519c fs, elf: drop MAP_FIXED usage from elf_map.
> 
> libhugetlbfs does unusual things and even provides custom linker scripts.
> So, in hindsight this change in behavior does not seem too unexpected.  I
> JUST discovered this while running libhugetlbfs tests for an unrelated
> issue/change, and I will do some analysis to see exactly what is happening.

I am definitely interested in further details. Are there any messages
in the kernel log?

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-05-30  8:02       ` Michal Hocko
@ 2018-05-30 15:00         ` Mike Kravetz
  2018-05-30 16:25           ` Michal Hocko
  0 siblings, 1 reply; 130+ messages in thread
From: Mike Kravetz @ 2018-05-30 15:00 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, linux-mm, LKML, libhugetlbfs

On 05/30/2018 01:02 AM, Michal Hocko wrote:
> On Tue 29-05-18 15:21:14, Mike Kravetz wrote:
>> Just a quick heads up.  I noticed a change in libhugetlbfs testing starting
>> with v4.17-rc1.
>>
>> [...]
>>
>> I traced the 7 additional (32-bit) killed by signal results to this
>> commit 4ed28639519c fs, elf: drop MAP_FIXED usage from elf_map.
>>
>> libhugetlbfs does unusual things and even provides custom linker scripts.
>> So, in hindsight this change in behavior does not seem too unexpected.  I
>> JUST discovered this while running libhugetlbfs tests for an unrelated
>> issue/change, and I will do some analysis to see exactly what is happening.
> 
> I am definitely interested about further details. Are there any messages
> in the kernel log?
>

Yes, new messages associated with the failures.

[   47.570451] 1368 (xB.linkhuge_nof): Uhuuh, elf segment at 00000000a731413b requested but the memory is mapped already
[   47.606991] 1372 (xB.linkhuge_nof): Uhuuh, elf segment at 00000000a731413b requested but the memory is mapped already
[   47.641351] 1376 (xB.linkhuge_nof): Uhuuh, elf segment at 00000000a731413b requested but the memory is mapped already
[   47.726138] 1384 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
[   47.773169] 1393 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
[   47.817788] 1402 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
[   47.857338] 1406 (xB.linkshare): Uhuuh, elf segment at 0000000018430471 requested but the memory is mapped already
[   47.956355] 1427 (xB.linkshare): Uhuuh, elf segment at 0000000018430471 requested but the memory is mapped already
[   48.054894] 1448 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
[   48.071221] 1451 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already

Just curious, the addresses printed in those messages do not seem correct.
They should be page aligned.  Correct?  I think the %p conversion in the
pr_info() may be doing something wrong.

Also, the new failures in question are indeed being built with custom linker
scripts designed for use with binutils older than 2.16 (really old).  So, no
new users should encounter this issue (I think).  It appears that this may
only impact old applications built long ago with pre-2.16 binutils.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-05-30 15:00         ` Mike Kravetz
@ 2018-05-30 16:25           ` Michal Hocko
  2018-05-31  0:51             ` Mike Kravetz
  0 siblings, 1 reply; 130+ messages in thread
From: Michal Hocko @ 2018-05-30 16:25 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Andrew Morton, linux-mm, LKML, libhugetlbfs

On Wed 30-05-18 08:00:29, Mike Kravetz wrote:
> On 05/30/2018 01:02 AM, Michal Hocko wrote:
> > On Tue 29-05-18 15:21:14, Mike Kravetz wrote:
> >> Just a quick heads up.  I noticed a change in libhugetlbfs testing starting
> >> with v4.17-rc1.
> >>
> >> [...]
> >>
> >> I traced the 7 additional (32-bit) killed by signal results to this
> >> commit 4ed28639519c fs, elf: drop MAP_FIXED usage from elf_map.
> >>
> >> libhugetlbfs does unusual things and even provides custom linker scripts.
> >> So, in hindsight this change in behavior does not seem too unexpected.  I
> >> JUST discovered this while running libhugetlbfs tests for an unrelated
> >> issue/change, and I will do some analysis to see exactly what is happening.
> > 
> > I am definitely interested about further details. Are there any messages
> > in the kernel log?
> >
> 
> Yes, new messages associated with the failures.
> 
> [   47.570451] 1368 (xB.linkhuge_nof): Uhuuh, elf segment at 00000000a731413b requested but the memory is mapped already
> [   47.606991] 1372 (xB.linkhuge_nof): Uhuuh, elf segment at 00000000a731413b requested but the memory is mapped already
> [   47.641351] 1376 (xB.linkhuge_nof): Uhuuh, elf segment at 00000000a731413b requested but the memory is mapped already
> [   47.726138] 1384 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
> [   47.773169] 1393 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
> [   47.817788] 1402 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
> [   47.857338] 1406 (xB.linkshare): Uhuuh, elf segment at 0000000018430471 requested but the memory is mapped already
> [   47.956355] 1427 (xB.linkshare): Uhuuh, elf segment at 0000000018430471 requested but the memory is mapped already
> [   48.054894] 1448 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
> [   48.071221] 1451 (xB.linkhuge): Uhuuh, elf segment at 0000000090b9eaf6 requested but the memory is mapped already
> 
> Just curious, the addresses printed in those messages do not seem correct.
> They should be page aligned.  Correct?

I have no idea what the loader actually does here.

> I think the %p conversion in the pr_info() may be doing something wrong.

Well, we are using %px and that shouldn't do any tricks to the given
address.

> Also, the new failures in question are indeed being built with custom linker
> scripts designed for use with binutils older than 2.16 (really old).  So, no
> new users should encounter this issue (I think).  It appears that this may
> only impact old applications built long ago with pre-2.16 binutils.

Could you add some debugging data to dump the VMA which overlaps the
requested address and who requested that? E.g. hook into do_mmap and dump
all requests from the linker.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-05-30 16:25           ` Michal Hocko
@ 2018-05-31  0:51             ` Mike Kravetz
  2018-05-31  9:24               ` Michal Hocko
  0 siblings, 1 reply; 130+ messages in thread
From: Mike Kravetz @ 2018-05-31  0:51 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, linux-mm, LKML, libhugetlbfs

On 05/30/2018 09:25 AM, Michal Hocko wrote:
> Could you add some debugging data to dump the VMA which overlaps the
> requested address and who requested that? E.g. hook into do_mmap and dump
> all requests from the linker.

Here you go.  I added a bunch of stuff as I clearly do not understand
how elf loading works.  To me, the 'sections' parsed by the kernel code
do not seem to directly align with those produced by objdump.

[   38.899260] load_elf_binary: attempting to load file ./tests/obj32/xB.linkhuge_nofd
[   38.902340]     dumping section headers
[   38.903534]     index 0 p_offset = 34
[   38.904683]     index 0 p_vaddr  = 8048034
[   38.905680]     index 0 p_paddr  = 8048034
[   38.906442]     index 0 p_filesz = 120
[   38.907110]     index 0 p_memsz  = 120
[   38.907764] 
[   38.908019]     index 1 p_offset = 154
[   38.908521]     index 1 p_vaddr  = 8048154
[   38.909081]     index 1 p_paddr  = 8048154
[   38.909496]     index 1 p_filesz = 13
[   38.909855]     index 1 p_memsz  = 13
[   38.910453] 
[   38.910731]     index 2 p_offset = 0
[   38.911317]     index 2 p_vaddr  = 8048000
[   38.911997]     index 2 p_paddr  = 8048000
[   38.912590]     index 2 p_filesz = 169c
[   38.913141]     index 2 p_memsz  = 169c
[   38.913713] 
[   38.913987]     index 3 p_offset = 169c
[   38.914518]     index 3 p_vaddr  = 804969c
[   38.915101]     index 3 p_paddr  = 804969c
[   38.915718]     index 3 p_filesz = 1878
[   38.916266]     index 3 p_memsz  = 1878
[   38.916799] 
[   38.917032]     index 4 p_offset = 3000
[   38.917537]     index 4 p_vaddr  = 9000000
[   38.918119]     index 4 p_paddr  = 9000000
[   38.918709]     index 4 p_filesz = 0
[   38.919525]     index 4 p_memsz  = 10
[   38.919993] 
[   38.920275]     index 5 p_offset = 2d88
[   38.920791]     index 5 p_vaddr  = 804ad88
[   38.921307]     index 5 p_paddr  = 804ad88
[   38.921800]     index 5 p_filesz = 18c
[   38.922288]     index 5 p_memsz  = 18c
[   38.922739] 
[   38.922973]     index 6 p_offset = 168
[   38.923431]     index 6 p_vaddr  = 8048168
[   38.923946]     index 6 p_paddr  = 8048168
[   38.924457]     index 6 p_filesz = 44
[   38.924931]     index 6 p_memsz  = 44
[   38.925414] 
[   38.925593]     index 7 p_offset = 0
[   38.926031]     index 7 p_vaddr  = 0
[   38.926510]     index 7 p_paddr  = 0
[   38.926957]     index 7 p_filesz = 0
[   38.927443]     index 7 p_memsz  = 0
[   38.927879] 
[   38.928115]     index 8 p_offset = 169c
[   38.928594]     index 8 p_vaddr  = 804969c
[   38.929091]     index 8 p_paddr  = 804969c
[   38.929646]     index 8 p_filesz = 8c
[   38.930177]     index 8 p_memsz  = 8c
[   38.930710] 
[   38.931497] load_elf_binary: skipping index 0 p_vaddr = 8048034
[   38.932321] load_elf_binary: skipping index 1 p_vaddr = 8048154
[   38.933165] load_elf_binary: calling elf_map() index 2 bias 0 vaddr 8048000
[   38.934087]     map_addr ELF_PAGESTART(addr) 8048000 total_size 0 ELF_PAGEALIGN(size) 2000
[   38.935101]     eppnt->p_offset = 0
[   38.935561]     eppnt->p_vaddr  = 8048000
[   38.936073]     eppnt->p_paddr  = 8048000
[   38.936897]     eppnt->p_filesz = 169c
[   38.937493]     eppnt->p_memsz  = 169c
[   38.938042] load_elf_binary: calling elf_map() index 3 bias 0 vaddr 804969c
[   38.939002]     map_addr ELF_PAGESTART(addr) 8049000 total_size 0 ELF_PAGEALIGN(size) 2000
[   38.939959]     eppnt->p_offset = 169c
[   38.940410]     eppnt->p_vaddr  = 804969c
[   38.940897]     eppnt->p_paddr  = 804969c
[   38.941507]     eppnt->p_filesz = 1878
[   38.942019]     eppnt->p_memsz  = 1878
[   38.942516] 1123 (xB.linkhuge_nof): Uhuuh, elf segment at 8049000 requested but the memory is mapped already

It is pretty easy to see the mmap conflict.  I'm still trying to determine if
the executable file is 'valid'.  It did not throw an error previously as
MAP_FIXED unmapped the overlapping page.  However, this does not seem right.
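
To illustrate the difference in behavior described above, here is a
minimal userspace sketch (it is not the kernel's elf_map() path).  It
assumes Linux with a kernel that knows MAP_FIXED_NOREPLACE (v4.17 or
later).  The helper names map_page_at() and clobber_demo() are made up
for this example, and the fallback #define uses the asm-generic flag
value, which a few architectures override.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
/* asm-generic value; assumed here, a few architectures differ */
#define MAP_FIXED_NOREPLACE 0x100000
#endif

/*
 * Hypothetical helper: try to map one anonymous region exactly at addr.
 * Returns 0 if the mapping landed at addr, the errno value if mmap()
 * failed, and -1 if the kernel placed it elsewhere (flag taken as a hint).
 */
int map_page_at(void *addr, size_t len, int placement_flag)
{
	void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | placement_flag, -1, 0);

	if (p == MAP_FAILED)
		return errno;
	if (p != addr) {
		munmap(p, len);
		return -1;
	}
	return 0;
}

/*
 * Hypothetical demo: create a "victim" page, write a marker byte into it,
 * then request a new mapping on top of it with the given placement flag.
 * Returns the marker byte as read back afterwards (or -1 on setup failure)
 * and stores the outcome of the second mapping attempt in *result.
 */
int clobber_demo(int placement_flag, int *result)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *victim = mmap(NULL, len, PROT_READ | PROT_WRITE,
				     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int survivor;

	if (victim == MAP_FAILED)
		return -1;

	victim[0] = 0x5a;
	*result = map_page_at(victim, len, placement_flag);
	survivor = victim[0];	/* 0 if the page was silently replaced */
	munmap(victim, len);
	return survivor;
}
```

With MAP_FIXED the second mmap() succeeds and the marker byte reads back
as 0, i.e. the old page was silently replaced.  With MAP_FIXED_NOREPLACE
the call is expected to fail with EEXIST and the marker survives; on a
kernel that ignores the flag the address is only a hint, so the helper
reports -1 and the marker still survives.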
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-05-31  0:51             ` Mike Kravetz
@ 2018-05-31  9:24               ` Michal Hocko
  2018-05-31 21:46                 ` Mike Kravetz
  0 siblings, 1 reply; 130+ messages in thread
From: Michal Hocko @ 2018-05-31  9:24 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Andrew Morton, linux-mm, LKML, libhugetlbfs

On Wed 30-05-18 17:51:15, Mike Kravetz wrote:
[...]
> [   38.931497] load_elf_binary: skipping index 0 p_vaddr = 8048034
> [   38.932321] load_elf_binary: skipping index 1 p_vaddr = 8048154
> [   38.933165] load_elf_binary: calling elf_map() index 2 bias 0 vaddr 8048000
> [   38.934087]     map_addr ELF_PAGESTART(addr) 8048000 total_size 0 ELF_PAGEALIGN(size) 2000
> [   38.935101]     eppnt->p_offset = 0
> [   38.935561]     eppnt->p_vaddr  = 8048000
> [   38.936073]     eppnt->p_paddr  = 8048000
> [   38.936897]     eppnt->p_filesz = 169c
> [   38.937493]     eppnt->p_memsz  = 169c
> [   38.938042] load_elf_binary: calling elf_map() index 3 bias 0 vaddr 804969c
> [   38.939002]     map_addr ELF_PAGESTART(addr) 8049000 total_size 0 ELF_PAGEALIGN(size) 2000
> [   38.939959]     eppnt->p_offset = 169c
> [   38.940410]     eppnt->p_vaddr  = 804969c
> [   38.940897]     eppnt->p_paddr  = 804969c
> [   38.941507]     eppnt->p_filesz = 1878
> [   38.942019]     eppnt->p_memsz  = 1878
> [   38.942516] 1123 (xB.linkhuge_nof): Uhuuh, elf segment at 8049000 requested but the memory is mapped already
> 
> It is pretty easy to see the mmap conflict.  I'm still trying to determine if
> the executable file is 'valid'.  It did not throw an error previously as
> MAP_FIXED unmapped the overlapping page.  However, this does not seem right.

Yes, it looks suspicious to say the least. How come the original content
is not needed anymore? Maybe the first section should be 0x1000 rather
than 0x169c?

I am not an expert on the load linkers myself so I cannot really answer
this question. Please note that ppc had something similar. See
ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments").
Maybe we need to sprinkle more of those at other places?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-05-31  9:24               ` Michal Hocko
@ 2018-05-31 21:46                 ` Mike Kravetz
  0 siblings, 0 replies; 130+ messages in thread
From: Mike Kravetz @ 2018-05-31 21:46 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, linux-mm, LKML, libhugetlbfs

On 05/31/2018 02:24 AM, Michal Hocko wrote:
> I am not an expert on the load linkers myself so I cannot really answer
> this question. Please note that ppc had something similar. See
> ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments").
> Maybe we need to sprinkle more of those at other places?

I finally understand the issue, and it is NOT a problem with the kernel.
The issue is with old libhugetlbfs provided linker scripts, and yes,
starting with v4.17 people who run libhugetlbfs tests on x86 (at least)
will see additional failures.

I'll try to work this from the libhugetlbfs side.  In the unlikely event
that anyone knows about those linker scripts, assistance and/or feedback
would be appreciated.

Read on only if you want additional details about this failure.

The executable files which are now failing are created with the elf_i386.xB
linker script.  This script is provided for pre-2.17 versions of binutils.
binutils-2.17 came out in approximately 2007, and this script is disabled by default
if binutils-2.17 or later is used.  The only way to create executables with
this script today is by setting the HUGETLB_DEPRECATED_LINK env variable.
This is what libhugetlbfs tests do to simply continue testing the old scripts.

I previously was mistaken about which tests were causing the additional
failures.  The example I previously provided failed on v4.16 as well as
v4.17-rc kernels.  So, please ignore that information.

For an executable that runs on v4.16 and fails on v4.17-rc, here is a listing
of elf sections that the kernel will attempt to load.

Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
LOAD           0x000000 0x08048000 0x08048000 0x11c24 0x11c24 R E 0x1000
LOAD           0x011c24 0x08059c24 0x08059c24 0x10d04 0x10d04 RW  0x1000
LOAD           0x023000 0x09000000 0x09000000 0x00000 0x10048 RWE 0x1000

The first section is loaded without issue.  elf_map() will create a vma
based on the following:
map_addr ELF_PAGESTART(addr) 8048000 ELF_PAGEALIGN(size) 12000 
File_offset 0

We then attempt to load the following section with:
map_addr ELF_PAGESTART(addr) 8059000 ELF_PAGEALIGN(size) 12000
File_offset 11000

This results in,
Uhuuh, elf segment at 8059000 requested but the memory is mapped already

Note that the last page of the first section overlaps with the first page
of the second section.  Unlike the case in ad55eac74f20, the access
permissions on section 1 (RE) are different than section 2 (RW).  If we
allowed the previous MAP_FIXED behavior, we would be changing part of a
read only section to read write.  This is exactly what MAP_FIXED_NOREPLACE
was designed to prevent.
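
The page arithmetic above can be reproduced with a small standalone
sketch.  It assumes 4 KiB pages and 32-bit addresses, and it re-creates
the ELF_PAGESTART/ELF_PAGEOFFSET/ELF_PAGEALIGN rounding in userspace
rather than using the kernel's own macros; struct map_range,
segment_range() and ranges_overlap() are names invented for this
example.

```c
#include <stdint.h>

#define ELF_MIN_ALIGN      UINT32_C(0x1000)	/* assumed 4 KiB pages */
#define ELF_PAGESTART(v)   ((v) & ~(ELF_MIN_ALIGN - 1))
#define ELF_PAGEOFFSET(v)  ((v) & (ELF_MIN_ALIGN - 1))
#define ELF_PAGEALIGN(v)   (((v) + ELF_MIN_ALIGN - 1) & ~(ELF_MIN_ALIGN - 1))

/* Page-granular range that a LOAD entry occupies once it is mapped. */
struct map_range {
	uint32_t start;		/* page-aligned start address */
	uint32_t size;		/* page-aligned length */
	uint32_t file_off;	/* file offset backing 'start' */
};

/* Round one LOAD entry out to whole pages, as described in the text. */
struct map_range segment_range(uint32_t p_offset, uint32_t p_vaddr,
			       uint32_t p_filesz)
{
	struct map_range r;

	r.start = ELF_PAGESTART(p_vaddr);
	r.size = ELF_PAGEALIGN(p_filesz + ELF_PAGEOFFSET(p_vaddr));
	r.file_off = p_offset - ELF_PAGEOFFSET(p_vaddr);
	return r;
}

/* Nonzero when the two page ranges share at least one page. */
int ranges_overlap(struct map_range a, struct map_range b)
{
	return a.start < b.start + b.size && b.start < a.start + a.size;
}
```

Feeding in the first two LOAD entries from the listing gives
8048000/12000/offset 0 and 8059000/12000/offset 11000, and
ranges_overlap() reports a conflict: the first range ends at 805a000,
one page past the start of the second.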
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-04-18 11:43       ` Tetsuo Handa
@ 2018-04-18 11:55         ` Michal Hocko
  0 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2018-04-18 11:55 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: akpm, linux-mm, linux-kernel

On Wed 18-04-18 20:43:11, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > > Don't complain if IS_ERR_VALUE(),
> > 
> > this is simply wrong. We do want to warn on the failure because this is
> > when the actual clash happens. We should just warn on EEXIST.
> 
> >From 25442cdd31aa5cc8522923a0153a77dfd2ebc832 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Wed, 18 Apr 2018 20:38:15 +0900
> Subject: [PATCH] fs, elf: don't complain MAP_FIXED_NOREPLACE unless -EEXIST
>  error.
> 
> Commit 4ed28639519c7bad ("fs, elf: drop MAP_FIXED usage from elf_map") is
> printing spurious messages under memory pressure due to map_addr == -ENOMEM.
> 
>  9794 (a.out): Uhuuh, elf segment at 00007f2e34738000(fffffffffffffff4) requested but the memory is mapped already
>  14104 (a.out): Uhuuh, elf segment at 00007f34fd76c000(fffffffffffffff4) requested but the memory is mapped already
>  16843 (a.out): Uhuuh, elf segment at 00007f930ecc7000(fffffffffffffff4) requested but the memory is mapped already
> 
> Complain only if -EEXIST, and use %px for printing the address.

Yes this is better. But...

[...]
> -	if ((type & MAP_FIXED_NOREPLACE) && BAD_ADDR(map_addr))
> -		pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
> -				task_pid_nr(current), current->comm,
> -				(void *)addr);
> +	if ((type & MAP_FIXED_NOREPLACE) && map_addr == -EEXIST)

... please use PTR_ERR(map_addr) == -EEXIST

then you can add 
Acked-by: Michal Hocko <mhocko@suse.com>

> +		pr_info("%d (%s): Uhuuh, elf segment at %px requested but the memory is mapped already\n",
> +			task_pid_nr(current), current->comm, (void *)addr);
>  
>  	return(map_addr);
>  }
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-04-18 11:33     ` Michal Hocko
@ 2018-04-18 11:43       ` Tetsuo Handa
  2018-04-18 11:55         ` Michal Hocko
  0 siblings, 1 reply; 130+ messages in thread
From: Tetsuo Handa @ 2018-04-18 11:43 UTC (permalink / raw)
  To: mhocko; +Cc: akpm, linux-mm, linux-kernel

Michal Hocko wrote:
> > Don't complain if IS_ERR_VALUE(),
> 
> this is simply wrong. We do want to warn on the failure because this is
> when the actual clash happens. We should just warn on EEXIST.

>From 25442cdd31aa5cc8522923a0153a77dfd2ebc832 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 18 Apr 2018 20:38:15 +0900
Subject: [PATCH] fs, elf: don't complain MAP_FIXED_NOREPLACE unless -EEXIST
 error.

Commit 4ed28639519c7bad ("fs, elf: drop MAP_FIXED usage from elf_map") is
printing spurious messages under memory pressure due to map_addr == -ENOMEM.

 9794 (a.out): Uhuuh, elf segment at 00007f2e34738000(fffffffffffffff4) requested but the memory is mapped already
 14104 (a.out): Uhuuh, elf segment at 00007f34fd76c000(fffffffffffffff4) requested but the memory is mapped already
 16843 (a.out): Uhuuh, elf segment at 00007f930ecc7000(fffffffffffffff4) requested but the memory is mapped already

Complain only if -EEXIST, and use %px for printing the address.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrei Vagin <avagin@openvz.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 fs/binfmt_elf.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 41e0418..96615d9 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -377,10 +377,9 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
-	if ((type & MAP_FIXED_NOREPLACE) && BAD_ADDR(map_addr))
-		pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
-				task_pid_nr(current), current->comm,
-				(void *)addr);
+	if ((type & MAP_FIXED_NOREPLACE) && map_addr == -EEXIST)
+		pr_info("%d (%s): Uhuuh, elf segment at %px requested but the memory is mapped already\n",
+			task_pid_nr(current), current->comm, (void *)addr);
 
 	return(map_addr);
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2018-04-18 10:51     ` Tetsuo Handa
  (?)
@ 2018-04-18 11:33     ` Michal Hocko
  2018-04-18 11:43       ` Tetsuo Handa
  -1 siblings, 1 reply; 130+ messages in thread
From: Michal Hocko @ 2018-04-18 11:33 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Andrew Morton, linux-mm, LKML

On Wed 18-04-18 19:51:05, Tetsuo Handa wrote:
> >From 0ba20dcbbc40b703413c9a6907a77968b087811b Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Wed, 18 Apr 2018 15:31:48 +0900
> Subject: [PATCH] fs, elf: don't complain MAP_FIXED_NOREPLACE if mapping
>  failed.
> 
> Commit 4ed28639519c7bad ("fs, elf: drop MAP_FIXED usage from elf_map") is
> printing spurious messages under memory pressure due to map_addr == -ENOMEM.
> 
>  9794 (a.out): Uhuuh, elf segment at 00007f2e34738000(fffffffffffffff4) requested but the memory is mapped already
>  14104 (a.out): Uhuuh, elf segment at 00007f34fd76c000(fffffffffffffff4) requested but the memory is mapped already
>  16843 (a.out): Uhuuh, elf segment at 00007f930ecc7000(fffffffffffffff4) requested but the memory is mapped already

Hmm this is ENOMEM.

> Don't complain if IS_ERR_VALUE(),

this is simply wrong. We do want to warn on the failure because this is
when the actual clash happens. We should just warn on EEXIST.
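
For reference, the second number in the quoted log lines can be decoded
with a short userspace sketch.  It mirrors the idea behind the kernel's
IS_ERR_VALUE() for a 64-bit value but is not the kernel macro itself;
is_err_value(), decode_err() and should_warn() are hypothetical names,
and the errno numbers are those of Linux.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* same bound the kernel uses in linux/err.h */

/* Nonzero if x lies in the top MAX_ERRNO values, i.e. encodes -errno. */
int is_err_value(uint64_t x)
{
	return x >= (uint64_t)-MAX_ERRNO;
}

/* Negative errno encoded in x, or 0 if x is an ordinary address. */
int decode_err(uint64_t x)
{
	return is_err_value(x) ? (int)(int64_t)x : 0;
}

/* The rule suggested in this reply: warn only on an actual clash. */
int should_warn(uint64_t map_addr)
{
	return decode_err(map_addr) == -EEXIST;
}
```

decode_err(0xfffffffffffffff4) yields -12, which is -ENOMEM, so
should_warn() stays quiet for the reported messages and fires only for a
value encoding -EEXIST.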

> and use %px for printing the address.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Andrei Vagin <avagin@openvz.org>
> Cc: Khalid Aziz <khalid.aziz@oracle.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  fs/binfmt_elf.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 41e0418..559d35b 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -377,10 +377,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
>  	} else
>  		map_addr = vm_mmap(filep, addr, size, prot, type, off);
>  
> -	if ((type & MAP_FIXED_NOREPLACE) && BAD_ADDR(map_addr))
> -		pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
> -				task_pid_nr(current), current->comm,
> -				(void *)addr);
> +	if ((type & MAP_FIXED_NOREPLACE) && BAD_ADDR(map_addr) &&
> +	    !IS_ERR_VALUE(map_addr))
> +		pr_info("%d (%s): Uhuuh, elf segment at %px requested but the memory is mapped already\n",
> +			task_pid_nr(current), current->comm, (void *)addr);
>  
>  	return(map_addr);
>  }
> -- 
> 1.8.3.1
> 
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-12-13  9:25   ` Michal Hocko
@ 2018-04-18 10:51     ` Tetsuo Handa
  -1 siblings, 0 replies; 130+ messages in thread
From: Tetsuo Handa @ 2018-04-18 10:51 UTC (permalink / raw)
  To: Andrew Morton, Michal Hocko; +Cc: linux-mm, LKML

>From 0ba20dcbbc40b703413c9a6907a77968b087811b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Wed, 18 Apr 2018 15:31:48 +0900
Subject: [PATCH] fs, elf: don't complain MAP_FIXED_NOREPLACE if mapping
 failed.

Commit 4ed28639519c7bad ("fs, elf: drop MAP_FIXED usage from elf_map") is
printing spurious messages under memory pressure due to map_addr == -ENOMEM.

 9794 (a.out): Uhuuh, elf segment at 00007f2e34738000(fffffffffffffff4) requested but the memory is mapped already
 14104 (a.out): Uhuuh, elf segment at 00007f34fd76c000(fffffffffffffff4) requested but the memory is mapped already
 16843 (a.out): Uhuuh, elf segment at 00007f930ecc7000(fffffffffffffff4) requested but the memory is mapped already

Don't complain if IS_ERR_VALUE(), and use %px for printing the address.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrei Vagin <avagin@openvz.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 fs/binfmt_elf.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 41e0418..559d35b 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -377,10 +377,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
-	if ((type & MAP_FIXED_NOREPLACE) && BAD_ADDR(map_addr))
-		pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
-				task_pid_nr(current), current->comm,
-				(void *)addr);
+	if ((type & MAP_FIXED_NOREPLACE) && BAD_ADDR(map_addr) &&
+	    !IS_ERR_VALUE(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segment at %px requested but the memory is mapped already\n",
+			task_pid_nr(current), current->comm, (void *)addr);
 
 	return(map_addr);
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-12-13  9:25 [PATCH v2 " Michal Hocko
  2017-12-13  9:25   ` Michal Hocko
@ 2017-12-13  9:25   ` Michal Hocko
  0 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-12-13  9:25 UTC (permalink / raw)
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Florian Weimer, John Hubbard, Matthew Wilcox,
	Michal Hocko, Abdul Haleem, Joel Stanley, Kees Cook

From: Michal Hocko <mhocko@suse.com>

Both load_elf_interp and load_elf_binary rely on elf_map to map segments
at a controlled address and they use MAP_FIXED to enforce that. This is,
however, a dangerous thing prone to silent data corruption which can
even be exploitable. Let's take CVE-2017-1000253 as an example. At the
time (before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for
PIE")) ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far
away from the stack top on the 32-bit (legacy) memory layout (only 1GB
away). Therefore we could end up mapping over the existing stack with
some luck.

The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE has been moved
much further from the stack (eab09532d400 and later by c715b72c1ba4
("mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and
excessive stack consumption early during execve has been fully stopped
by da029c11e6b1 ("exec: Limit arg stack to at most 75% of _STK_LIM").
So we should be safe and any attack should be impractical. On the other
hand, this is too subtle an assumption: it can break quite easily and
such a breakage is hard to spot.

I believe that the MAP_FIXED usage in load_elf_binary (et al.) is still
fundamentally dangerous. Moreover, it shouldn't even be needed. We are
at an early stage of the process, so there shouldn't be any unrelated
mappings (except for the stack and the loader) and mmap for a given
address should succeed even without MAP_FIXED. Something is terribly
wrong if this is not the case and we should rather fail than silently
corrupt the underlying mapping.

Address this issue by changing MAP_FIXED to the newly added
MAP_FIXED_SAFE. mmap will then fail, without clobbering anything, if an
existing mapping clashes with the requested one.

Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
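To make the difference concrete, here is a minimal userspace sketch, not
part of the patch, that contrasts the clobbering MAP_FIXED with the
non-destructive variant. It uses the name MAP_FIXED_NOREPLACE, which is
how the flag appears in the later follow-up patch in this thread; the
fallback value 0x100000 is an assumption that holds on most, but not
all, architectures. sentinel_survives() is a hypothetical helper. On a
kernel that does not know the flag, the second mmap is expected to be
treated as a hint and land elsewhere, so the sentinel should survive
there as well.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallback for older libc headers (assumed value, arch dependent). */
#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000
#endif

/*
 * Hypothetical helper: map one anonymous page, write a sentinel byte,
 * then map again at the same address with fixed_flag added.
 * Returns 1 if the sentinel survived, 0 if the page was clobbered,
 * -1 if the initial mapping could not be set up.
 */
int sentinel_survives(int fixed_flag)
{
	long page = sysconf(_SC_PAGESIZE);
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	unsigned char *first, *second;
	int survived;

	assert(page > 0);

	first = mmap(NULL, page, PROT_READ | PROT_WRITE, flags, -1, 0);
	if (first == MAP_FAILED)
		return -1;
	first[0] = 0xaa;

	/*
	 * MAP_FIXED replaces the old page with a fresh zero-filled one.
	 * The non-destructive flag either fails or maps somewhere else,
	 * leaving the first mapping untouched in both cases.
	 */
	second = mmap(first, page, PROT_READ | PROT_WRITE,
		      flags | fixed_flag, -1, 0);
	survived = (first[0] == 0xaa);

	if (second != MAP_FAILED && second != first)
		munmap(second, page);
	munmap(first, page);
	return survived;
}
```

sentinel_survives(MAP_FIXED) returns 0 because the existing page is
silently replaced, while sentinel_survives(MAP_FIXED_NOREPLACE) returns
1, which is the behaviour elf_map wants.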
 arch/metag/kernel/process.c |  6 +++++-
 fs/binfmt_elf.c             | 12 ++++++++----
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/metag/kernel/process.c b/arch/metag/kernel/process.c
index 0909834c83a7..867c8d0a5fb4 100644
--- a/arch/metag/kernel/process.c
+++ b/arch/metag/kernel/process.c
@@ -399,7 +399,7 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
 	tcm_tag = tcm_lookup_tag(addr);
 
 	if (tcm_tag != TCM_INVALID_TAG)
-		type &= ~MAP_FIXED;
+		type &= ~(MAP_FIXED | MAP_FIXED_SAFE);
 
 	/*
 	* total_size is the size of the ELF (interpreter) image.
@@ -417,6 +417,10 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
+	if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
+				task_pid_nr(current), tsk->comm, (void*)addr);
+
 	if (!BAD_ADDR(map_addr) && tcm_tag != TCM_INVALID_TAG) {
 		struct tcm_allocation *tcm;
 		unsigned long tcm_addr;
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 73b01e474fdc..5916d45f64a7 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -372,6 +372,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
+	if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
+				task_pid_nr(current), current->comm, (void*)addr);
+
 	return(map_addr);
 }
 
@@ -569,7 +573,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
 				elf_prot |= PROT_EXEC;
 			vaddr = eppnt->p_vaddr;
 			if (interp_elf_ex->e_type == ET_EXEC || load_addr_set)
-				elf_type |= MAP_FIXED;
+				elf_type |= MAP_FIXED_SAFE;
 			else if (no_base && interp_elf_ex->e_type == ET_DYN)
 				load_addr = -vaddr;
 
@@ -930,7 +934,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 		 * the ET_DYN load_addr calculations, proceed normally.
 		 */
 		if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
-			elf_flags |= MAP_FIXED;
+			elf_flags |= MAP_FIXED_SAFE;
 		} else if (loc->elf_ex.e_type == ET_DYN) {
 			/*
 			 * This logic is run once for the first LOAD Program
@@ -966,7 +970,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 				load_bias = ELF_ET_DYN_BASE;
 				if (current->flags & PF_RANDOMIZE)
 					load_bias += arch_mmap_rnd();
-				elf_flags |= MAP_FIXED;
+				elf_flags |= MAP_FIXED_SAFE;
 			} else
 				load_bias = 0;
 
@@ -1223,7 +1227,7 @@ static int load_elf_library(struct file *file)
 			(eppnt->p_filesz +
 			 ELF_PAGEOFFSET(eppnt->p_vaddr)),
 			PROT_READ | PROT_WRITE | PROT_EXEC,
-			MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
+			MAP_FIXED_SAFE | MAP_PRIVATE | MAP_DENYWRITE,
 			(eppnt->p_offset -
 			 ELF_PAGEOFFSET(eppnt->p_vaddr)));
 	if (error != ELF_PAGESTART(eppnt->p_vaddr))
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-11-16 10:19   ` Michal Hocko
@ 2017-11-17  0:30     ` Kees Cook
  -1 siblings, 0 replies; 130+ messages in thread
From: Kees Cook @ 2017-11-17  0:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Linux API, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, Linux-MM, LKML,
	linux-arch, Michal Hocko, Abdul Haleem, Joel Stanley

On Thu, Nov 16, 2017 at 2:19 AM, Michal Hocko <mhocko@kernel.org> wrote:
> From: Michal Hocko <mhocko@suse.com>
>
> Both load_elf_interp and load_elf_binary rely on elf_map to map segments
> at a controlled address and they use MAP_FIXED to enforce that. This is,
> however, a dangerous thing prone to silent data corruption which can
> even be exploitable. Let's take CVE-2017-1000253 as an example. At the
> time (before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for
> PIE")) ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far
> away from the stack top on the 32-bit (legacy) memory layout (only 1GB
> away). Therefore we could end up mapping over the existing stack with
> some luck.
>
> The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
> fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE has been moved
> much further from the stack (eab09532d400 and later by c715b72c1ba4
> ("mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and
> excessive stack consumption early during execve has been fully stopped
> by da029c11e6b1 ("exec: Limit arg stack to at most 75% of _STK_LIM").
> So we should be safe and any attack should be impractical. On the other
> hand, this is too subtle an assumption: it can break quite easily and
> such a breakage is hard to spot.
>
> I believe that the MAP_FIXED usage in load_elf_binary (et al.) is still
> fundamentally dangerous. Moreover, it shouldn't even be needed. We are
> at an early stage of the process, so there shouldn't be any unrelated
> mappings (except for the stack and the loader) and mmap for a given
> address should succeed even without MAP_FIXED. Something is terribly
> wrong if this is not the case and we should rather fail than silently
> corrupt the underlying mapping.
>
> Address this issue by changing MAP_FIXED to the newly added
> MAP_FIXED_SAFE. mmap will then fail, without clobbering anything, if an
> existing mapping clashes with the requested one.
>
> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Kees Cook <keescook@chromium.org>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Once (if?) the name gets settled, this looks good to me:

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
>  arch/metag/kernel/process.c |  6 +++++-
>  fs/binfmt_elf.c             | 12 ++++++++----
>  2 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/arch/metag/kernel/process.c b/arch/metag/kernel/process.c
> index c4606ce743d2..2286140e54e0 100644
> --- a/arch/metag/kernel/process.c
> +++ b/arch/metag/kernel/process.c
> @@ -398,7 +398,7 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
>         tcm_tag = tcm_lookup_tag(addr);
>
>         if (tcm_tag != TCM_INVALID_TAG)
> -               type &= ~MAP_FIXED;
> +               type &= ~(MAP_FIXED | MAP_FIXED_SAFE);
>
>         /*
>         * total_size is the size of the ELF (interpreter) image.
> @@ -416,6 +416,10 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
>         } else
>                 map_addr = vm_mmap(filep, addr, size, prot, type, off);
>
> +       if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
> +               pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
> +                               task_pid_nr(current), tsk->comm, (void*)addr);
> +
>         if (!BAD_ADDR(map_addr) && tcm_tag != TCM_INVALID_TAG) {
>                 struct tcm_allocation *tcm;
>                 unsigned long tcm_addr;
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 6466153f2bf0..12b21942ccde 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -372,6 +372,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
>         } else
>                 map_addr = vm_mmap(filep, addr, size, prot, type, off);
>
> +       if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
> +               pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
> +                               task_pid_nr(current), current->comm, (void*)addr);
> +
>         return(map_addr);
>  }
>
> @@ -569,7 +573,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>                                 elf_prot |= PROT_EXEC;
>                         vaddr = eppnt->p_vaddr;
>                         if (interp_elf_ex->e_type == ET_EXEC || load_addr_set)
> -                               elf_type |= MAP_FIXED;
> +                               elf_type |= MAP_FIXED_SAFE;
>                         else if (no_base && interp_elf_ex->e_type == ET_DYN)
>                                 load_addr = -vaddr;
>
> @@ -929,7 +933,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
>                  * the ET_DYN load_addr calculations, proceed normally.
>                  */
>                 if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
> -                       elf_flags |= MAP_FIXED;
> +                       elf_flags |= MAP_FIXED_SAFE;
>                 } else if (loc->elf_ex.e_type == ET_DYN) {
>                         /*
>                          * This logic is run once for the first LOAD Program
> @@ -965,7 +969,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
>                                 load_bias = ELF_ET_DYN_BASE;
>                                 if (current->flags & PF_RANDOMIZE)
>                                         load_bias += arch_mmap_rnd();
> -                               elf_flags |= MAP_FIXED;
> +                               elf_flags |= MAP_FIXED_SAFE;
>                         } else
>                                 load_bias = 0;
>
> @@ -1220,7 +1224,7 @@ static int load_elf_library(struct file *file)
>                         (eppnt->p_filesz +
>                          ELF_PAGEOFFSET(eppnt->p_vaddr)),
>                         PROT_READ | PROT_WRITE | PROT_EXEC,
> -                       MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
> +                       MAP_FIXED_SAFE | MAP_PRIVATE | MAP_DENYWRITE,
>                         (eppnt->p_offset -
>                          ELF_PAGEOFFSET(eppnt->p_vaddr)));
>         if (error != ELF_PAGESTART(eppnt->p_vaddr))
> --
> 2.15.0
>



-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 130+ messages in thread

* [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-11-16 10:18 Michal Hocko
  2017-11-16 10:19   ` Michal Hocko
@ 2017-11-16 10:19   ` Michal Hocko
  0 siblings, 0 replies; 130+ messages in thread
From: Michal Hocko @ 2017-11-16 10:19 UTC (permalink / raw)
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Michal Hocko, Abdul Haleem, Joel Stanley, Kees Cook

From: Michal Hocko <mhocko@suse.com>

Both load_elf_interp and load_elf_binary rely on elf_map to map segments
on a controlled address and they use MAP_FIXED to enforce that. This is
however dangerous thing prone to silent data corruption which can be
even exploitable. Let's take CVE-2017-1000253 as an example. At the time
(before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"))
ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far away from
the stack top on 32b (legacy) memory layout (only 1GB away). Therefore
we could end up mapping over the existing stack with some luck.

The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE has been moved
much further from the stack (eab09532d400 and later by c715b72c1ba4
("mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and
excessive stack consumption early during execve has been fully stopped
by da029c11e6b1 ("exec: Limit arg stack to at most 75% of _STK_LIM").
So we should be safe and any attack should be impractical. On the other
hand, this is just too subtle an assumption, so it can break quite
easily and be hard to spot.

I believe that the MAP_FIXED usage in load_elf_binary (et al.) is still
fundamentally dangerous. Moreover, it shouldn't even be needed. We are
at the early process stage and so there shouldn't be any unrelated
mappings (except for the stack and loader), so mmap for a given address
should succeed even without MAP_FIXED. Something is terribly wrong if
this is not the case and we should rather fail than silently corrupt the
underlying mapping.

Address this issue by changing MAP_FIXED to the newly added
MAP_FIXED_SAFE. This means that mmap will fail, without clobbering
anything, if an existing mapping clashes with the requested one.

Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/metag/kernel/process.c |  6 +++++-
 fs/binfmt_elf.c             | 12 ++++++++----
 2 files changed, 13 insertions(+), 5 deletions(-)
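
For readers who want to see the clobbering behaviour described above
from userspace, here is a minimal sketch. It is not part of the patch.
It assumes Linux and anonymous private mappings, and the helper names
map_marked_page() and existing_mapping_survives() are made up for
illustration. It maps one page, stamps it with a marker byte, then asks
for a second mapping at the same address and reports whether the first
page kept its contents. MAP_FIXED_SAFE itself is not exported by
userspace headers; a flag with these semantics was later merged into
mainline under the name MAP_FIXED_NOREPLACE, which can be passed as
extra_flags where the installed headers define it.

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>

#define MARKER 0x5a

/* Map one anonymous private region and stamp its first byte with MARKER. */
static unsigned char *map_marked_page(size_t len)
{
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p != MAP_FAILED)
		p[0] = MARKER;
	return p;
}

/*
 * Request a second anonymous mapping on top of an existing one, adding
 * extra_flags to the request (0 means "plain address hint").
 * Returns 1 if the existing mapping kept its contents, 0 if it was
 * replaced, and -1 if the initial mapping could not be set up.
 */
int existing_mapping_survives(int extra_flags)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *old = map_marked_page(len);
	unsigned char *second;
	int survived;

	if (old == MAP_FAILED)
		return -1;

	second = mmap(old, len, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);

	/* A replaced anonymous page reads back as zero, not MARKER. */
	survived = (old[0] == MARKER);

	/* Drop the second mapping only if it landed somewhere else. */
	if (second != MAP_FAILED && second != old)
		munmap(second, len);
	munmap(old, len);
	return survived;
}
```

With extra_flags == 0 the address is only a hint, so the kernel picks a
different free range and the old page survives. With MAP_FIXED the old
page is silently replaced, which is the failure mode the patch wants to
rule out in the ELF loader.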

diff --git a/arch/metag/kernel/process.c b/arch/metag/kernel/process.c
index c4606ce743d2..2286140e54e0 100644
--- a/arch/metag/kernel/process.c
+++ b/arch/metag/kernel/process.c
@@ -398,7 +398,7 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
 	tcm_tag = tcm_lookup_tag(addr);
 
 	if (tcm_tag != TCM_INVALID_TAG)
-		type &= ~MAP_FIXED;
+		type &= ~(MAP_FIXED | MAP_FIXED_SAFE);
 
 	/*
 	* total_size is the size of the ELF (interpreter) image.
@@ -416,6 +416,10 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
+	if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
+				task_pid_nr(current), tsk->comm, (void*)addr);
+
 	if (!BAD_ADDR(map_addr) && tcm_tag != TCM_INVALID_TAG) {
 		struct tcm_allocation *tcm;
 		unsigned long tcm_addr;
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 6466153f2bf0..12b21942ccde 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -372,6 +372,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
+	if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
+				task_pid_nr(current), current->comm, (void*)addr);
+
 	return(map_addr);
 }
 
@@ -569,7 +573,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
 				elf_prot |= PROT_EXEC;
 			vaddr = eppnt->p_vaddr;
 			if (interp_elf_ex->e_type == ET_EXEC || load_addr_set)
-				elf_type |= MAP_FIXED;
+				elf_type |= MAP_FIXED_SAFE;
 			else if (no_base && interp_elf_ex->e_type == ET_DYN)
 				load_addr = -vaddr;
 
@@ -929,7 +933,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 		 * the ET_DYN load_addr calculations, proceed normally.
 		 */
 		if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
-			elf_flags |= MAP_FIXED;
+			elf_flags |= MAP_FIXED_SAFE;
 		} else if (loc->elf_ex.e_type == ET_DYN) {
 			/*
 			 * This logic is run once for the first LOAD Program
@@ -965,7 +969,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 				load_bias = ELF_ET_DYN_BASE;
 				if (current->flags & PF_RANDOMIZE)
 					load_bias += arch_mmap_rnd();
-				elf_flags |= MAP_FIXED;
+				elf_flags |= MAP_FIXED_SAFE;
 			} else
 				load_bias = 0;
 
@@ -1220,7 +1224,7 @@ static int load_elf_library(struct file *file)
 			(eppnt->p_filesz +
 			 ELF_PAGEOFFSET(eppnt->p_vaddr)),
 			PROT_READ | PROT_WRITE | PROT_EXEC,
-			MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
+			MAP_FIXED_SAFE | MAP_PRIVATE | MAP_DENYWRITE,
 			(eppnt->p_offset -
 			 ELF_PAGEOFFSET(eppnt->p_vaddr)));
 	if (error != ELF_PAGESTART(eppnt->p_vaddr))
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 130+ messages in thread

end of thread, other threads:[~2018-05-31 21:46 UTC | newest]

Thread overview: 130+ messages
2017-11-29 14:42 [PATCH 0/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
2017-11-29 14:42 ` [PATCH 1/2] " Michal Hocko
2017-12-06  5:15   ` Michael Ellerman
2017-12-06  9:27     ` Michal Hocko
2017-12-06 10:02       ` Michal Hocko
2017-12-07 12:07   ` Pavel Machek
2017-11-29 14:42 ` [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map Michal Hocko
2017-11-29 17:45   ` Khalid Aziz
2018-05-29 22:21     ` Mike Kravetz
2018-05-30  8:02       ` Michal Hocko
2018-05-30 15:00         ` Mike Kravetz
2018-05-30 16:25           ` Michal Hocko
2018-05-31  0:51             ` Mike Kravetz
2018-05-31  9:24               ` Michal Hocko
2018-05-31 21:46                 ` Mike Kravetz
2017-11-29 14:45 ` [PATCH] mmap.2: document new MAP_FIXED_SAFE flag Michal Hocko
2017-11-30  3:16   ` John Hubbard
2017-11-30  8:23     ` Michal Hocko
2017-11-30  8:24   ` [PATCH v2] " Michal Hocko
2017-11-30 18:31     ` John Hubbard
2017-11-30 18:39       ` Michal Hocko
2017-11-29 15:13 ` [PATCH 0/2] mm: introduce MAP_FIXED_SAFE Rasmus Villemoes
2017-11-29 15:50   ` Michal Hocko
2017-11-29 22:15   ` Kees Cook
2017-11-29 22:12 ` Kees Cook
2017-11-29 22:25 ` Kees Cook
2017-11-30  6:58   ` Michal Hocko
2017-12-01 15:26     ` Cyril Hrubis
2017-12-06  4:51       ` Michael Ellerman
2017-12-06  4:54         ` Matthew Wilcox
2017-12-06  7:03           ` Matthew Wilcox
2017-12-06  7:33             ` John Hubbard
2017-12-06  7:35               ` Florian Weimer
2017-12-06  8:06                 ` John Hubbard
2017-12-06  8:54                   ` Florian Weimer
2017-12-07  5:46             ` Michael Ellerman
2017-12-07 19:14               ` Kees Cook
2017-12-07 19:57                 ` Matthew Wilcox
2017-12-08  8:33                   ` Michal Hocko
2017-12-08 20:13                     ` Kees Cook
2017-12-08 20:57                       ` Matthew Wilcox
2017-12-08 11:08                   ` Michael Ellerman
2017-12-08 14:27                     ` Pavel Machek
2017-12-08 20:31                       ` Cyril Hrubis
2017-12-08 20:47                       ` Florian Weimer
2017-12-08 14:33                     ` David Laight
2017-12-06  4:50     ` Michael Ellerman
2017-12-06  7:33       ` Rasmus Villemoes
2017-12-06  9:08         ` Michal Hocko
2017-12-07  0:19           ` Kees Cook
2017-12-07  1:08             ` John Hubbard
  -- strict thread matches above, loose matches on Subject: below --
2017-12-13  9:25 [PATCH v2 " Michal Hocko
2017-12-13  9:25 ` [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map Michal Hocko
2018-04-18 10:51   ` Tetsuo Handa
2018-04-18 11:33     ` Michal Hocko
2018-04-18 11:43       ` Tetsuo Handa
2018-04-18 11:55         ` Michal Hocko
2017-11-16 10:18 Michal Hocko
2017-11-16 10:19 ` [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map Michal Hocko
2017-11-17  0:30   ` Kees Cook
