linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* (unknown), 
@ 2017-11-16 10:18 Michal Hocko
  2017-11-16 10:18 ` [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Michal Hocko @ 2017-11-16 10:18 UTC (permalink / raw)
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Abdul Haleem, Joel Stanley, Kees Cook, Michal Hocko

Hi,
this has started as a follow up discussion [1][2] resulting in the
runtime failure caused by hardening patch [3] which removes MAP_FIXED
from the elf loader because MAP_FIXED is inherently dangerous as it
might silently clobber and existing underlying mapping (e.g. stack). The
reason for the failure is that some architectures enforce an alignment
for the given address hint without MAP_FIXED used (e.g. for shared or
file backed mappings).

One way around this would be excluding those archs which do alignment
tricks from the hardening [4]. The patch is really trivial but it has
been objected, rightfully so, that this screams for a more generic
solution. We basically want a non-destructive MAP_FIXED.

The first patch introduced MAP_FIXED_SAFE which enforces the given
address but unlike MAP_FIXED it fails with ENOMEM if the given range
conflicts with an existing one. The flag is introduced as a completely
new flag rather than a MAP_FIXED extension because of the backward
compatibility. We really want a never-clobber semantic even on older
kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
flags evaluation because we do not EINVAL on unknown flags. On those
kernels we would simply use the traditional hint based semantic so the
caller can still get a different address (which sucks) but at least not
silently corrupt an existing mapping. I do not see a good way around
that. Except we won't export expose the new semantic to the userspace at
all. It seems there are users who would like to have something like that
[5], though. Atomic address range probing in the multithreaded programs
sounds like an interesting thing to me as well, although I do not have
any specific usecase in mind.

The second patch simply replaces MAP_FIXED use in elf loader by
MAP_FIXED_SAFE. I believe other places which rely on MAP_FIXED should
follow. Actually real MAP_FIXED usages should be docummented properly
and they should be more of an exception.

Does anybody see any fundamental reasons why this is a wrong approach?

Diffstat says
 arch/alpha/include/uapi/asm/mman.h   |  2 ++
 arch/metag/kernel/process.c          |  6 +++++-
 arch/mips/include/uapi/asm/mman.h    |  2 ++
 arch/parisc/include/uapi/asm/mman.h  |  2 ++
 arch/powerpc/include/uapi/asm/mman.h |  1 +
 arch/sparc/include/uapi/asm/mman.h   |  1 +
 arch/tile/include/uapi/asm/mman.h    |  1 +
 arch/xtensa/include/uapi/asm/mman.h  |  2 ++
 fs/binfmt_elf.c                      | 12 ++++++++----
 include/uapi/asm-generic/mman.h      |  1 +
 mm/mmap.c                            | 11 +++++++++++
 11 files changed, 36 insertions(+), 5 deletions(-)

[1] http://lkml.kernel.org/r/20171107162217.382cd754@canb.auug.org.au
[2] http://lkml.kernel.org/r/1510048229.12079.7.camel@abdul.in.ibm.com
[3] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko@kernel.org
[4] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y@dhcp22.suse.cz
[5] http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-16 10:18 (unknown), Michal Hocko
@ 2017-11-16 10:18 ` Michal Hocko
  2017-11-17  0:27   ` Kees Cook
                     ` (2 more replies)
  2017-11-16 10:19 ` [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map Michal Hocko
       [not found] ` <20171116101900.13621-1-mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2 siblings, 3 replies; 22+ messages in thread
From: Michal Hocko @ 2017-11-16 10:18 UTC (permalink / raw)
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

MAP_FIXED is used quite often to enforce mapping at the particular
range. The main problem of this flag is, however, that it is inherently
dangerous because it unmaps existing mappings covered by the requested
range. This can cause silent memory corruptions. Some of them even with
serious security implications. While the current semantic might be
really desiderable in many cases there are others which would want to
enforce the given range but rather see a failure than a silent memory
corruption on a clashing range. Please note that there is no guarantee
that a given range is obeyed by the mmap even when it is free - e.g.
arch specific code is allowed to apply an alignment.

Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
It has the same semantic as MAP_FIXED wrt. the given address request
with a single exception that it fails with ENOMEM if the requested
address is already covered by an existing mapping. We still do rely on
get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
check for a conflicting vma after it returns.

[set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/alpha/include/uapi/asm/mman.h   |  2 ++
 arch/mips/include/uapi/asm/mman.h    |  2 ++
 arch/parisc/include/uapi/asm/mman.h  |  2 ++
 arch/powerpc/include/uapi/asm/mman.h |  1 +
 arch/sparc/include/uapi/asm/mman.h   |  1 +
 arch/tile/include/uapi/asm/mman.h    |  1 +
 arch/xtensa/include/uapi/asm/mman.h  |  2 ++
 include/uapi/asm-generic/mman.h      |  1 +
 mm/mmap.c                            | 11 +++++++++++
 9 files changed, 23 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 3b26cc62dadb..0e5724e4b4ad 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -31,6 +31,8 @@
 #define MAP_STACK	0x80000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x100000	/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x200000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 #define MS_ASYNC	1		/* sync memory asynchronously */
 #define MS_SYNC		2		/* synchronous memory sync */
 #define MS_INVALIDATE	4		/* invalidate the caches */
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index da3216007fe0..fc5e61ef9fd4 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -49,6 +49,8 @@
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 /*
  * Flags for msync
  */
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index cc9ba1d34779..c926487472fa 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -25,6 +25,8 @@
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
 
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 #define MS_SYNC		1		/* synchronous memory sync */
 #define MS_ASYNC	2		/* sync memory asynchronously */
 #define MS_INVALIDATE	4		/* invalidate the caches */
diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
index 03c06ba7464f..d97342ca25b1 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -28,5 +28,6 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x800000	/* MAP_FIXED which doesn't unmap underlying mapping */
 
 #endif /* _UAPI_ASM_POWERPC_MMAN_H */
diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h
index 9765896ecb2c..7b00477a7f9a 100644
--- a/arch/sparc/include/uapi/asm/mman.h
+++ b/arch/sparc/include/uapi/asm/mman.h
@@ -23,6 +23,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 
 #endif /* _UAPI__SPARC_MMAN_H__ */
diff --git a/arch/tile/include/uapi/asm/mman.h b/arch/tile/include/uapi/asm/mman.h
index 63ee13faf17d..d5d58d2dc95e 100644
--- a/arch/tile/include/uapi/asm/mman.h
+++ b/arch/tile/include/uapi/asm/mman.h
@@ -29,6 +29,7 @@
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
 #define MAP_HUGETLB	0x4000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x8000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 
 /*
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index b15b278aa314..d665bd8b7cbd 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -55,6 +55,7 @@
 #define MAP_NONBLOCK	0x20000		/* do not block on IO */
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
 #ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
 # define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
 					 * uninitialized */
@@ -62,6 +63,7 @@
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
 
+
 /*
  * Flags for msync
  */
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 7162cd4cca73..64c46047fbd3 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -12,6 +12,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
 
 /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..89af0b5839a5 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1342,6 +1342,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		if (!(file && path_noexec(&file->f_path)))
 			prot |= PROT_EXEC;
 
+	/* force arch specific MAP_FIXED handling in get_unmapped_area */
+	if (flags & MAP_FIXED_SAFE)
+		flags |= MAP_FIXED;
+
 	if (!(flags & MAP_FIXED))
 		addr = round_hint_to_min(addr);
 
@@ -1365,6 +1369,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (offset_in_page(addr))
 		return addr;
 
+	if (flags & MAP_FIXED_SAFE) {
+		struct vm_area_struct *vma = find_vma(mm, addr);
+
+		if (vma && vma->vm_start <= addr)
+			return -ENOMEM;
+	}
+
 	if (prot == PROT_EXEC) {
 		pkey = execute_only_pkey(mm);
 		if (pkey < 0)
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-11-16 10:18 (unknown), Michal Hocko
  2017-11-16 10:18 ` [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
@ 2017-11-16 10:19 ` Michal Hocko
  2017-11-17  0:30   ` Kees Cook
       [not found] ` <20171116101900.13621-1-mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-11-16 10:19 UTC (permalink / raw)
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Michal Hocko, Abdul Haleem, Joel Stanley, Kees Cook

From: Michal Hocko <mhocko@suse.com>

Both load_elf_interp and load_elf_binary rely on elf_map to map segments
on a controlled address and they use MAP_FIXED to enforce that. This is
however dangerous thing prone to silent data corruption which can be
even exploitable. Let's take CVE-2017-1000253 as an example. At the time
(before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"))
ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far away from
the stack top on 32b (legacy) memory layout (only 1GB away). Therefore
we could end up mapping over the existing stack with some luck.

The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE moved moved much
further from the stack (eab09532d400 and later by c715b72c1ba4 ("mm:
revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and excessive
stack consumption early during execve fully stopped by da029c11e6b1
("exec: Limit arg stack to at most 75% of _STK_LIM"). So we should be
safe and any attack should be impractical. On the other hand this is
just too subtle assumption so it can break quite easily and hard to
spot.

I believe that the MAP_FIXED usage in load_elf_binary (et. al) is still
fundamentally dangerous. Moreover it shouldn't be even needed. We are
at the early process stage and so there shouldn't be unrelated mappings
(except for stack and loader) existing so mmap for a given address
should succeed even without MAP_FIXED. Something is terribly wrong if
this is not the case and we should rather fail than silently corrupt the
underlying mapping.

Address this issue by changing MAP_FIXED to the newly added
MAP_FIXED_SAFE. This will mean that mmap will fail if there is an
existing mapping clashing with the requested one without clobbering it.

Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/metag/kernel/process.c |  6 +++++-
 fs/binfmt_elf.c             | 12 ++++++++----
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/metag/kernel/process.c b/arch/metag/kernel/process.c
index c4606ce743d2..2286140e54e0 100644
--- a/arch/metag/kernel/process.c
+++ b/arch/metag/kernel/process.c
@@ -398,7 +398,7 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
 	tcm_tag = tcm_lookup_tag(addr);
 
 	if (tcm_tag != TCM_INVALID_TAG)
-		type &= ~MAP_FIXED;
+		type &= ~(MAP_FIXED | MAP_FIXED_SAFE);
 
 	/*
 	* total_size is the size of the ELF (interpreter) image.
@@ -416,6 +416,10 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
+	if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
+				task_pid_nr(current), tsk->comm, (void*)addr);
+
 	if (!BAD_ADDR(map_addr) && tcm_tag != TCM_INVALID_TAG) {
 		struct tcm_allocation *tcm;
 		unsigned long tcm_addr;
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 6466153f2bf0..12b21942ccde 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -372,6 +372,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
 	} else
 		map_addr = vm_mmap(filep, addr, size, prot, type, off);
 
+	if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+		pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
+				task_pid_nr(current), current->comm, (void*)addr);
+
 	return(map_addr);
 }
 
@@ -569,7 +573,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
 				elf_prot |= PROT_EXEC;
 			vaddr = eppnt->p_vaddr;
 			if (interp_elf_ex->e_type == ET_EXEC || load_addr_set)
-				elf_type |= MAP_FIXED;
+				elf_type |= MAP_FIXED_SAFE;
 			else if (no_base && interp_elf_ex->e_type == ET_DYN)
 				load_addr = -vaddr;
 
@@ -929,7 +933,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 		 * the ET_DYN load_addr calculations, proceed normally.
 		 */
 		if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
-			elf_flags |= MAP_FIXED;
+			elf_flags |= MAP_FIXED_SAFE;
 		} else if (loc->elf_ex.e_type == ET_DYN) {
 			/*
 			 * This logic is run once for the first LOAD Program
@@ -965,7 +969,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 				load_bias = ELF_ET_DYN_BASE;
 				if (current->flags & PF_RANDOMIZE)
 					load_bias += arch_mmap_rnd();
-				elf_flags |= MAP_FIXED;
+				elf_flags |= MAP_FIXED_SAFE;
 			} else
 				load_bias = 0;
 
@@ -1220,7 +1224,7 @@ static int load_elf_library(struct file *file)
 			(eppnt->p_filesz +
 			 ELF_PAGEOFFSET(eppnt->p_vaddr)),
 			PROT_READ | PROT_WRITE | PROT_EXEC,
-			MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
+			MAP_FIXED_SAFE | MAP_PRIVATE | MAP_DENYWRITE,
 			(eppnt->p_offset -
 			 ELF_PAGEOFFSET(eppnt->p_vaddr)));
 	if (error != ELF_PAGESTART(eppnt->p_vaddr))
-- 
2.15.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE
       [not found] ` <20171116101900.13621-1-mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2017-11-16 12:14   ` Michal Hocko
       [not found]     ` <20171116121438.6vegs4wiahod3byl-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  2017-11-24  8:54     ` Michal Hocko
  0 siblings, 2 replies; 22+ messages in thread
From: Michal Hocko @ 2017-11-16 12:14 UTC (permalink / raw)
  To: linux-api-u79uwXL29TY76Z2rM5mHXA
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML,
	linux-arch-u79uwXL29TY76Z2rM5mHXA, Abdul Haleem, Joel Stanley,
	Kees Cook

[Ups, managed to screw the subject - fix it]

On Thu 16-11-17 11:18:58, Michal Hocko wrote:
> Hi,
> this has started as a follow up discussion [1][2] resulting in the
> runtime failure caused by hardening patch [3] which removes MAP_FIXED
> from the elf loader because MAP_FIXED is inherently dangerous as it
> might silently clobber and existing underlying mapping (e.g. stack). The
> reason for the failure is that some architectures enforce an alignment
> for the given address hint without MAP_FIXED used (e.g. for shared or
> file backed mappings).
> 
> One way around this would be excluding those archs which do alignment
> tricks from the hardening [4]. The patch is really trivial but it has
> been objected, rightfully so, that this screams for a more generic
> solution. We basically want a non-destructive MAP_FIXED.
> 
> The first patch introduced MAP_FIXED_SAFE which enforces the given
> address but unlike MAP_FIXED it fails with ENOMEM if the given range
> conflicts with an existing one. The flag is introduced as a completely
> new flag rather than a MAP_FIXED extension because of the backward
> compatibility. We really want a never-clobber semantic even on older
> kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
> flags evaluation because we do not EINVAL on unknown flags. On those
> kernels we would simply use the traditional hint based semantic so the
> caller can still get a different address (which sucks) but at least not
> silently corrupt an existing mapping. I do not see a good way around
> that. Except we won't export expose the new semantic to the userspace at
> all. It seems there are users who would like to have something like that
> [5], though. Atomic address range probing in the multithreaded programs
> sounds like an interesting thing to me as well, although I do not have
> any specific usecase in mind.
> 
> The second patch simply replaces MAP_FIXED use in elf loader by
> MAP_FIXED_SAFE. I believe other places which rely on MAP_FIXED should
> follow. Actually real MAP_FIXED usages should be docummented properly
> and they should be more of an exception.
> 
> Does anybody see any fundamental reasons why this is a wrong approach?
> 
> Diffstat says
>  arch/alpha/include/uapi/asm/mman.h   |  2 ++
>  arch/metag/kernel/process.c          |  6 +++++-
>  arch/mips/include/uapi/asm/mman.h    |  2 ++
>  arch/parisc/include/uapi/asm/mman.h  |  2 ++
>  arch/powerpc/include/uapi/asm/mman.h |  1 +
>  arch/sparc/include/uapi/asm/mman.h   |  1 +
>  arch/tile/include/uapi/asm/mman.h    |  1 +
>  arch/xtensa/include/uapi/asm/mman.h  |  2 ++
>  fs/binfmt_elf.c                      | 12 ++++++++----
>  include/uapi/asm-generic/mman.h      |  1 +
>  mm/mmap.c                            | 11 +++++++++++
>  11 files changed, 36 insertions(+), 5 deletions(-)
> 
> [1] http://lkml.kernel.org/r/20171107162217.382cd754-3FnU+UHB4dNDw9hX6IcOSA@public.gmane.org
> [2] http://lkml.kernel.org/r/1510048229.12079.7.camel-JKZ9t1WPFCv1ENwx4SLHqw@public.gmane.org
> [3] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
> [4] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org
> [5] http://lkml.kernel.org/r/87efp1w7vy.fsf-W0DJWXSxmBNbyGPkN3NxC2scP1bn1w/D@public.gmane.org
> 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo-Bw31MaZKKs0EbZ0PF+XxCw@public.gmane.org  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org"> email-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org </a>

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-16 10:18 ` [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
@ 2017-11-17  0:27   ` Kees Cook
       [not found]     ` <CAGXu5jKssQCcYcZujvQeFy5LTzhXSW=f-a0riB=4+caT1i38BQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-11-17  7:30   ` Florian Weimer
  2017-11-17  8:37   ` John Hubbard
  2 siblings, 1 reply; 22+ messages in thread
From: Kees Cook @ 2017-11-17  0:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Linux API, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, Linux-MM, LKML,
	linux-arch, Michal Hocko

On Thu, Nov 16, 2017 at 2:18 AM, Michal Hocko <mhocko@kernel.org> wrote:
> From: Michal Hocko <mhocko@suse.com>
>
> MAP_FIXED is used quite often to enforce mapping at the particular
> range. The main problem of this flag is, however, that it is inherently
> dangerous because it unmaps existing mappings covered by the requested
> range. This can cause silent memory corruptions. Some of them even with
> serious security implications. While the current semantic might be
> really desiderable in many cases there are others which would want to
> enforce the given range but rather see a failure than a silent memory
> corruption on a clashing range. Please note that there is no guarantee
> that a given range is obeyed by the mmap even when it is free - e.g.
> arch specific code is allowed to apply an alignment.
>
> Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
> It has the same semantic as MAP_FIXED wrt. the given address request
> with a single exception that it fails with ENOMEM if the requested
> address is already covered by an existing mapping. We still do rely on
> get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
> check for a conflicting vma after it returns.

I like this much more than special-casing the ELF loader. It is an
unusual property that MAP_FIXED does _two_ things, so I like having
this split out.

Bikeshedding: maybe call this MAP_NO_CLOBBER? It's a modifier to
MAP_FIXED, really...

At the end of the day, I don't really care about the name, but "SAFE"
isn't very descriptive for what the flag changes about FIXED.

-Kees

>
> [set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  arch/alpha/include/uapi/asm/mman.h   |  2 ++
>  arch/mips/include/uapi/asm/mman.h    |  2 ++
>  arch/parisc/include/uapi/asm/mman.h  |  2 ++
>  arch/powerpc/include/uapi/asm/mman.h |  1 +
>  arch/sparc/include/uapi/asm/mman.h   |  1 +
>  arch/tile/include/uapi/asm/mman.h    |  1 +
>  arch/xtensa/include/uapi/asm/mman.h  |  2 ++
>  include/uapi/asm-generic/mman.h      |  1 +
>  mm/mmap.c                            | 11 +++++++++++
>  9 files changed, 23 insertions(+)
>
> diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
> index 3b26cc62dadb..0e5724e4b4ad 100644
> --- a/arch/alpha/include/uapi/asm/mman.h
> +++ b/arch/alpha/include/uapi/asm/mman.h
> @@ -31,6 +31,8 @@
>  #define MAP_STACK      0x80000         /* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB    0x100000        /* create a huge page mapping */
>
> +#define MAP_FIXED_SAFE 0x200000        /* MAP_FIXED which doesn't unmap underlying mapping */
> +
>  #define MS_ASYNC       1               /* sync memory asynchronously */
>  #define MS_SYNC                2               /* synchronous memory sync */
>  #define MS_INVALIDATE  4               /* invalidate the caches */
> diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
> index da3216007fe0..fc5e61ef9fd4 100644
> --- a/arch/mips/include/uapi/asm/mman.h
> +++ b/arch/mips/include/uapi/asm/mman.h
> @@ -49,6 +49,8 @@
>  #define MAP_STACK      0x40000         /* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB    0x80000         /* create a huge page mapping */
>
> +#define MAP_FIXED_SAFE 0x100000        /* MAP_FIXED which doesn't unmap underlying mapping */
> +
>  /*
>   * Flags for msync
>   */
> diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
> index cc9ba1d34779..c926487472fa 100644
> --- a/arch/parisc/include/uapi/asm/mman.h
> +++ b/arch/parisc/include/uapi/asm/mman.h
> @@ -25,6 +25,8 @@
>  #define MAP_STACK      0x40000         /* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB    0x80000         /* create a huge page mapping */
>
> +#define MAP_FIXED_SAFE 0x100000        /* MAP_FIXED which doesn't unmap underlying mapping */
> +
>  #define MS_SYNC                1               /* synchronous memory sync */
>  #define MS_ASYNC       2               /* sync memory asynchronously */
>  #define MS_INVALIDATE  4               /* invalidate the caches */
> diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> index 03c06ba7464f..d97342ca25b1 100644
> --- a/arch/powerpc/include/uapi/asm/mman.h
> +++ b/arch/powerpc/include/uapi/asm/mman.h
> @@ -28,5 +28,6 @@
>  #define MAP_NONBLOCK   0x10000         /* do not block on IO */
>  #define MAP_STACK      0x20000         /* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB    0x40000         /* create a huge page mapping */
> +#define MAP_FIXED_SAFE 0x800000        /* MAP_FIXED which doesn't unmap underlying mapping */
>
>  #endif /* _UAPI_ASM_POWERPC_MMAN_H */
> diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h
> index 9765896ecb2c..7b00477a7f9a 100644
> --- a/arch/sparc/include/uapi/asm/mman.h
> +++ b/arch/sparc/include/uapi/asm/mman.h
> @@ -23,6 +23,7 @@
>  #define MAP_NONBLOCK   0x10000         /* do not block on IO */
>  #define MAP_STACK      0x20000         /* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB    0x40000         /* create a huge page mapping */
> +#define MAP_FIXED_SAFE 0x80000         /* MAP_FIXED which doesn't unmap underlying mapping */
>
>
>  #endif /* _UAPI__SPARC_MMAN_H__ */
> diff --git a/arch/tile/include/uapi/asm/mman.h b/arch/tile/include/uapi/asm/mman.h
> index 63ee13faf17d..d5d58d2dc95e 100644
> --- a/arch/tile/include/uapi/asm/mman.h
> +++ b/arch/tile/include/uapi/asm/mman.h
> @@ -29,6 +29,7 @@
>  #define MAP_DENYWRITE  0x0800          /* ETXTBSY */
>  #define MAP_EXECUTABLE 0x1000          /* mark it as an executable */
>  #define MAP_HUGETLB    0x4000          /* create a huge page mapping */
> +#define MAP_FIXED_SAFE 0x8000          /* MAP_FIXED which doesn't unmap underlying mapping */
>
>
>  /*
> diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
> index b15b278aa314..d665bd8b7cbd 100644
> --- a/arch/xtensa/include/uapi/asm/mman.h
> +++ b/arch/xtensa/include/uapi/asm/mman.h
> @@ -55,6 +55,7 @@
>  #define MAP_NONBLOCK   0x20000         /* do not block on IO */
>  #define MAP_STACK      0x40000         /* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB    0x80000         /* create a huge page mapping */
> +#define MAP_FIXED_SAFE 0x100000        /* MAP_FIXED which doesn't unmap underlying mapping */
>  #ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
>  # define MAP_UNINITIALIZED 0x4000000   /* For anonymous mmap, memory could be
>                                          * uninitialized */
> @@ -62,6 +63,7 @@
>  # define MAP_UNINITIALIZED 0x0         /* Don't support this flag */
>  #endif
>
> +
>  /*
>   * Flags for msync
>   */
> diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
> index 7162cd4cca73..64c46047fbd3 100644
> --- a/include/uapi/asm-generic/mman.h
> +++ b/include/uapi/asm-generic/mman.h
> @@ -12,6 +12,7 @@
>  #define MAP_NONBLOCK   0x10000         /* do not block on IO */
>  #define MAP_STACK      0x20000         /* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB    0x40000         /* create a huge page mapping */
> +#define MAP_FIXED_SAFE 0x80000         /* MAP_FIXED which doesn't unmap underlying mapping */
>
>  /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 680506faceae..89af0b5839a5 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1342,6 +1342,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>                 if (!(file && path_noexec(&file->f_path)))
>                         prot |= PROT_EXEC;
>
> +       /* force arch specific MAP_FIXED handling in get_unmapped_area */
> +       if (flags & MAP_FIXED_SAFE)
> +               flags |= MAP_FIXED;
> +
>         if (!(flags & MAP_FIXED))
>                 addr = round_hint_to_min(addr);
>
> @@ -1365,6 +1369,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>         if (offset_in_page(addr))
>                 return addr;
>
> +       if (flags & MAP_FIXED_SAFE) {
> +               struct vm_area_struct *vma = find_vma(mm, addr);
> +
> +               if (vma && vma->vm_start <= addr)
> +                       return -ENOMEM;
> +       }
> +
>         if (prot == PROT_EXEC) {
>                 pkey = execute_only_pkey(mm);
>                 if (pkey < 0)
> --
> 2.15.0
>



-- 
Kees Cook
Pixel Security

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map
  2017-11-16 10:19 ` [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map Michal Hocko
@ 2017-11-17  0:30   ` Kees Cook
  0 siblings, 0 replies; 22+ messages in thread
From: Kees Cook @ 2017-11-17  0:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Linux API, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, Linux-MM, LKML,
	linux-arch, Michal Hocko, Abdul Haleem, Joel Stanley

On Thu, Nov 16, 2017 at 2:19 AM, Michal Hocko <mhocko@kernel.org> wrote:
> From: Michal Hocko <mhocko@suse.com>
>
> Both load_elf_interp and load_elf_binary rely on elf_map to map segments
> on a controlled address and they use MAP_FIXED to enforce that. This is
> however dangerous thing prone to silent data corruption which can be
> even exploitable. Let's take CVE-2017-1000253 as an example. At the time
> (before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"))
> ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far away from
> the stack top on 32b (legacy) memory layout (only 1GB away). Therefore
> we could end up mapping over the existing stack with some luck.
>
> The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
> fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE moved moved much
> further from the stack (eab09532d400 and later by c715b72c1ba4 ("mm:
> revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and excessive
> stack consumption early during execve fully stopped by da029c11e6b1
> ("exec: Limit arg stack to at most 75% of _STK_LIM"). So we should be
> safe and any attack should be impractical. On the other hand this is
> just too subtle assumption so it can break quite easily and hard to
> spot.
>
> I believe that the MAP_FIXED usage in load_elf_binary (et. al) is still
> fundamentally dangerous. Moreover it shouldn't be even needed. We are
> at the early process stage and so there shouldn't be unrelated mappings
> (except for stack and loader) existing so mmap for a given address
> should succeed even without MAP_FIXED. Something is terribly wrong if
> this is not the case and we should rather fail than silently corrupt the
> underlying mapping.
>
> Address this issue by changing MAP_FIXED to the newly added
> MAP_FIXED_SAFE. This will mean that mmap will fail if there is an
> existing mapping clashing with the requested one without clobbering it.
>
> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Kees Cook <keescook@chromium.org>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Once (if?) the name gets settled, this looks good to me:

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
>  arch/metag/kernel/process.c |  6 +++++-
>  fs/binfmt_elf.c             | 12 ++++++++----
>  2 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/arch/metag/kernel/process.c b/arch/metag/kernel/process.c
> index c4606ce743d2..2286140e54e0 100644
> --- a/arch/metag/kernel/process.c
> +++ b/arch/metag/kernel/process.c
> @@ -398,7 +398,7 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
>         tcm_tag = tcm_lookup_tag(addr);
>
>         if (tcm_tag != TCM_INVALID_TAG)
> -               type &= ~MAP_FIXED;
> +               type &= ~(MAP_FIXED | MAP_FIXED_SAFE);
>
>         /*
>         * total_size is the size of the ELF (interpreter) image.
> @@ -416,6 +416,10 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
>         } else
>                 map_addr = vm_mmap(filep, addr, size, prot, type, off);
>
> +       if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
> +               pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
> +                               task_pid_nr(current), tsk->comm, (void*)addr);
> +
>         if (!BAD_ADDR(map_addr) && tcm_tag != TCM_INVALID_TAG) {
>                 struct tcm_allocation *tcm;
>                 unsigned long tcm_addr;
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 6466153f2bf0..12b21942ccde 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -372,6 +372,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
>         } else
>                 map_addr = vm_mmap(filep, addr, size, prot, type, off);
>
> +       if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
> +               pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
> +                               task_pid_nr(current), current->comm, (void*)addr);
> +
>         return(map_addr);
>  }
>
> @@ -569,7 +573,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>                                 elf_prot |= PROT_EXEC;
>                         vaddr = eppnt->p_vaddr;
>                         if (interp_elf_ex->e_type == ET_EXEC || load_addr_set)
> -                               elf_type |= MAP_FIXED;
> +                               elf_type |= MAP_FIXED_SAFE;
>                         else if (no_base && interp_elf_ex->e_type == ET_DYN)
>                                 load_addr = -vaddr;
>
> @@ -929,7 +933,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
>                  * the ET_DYN load_addr calculations, proceed normally.
>                  */
>                 if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
> -                       elf_flags |= MAP_FIXED;
> +                       elf_flags |= MAP_FIXED_SAFE;
>                 } else if (loc->elf_ex.e_type == ET_DYN) {
>                         /*
>                          * This logic is run once for the first LOAD Program
> @@ -965,7 +969,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
>                                 load_bias = ELF_ET_DYN_BASE;
>                                 if (current->flags & PF_RANDOMIZE)
>                                         load_bias += arch_mmap_rnd();
> -                               elf_flags |= MAP_FIXED;
> +                               elf_flags |= MAP_FIXED_SAFE;
>                         } else
>                                 load_bias = 0;
>
> @@ -1220,7 +1224,7 @@ static int load_elf_library(struct file *file)
>                         (eppnt->p_filesz +
>                          ELF_PAGEOFFSET(eppnt->p_vaddr)),
>                         PROT_READ | PROT_WRITE | PROT_EXEC,
> -                       MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
> +                       MAP_FIXED_SAFE | MAP_PRIVATE | MAP_DENYWRITE,
>                         (eppnt->p_offset -
>                          ELF_PAGEOFFSET(eppnt->p_vaddr)));
>         if (error != ELF_PAGESTART(eppnt->p_vaddr))
> --
> 2.15.0
>



-- 
Kees Cook
Pixel Security

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-16 10:18 ` [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
  2017-11-17  0:27   ` Kees Cook
@ 2017-11-17  7:30   ` Florian Weimer
  2017-11-20  8:55     ` Michal Hocko
  2017-11-17  8:37   ` John Hubbard
  2 siblings, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2017-11-17  7:30 UTC (permalink / raw)
  To: Michal Hocko, linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Michal Hocko

On 11/16/2017 11:18 AM, Michal Hocko wrote:
> +	if (flags & MAP_FIXED_SAFE) {
> +		struct vm_area_struct *vma = find_vma(mm, addr);
> +
> +		if (vma && vma->vm_start <= addr)
> +			return -ENOMEM;
> +	}

Could you pick a different error code which cannot also be caused by a 
an unrelated, possibly temporary condition?  Maybe EBUSY or EEXIST?

This would definitely help with application-based randomization of 
mappings, and there, actual ENOMEM and this error would have to be 
handled differently.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-16 10:18 ` [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
  2017-11-17  0:27   ` Kees Cook
  2017-11-17  7:30   ` Florian Weimer
@ 2017-11-17  8:37   ` John Hubbard
  2017-11-20  9:02     ` Michal Hocko
  2 siblings, 1 reply; 22+ messages in thread
From: John Hubbard @ 2017-11-17  8:37 UTC (permalink / raw)
  To: Michal Hocko, linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Michal Hocko

On 11/16/2017 02:18 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> MAP_FIXED is used quite often to enforce mapping at the particular
> range. The main problem of this flag is, however, that it is inherently
> dangerous because it unmaps existing mappings covered by the requested
> range. This can cause silent memory corruptions. Some of them even with
> serious security implications. While the current semantic might be
> really desiderable in many cases there are others which would want to
> enforce the given range but rather see a failure than a silent memory
> corruption on a clashing range. Please note that there is no guarantee
> that a given range is obeyed by the mmap even when it is free - e.g.
> arch specific code is allowed to apply an alignment.
> 
> Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
> It has the same semantic as MAP_FIXED wrt. the given address request
> with a single exception that it fails with ENOMEM if the requested
> address is already covered by an existing mapping. We still do rely on
> get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
> check for a conflicting vma after it returns.
> 
> [set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  arch/alpha/include/uapi/asm/mman.h   |  2 ++
>  arch/mips/include/uapi/asm/mman.h    |  2 ++
>  arch/parisc/include/uapi/asm/mman.h  |  2 ++
>  arch/powerpc/include/uapi/asm/mman.h |  1 +
>  arch/sparc/include/uapi/asm/mman.h   |  1 +
>  arch/tile/include/uapi/asm/mman.h    |  1 +
>  arch/xtensa/include/uapi/asm/mman.h  |  2 ++
>  include/uapi/asm-generic/mman.h      |  1 +
>  mm/mmap.c                            | 11 +++++++++++
>  9 files changed, 23 insertions(+)
> 
> diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
> index 3b26cc62dadb..0e5724e4b4ad 100644
> --- a/arch/alpha/include/uapi/asm/mman.h
> +++ b/arch/alpha/include/uapi/asm/mman.h
> @@ -31,6 +31,8 @@
>  #define MAP_STACK	0x80000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x100000	/* create a huge page mapping */
>  
> +#define MAP_FIXED_SAFE	0x200000	/* MAP_FIXED which doesn't unmap underlying mapping */
> +
>  #define MS_ASYNC	1		/* sync memory asynchronously */
>  #define MS_SYNC		2		/* synchronous memory sync */
>  #define MS_INVALIDATE	4		/* invalidate the caches */
> diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
> index da3216007fe0..fc5e61ef9fd4 100644
> --- a/arch/mips/include/uapi/asm/mman.h
> +++ b/arch/mips/include/uapi/asm/mman.h
> @@ -49,6 +49,8 @@
>  #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
>  
> +#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
> +
>  /*
>   * Flags for msync
>   */
> diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
> index cc9ba1d34779..c926487472fa 100644
> --- a/arch/parisc/include/uapi/asm/mman.h
> +++ b/arch/parisc/include/uapi/asm/mman.h
> @@ -25,6 +25,8 @@
>  #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
>  
> +#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
> +
>  #define MS_SYNC		1		/* synchronous memory sync */
>  #define MS_ASYNC	2		/* sync memory asynchronously */
>  #define MS_INVALIDATE	4		/* invalidate the caches */
> diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> index 03c06ba7464f..d97342ca25b1 100644
> --- a/arch/powerpc/include/uapi/asm/mman.h
> +++ b/arch/powerpc/include/uapi/asm/mman.h
> @@ -28,5 +28,6 @@
>  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
>  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x800000	/* MAP_FIXED which doesn't unmap underlying mapping */
 

Hi Michal,

1. The powerpc change, above, has one too many zeroes. It should be 0x80000, 
not 0x800000.

2. For the one-line comments, if you phrase them like this:

/* Like MAP_FIXED, except that it doesn't unmap pre-existing mappings */

...I think that would be a little clearer.

more below:

>  #endif /* _UAPI_ASM_POWERPC_MMAN_H */
> diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h
> index 9765896ecb2c..7b00477a7f9a 100644
> --- a/arch/sparc/include/uapi/asm/mman.h
> +++ b/arch/sparc/include/uapi/asm/mman.h
> @@ -23,6 +23,7 @@
>  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
>  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
>  
>  
>  #endif /* _UAPI__SPARC_MMAN_H__ */
> diff --git a/arch/tile/include/uapi/asm/mman.h b/arch/tile/include/uapi/asm/mman.h
> index 63ee13faf17d..d5d58d2dc95e 100644
> --- a/arch/tile/include/uapi/asm/mman.h
> +++ b/arch/tile/include/uapi/asm/mman.h
> @@ -29,6 +29,7 @@
>  #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
>  #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
>  #define MAP_HUGETLB	0x4000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x8000		/* MAP_FIXED which doesn't unmap underlying mapping */
>  
>  
>  /*
> diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
> index b15b278aa314..d665bd8b7cbd 100644
> --- a/arch/xtensa/include/uapi/asm/mman.h
> +++ b/arch/xtensa/include/uapi/asm/mman.h
> @@ -55,6 +55,7 @@
>  #define MAP_NONBLOCK	0x20000		/* do not block on IO */
>  #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
>  #ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
>  # define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
>  					 * uninitialized */
> @@ -62,6 +63,7 @@
>  # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
>  #endif
>  
> +
>  /*
>   * Flags for msync
>   */
> diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
> index 7162cd4cca73..64c46047fbd3 100644
> --- a/include/uapi/asm-generic/mman.h
> +++ b/include/uapi/asm-generic/mman.h
> @@ -12,6 +12,7 @@
>  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
>  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
>  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> +#define MAP_FIXED_SAFE	0x80000		/* MAP_FIXED which doesn't unmap underlying mapping */
>  
>  /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
>  
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 680506faceae..89af0b5839a5 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1342,6 +1342,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>  		if (!(file && path_noexec(&file->f_path)))
>  			prot |= PROT_EXEC;
>  
> +	/* force arch specific MAP_FIXED handling in get_unmapped_area */
> +	if (flags & MAP_FIXED_SAFE)
> +		flags |= MAP_FIXED;

Hooking in at this point is a nice way to solve the problem. :)

For the naming and implementation, I see a couple of things that might improve
it slightly:

a) Change MAP_FIXED_SAFE to MAP_NO_CLOBBER (as per Kees' idea), but keep the
new flag independent, by omitting the above two lines. Instead of forcing
MAP_FIXED as a result of the new flag, you could simply fail to take any action
on MAP_NO_CLOBBER *unless* MAP_FIXED is set.

This is a bit easier to explain and reason about, as compared to a flag that
auto-sets another flag. I like this approach best.

  or

b) Change MAP_FIXED_SAFE to MAP_FIXED_NO_CLOBBER (also a variation on Kees' name
idea, but a little longer, a bit uglier, and clearer), and leave the implementation
the same.

thanks,
John Hubbard

> +
>  	if (!(flags & MAP_FIXED))
>  		addr = round_hint_to_min(addr);
>  
> @@ -1365,6 +1369,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>  	if (offset_in_page(addr))
>  		return addr;
>  
> +	if (flags & MAP_FIXED_SAFE) {
> +		struct vm_area_struct *vma = find_vma(mm, addr);
> +
> +		if (vma && vma->vm_start <= addr)
> +			return -ENOMEM;
> +	}
> +
>  	if (prot == PROT_EXEC) {
>  		pkey = execute_only_pkey(mm);
>  		if (pkey < 0)
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE
       [not found]     ` <20171116121438.6vegs4wiahod3byl-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2017-11-17  8:45       ` John Hubbard
  2017-11-20  9:05         ` Michal Hocko
  0 siblings, 1 reply; 22+ messages in thread
From: John Hubbard @ 2017-11-17  8:45 UTC (permalink / raw)
  To: Michal Hocko, linux-api-u79uwXL29TY76Z2rM5mHXA
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML,
	linux-arch-u79uwXL29TY76Z2rM5mHXA, Abdul Haleem, Joel Stanley,
	Kees Cook

On 11/16/2017 04:14 AM, Michal Hocko wrote:
> [Ups, managed to screw the subject - fix it]
> 
> On Thu 16-11-17 11:18:58, Michal Hocko wrote:
>> Hi,
>> this has started as a follow up discussion [1][2] resulting in the
>> runtime failure caused by hardening patch [3] which removes MAP_FIXED
>> from the elf loader because MAP_FIXED is inherently dangerous as it
>> might silently clobber and existing underlying mapping (e.g. stack). The
>> reason for the failure is that some architectures enforce an alignment
>> for the given address hint without MAP_FIXED used (e.g. for shared or
>> file backed mappings).
>>
>> One way around this would be excluding those archs which do alignment
>> tricks from the hardening [4]. The patch is really trivial but it has
>> been objected, rightfully so, that this screams for a more generic
>> solution. We basically want a non-destructive MAP_FIXED.
>>
>> The first patch introduced MAP_FIXED_SAFE which enforces the given
>> address but unlike MAP_FIXED it fails with ENOMEM if the given range
>> conflicts with an existing one. The flag is introduced as a completely
>> new flag rather than a MAP_FIXED extension because of the backward
>> compatibility. We really want a never-clobber semantic even on older
>> kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
>> flags evaluation because we do not EINVAL on unknown flags. On those
>> kernels we would simply use the traditional hint based semantic so the
>> caller can still get a different address (which sucks) but at least not
>> silently corrupt an existing mapping. I do not see a good way around
>> that. Except we won't export expose the new semantic to the userspace at
>> all. It seems there are users who would like to have something like that
>> [5], though. Atomic address range probing in the multithreaded programs
>> sounds like an interesting thing to me as well, although I do not have
>> any specific usecase in mind.

Hi Michal,

>From looking at the patchset, it seems to me that the new MAP_FIXED_SAFE
(or whatever it ends up being named) *would* be passed through from
user space. When you say that "we won't export expose the new semantic 
to the userspace at all", do you mean that glibc won't add it? Or
is there something I'm missing, that prevents that flag from getting
from the syscall, to do_mmap()?

On the usage: there are cases in user space that could probably make
good use of a no-clobber hint to MAP_FIXED. The user space code
that surrounds HMM (speaking loosely there--it's really any user space
code that manages a unified memory address space, across devices)
often ends up using MAP_FIXED, but MAP_FIXED crams several features
into one flag: an exact address, an "atomic" switch to the new mapping,
and unmapping the old mappings. That's pretty overloaded, so being
able to split it up a bit, by removing one of those features, seems
useful.

thanks,
John Hubbard

>>
>> The second patch simply replaces MAP_FIXED use in elf loader by
>> MAP_FIXED_SAFE. I believe other places which rely on MAP_FIXED should
>> follow. Actually real MAP_FIXED usages should be docummented properly
>> and they should be more of an exception.
>>
>> Does anybody see any fundamental reasons why this is a wrong approach?
>>
>> Diffstat says
>>  arch/alpha/include/uapi/asm/mman.h   |  2 ++
>>  arch/metag/kernel/process.c          |  6 +++++-
>>  arch/mips/include/uapi/asm/mman.h    |  2 ++
>>  arch/parisc/include/uapi/asm/mman.h  |  2 ++
>>  arch/powerpc/include/uapi/asm/mman.h |  1 +
>>  arch/sparc/include/uapi/asm/mman.h   |  1 +
>>  arch/tile/include/uapi/asm/mman.h    |  1 +
>>  arch/xtensa/include/uapi/asm/mman.h  |  2 ++
>>  fs/binfmt_elf.c                      | 12 ++++++++----
>>  include/uapi/asm-generic/mman.h      |  1 +
>>  mm/mmap.c                            | 11 +++++++++++
>>  11 files changed, 36 insertions(+), 5 deletions(-)
>>
>> [1] http://lkml.kernel.org/r/20171107162217.382cd754-3FnU+UHB4dNDw9hX6IcOSA@public.gmane.org
>> [2] http://lkml.kernel.org/r/1510048229.12079.7.camel-JKZ9t1WPFCv1ENwx4SLHqw@public.gmane.org
>> [3] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
>> [4] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org
>> [5] http://lkml.kernel.org/r/87efp1w7vy.fsf-W0DJWXSxmBNbyGPkN3NxC2scP1bn1w/D@public.gmane.org
>>
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo-Bw31MaZKKs0EbZ0PF+XxCw@public.gmane.org  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org"> email-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org </a>
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
       [not found]     ` <CAGXu5jKssQCcYcZujvQeFy5LTzhXSW=f-a0riB=4+caT1i38BQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-11-17 19:12       ` Matthew Wilcox
  2017-11-20  8:43         ` Michal Hocko
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2017-11-17 19:12 UTC (permalink / raw)
  To: Kees Cook
  Cc: Michal Hocko, Linux API, Khalid Aziz, Michael Ellerman,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch, Michal Hocko

On Thu, Nov 16, 2017 at 04:27:36PM -0800, Kees Cook wrote:
> On Thu, Nov 16, 2017 at 2:18 AM, Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> > From: Michal Hocko <mhocko-IBi9RG/b67k@public.gmane.org>
> >
> > MAP_FIXED is used quite often to enforce mapping at the particular
> > range. The main problem of this flag is, however, that it is inherently
> > dangerous because it unmaps existing mappings covered by the requested
> > range. This can cause silent memory corruptions. Some of them even with
> > serious security implications. While the current semantic might be
> > really desiderable in many cases there are others which would want to
> > enforce the given range but rather see a failure than a silent memory
> > corruption on a clashing range. Please note that there is no guarantee
> > that a given range is obeyed by the mmap even when it is free - e.g.
> > arch specific code is allowed to apply an alignment.
> >
> > Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
> > It has the same semantic as MAP_FIXED wrt. the given address request
> > with a single exception that it fails with ENOMEM if the requested
> > address is already covered by an existing mapping. We still do rely on
> > get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
> > check for a conflicting vma after it returns.
> 
> I like this much more than special-casing the ELF loader. It is an
> unusual property that MAP_FIXED does _two_ things, so I like having
> this split out.
> 
> Bikeshedding: maybe call this MAP_NO_CLOBBER? It's a modifier to
> MAP_FIXED, really...

Way back when, I proposed a new flag called MAP_FIXED_WEAK.  I was
dissuaded from it when userspace people said it was just as easy for
them to provide the address hint, then run fixups on their data if the
address they were assigned wasn't the one they asked for.

The real problem is that MAP_FIXED should have been called MAP_FORCE.

So ... do we really have users that want failure instead of success at
a different address?  And if so, is it really a hardship for them to
make a call to unmap on success-at-the-wrong-address?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-17 19:12       ` Matthew Wilcox
@ 2017-11-20  8:43         ` Michal Hocko
  0 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2017-11-20  8:43 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kees Cook, Linux API, Khalid Aziz, Michael Ellerman,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	Linux-MM, LKML, linux-arch

On Fri 17-11-17 11:12:51, Matthew Wilcox wrote:
> On Thu, Nov 16, 2017 at 04:27:36PM -0800, Kees Cook wrote:
> > On Thu, Nov 16, 2017 at 2:18 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > >
> > > MAP_FIXED is used quite often to enforce mapping at the particular
> > > range. The main problem of this flag is, however, that it is inherently
> > > dangerous because it unmaps existing mappings covered by the requested
> > > range. This can cause silent memory corruptions. Some of them even with
> > > serious security implications. While the current semantic might be
> > > really desiderable in many cases there are others which would want to
> > > enforce the given range but rather see a failure than a silent memory
> > > corruption on a clashing range. Please note that there is no guarantee
> > > that a given range is obeyed by the mmap even when it is free - e.g.
> > > arch specific code is allowed to apply an alignment.
> > >
> > > Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
> > > It has the same semantic as MAP_FIXED wrt. the given address request
> > > with a single exception that it fails with ENOMEM if the requested
> > > address is already covered by an existing mapping. We still do rely on
> > > get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
> > > check for a conflicting vma after it returns.
> > 
> > I like this much more than special-casing the ELF loader. It is an
> > unusual property that MAP_FIXED does _two_ things, so I like having
> > this split out.
> > 
> > Bikeshedding: maybe call this MAP_NO_CLOBBER? It's a modifier to
> > MAP_FIXED, really...

Unfortunatelly doing this as an extension wouldn't work due to backward
compatibility issues. See [1]
 
> Way back when, I proposed a new flag called MAP_FIXED_WEAK.  I was
> dissuaded from it when userspace people said it was just as easy for
> them to provide the address hint, then run fixups on their data if the
> address they were assigned wasn't the one they asked for.
> 
> The real problem is that MAP_FIXED should have been called MAP_FORCE.
> 
> So ... do we really have users that want failure instead of success at
> a different address?

I am not really sure but Michael has pointed out to jemalloc [2] which
could probably use it.

> And if so, is it really a hardship for them to
> make a call to unmap on success-at-the-wrong-address?

How do you do something like that safely in a multithreaded environment?
You do not have any safe way to do atomic probe of a memory range.

As I've said, I do not insist on exporting this functionality to the
userspace. I can make it an internal flag (outside of the map type) and
use it solely in the kernel but considering how MAP_FIXED is tricky I
wouldn't be surprised if the userspace can find a use for this. The main
question is, are there any downsides to do so?

[1] http://lkml.kernel.org/r/20171114092916.ho5mvwc23xnelmod@dhcp22.suse.cz
[2] http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-17  7:30   ` Florian Weimer
@ 2017-11-20  8:55     ` Michal Hocko
  2017-11-20  9:10       ` Florian Weimer
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-11-20  8:55 UTC (permalink / raw)
  To: Florian Weimer
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch

On Fri 17-11-17 08:30:48, Florian Weimer wrote:
> On 11/16/2017 11:18 AM, Michal Hocko wrote:
> > +	if (flags & MAP_FIXED_SAFE) {
> > +		struct vm_area_struct *vma = find_vma(mm, addr);
> > +
> > +		if (vma && vma->vm_start <= addr)
> > +			return -ENOMEM;
> > +	}
> 
> Could you pick a different error code which cannot also be caused by a an
> unrelated, possibly temporary condition?  Maybe EBUSY or EEXIST?

Hmm, none of those are described in the man page. I am usually very
careful to not add new and potentially unexpected error codes but it is
true that a new flag should warrant a new error code. I am not sure
which one is more appropriate though. EBUSY suggests that retrying might
help which is true only if some other party unmaps the range. So EEXIST
would sound more natural.

> This would definitely help with application-based randomization of mappings,
> and there, actual ENOMEM and this error would have to be handled
> differently.

I see. Could you be more specific about the usecase you have in mind? I
would incorporate it into the patch description.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-17  8:37   ` John Hubbard
@ 2017-11-20  9:02     ` Michal Hocko
  0 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2017-11-20  9:02 UTC (permalink / raw)
  To: John Hubbard
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch

On Fri 17-11-17 00:37:18, John Hubbard wrote:
> On 11/16/2017 02:18 AM, Michal Hocko wrote:
[...]
> > diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> > index 03c06ba7464f..d97342ca25b1 100644
> > --- a/arch/powerpc/include/uapi/asm/mman.h
> > +++ b/arch/powerpc/include/uapi/asm/mman.h
> > @@ -28,5 +28,6 @@
> >  #define MAP_NONBLOCK	0x10000		/* do not block on IO */
> >  #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
> >  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
> > +#define MAP_FIXED_SAFE	0x800000	/* MAP_FIXED which doesn't unmap underlying mapping */
>  
> 
> Hi Michal,
> 
> 1. The powerpc change, above, has one too many zeroes. It should be 0x80000, 
> not 0x800000.

OK, I will fix it. It shouldn't matter much, because we only care about
non-clashing address but I agree that we should consume them from bottom
bits.

> 2. For the one-line comments, if you phrase them like this:
> 
> /* Like MAP_FIXED, except that it doesn't unmap pre-existing mappings */
> 
> ...I think that would be a little clearer.

I do not have any strong preference here.
[...]
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 680506faceae..89af0b5839a5 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1342,6 +1342,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> >  		if (!(file && path_noexec(&file->f_path)))
> >  			prot |= PROT_EXEC;
> >  
> > +	/* force arch specific MAP_FIXED handling in get_unmapped_area */
> > +	if (flags & MAP_FIXED_SAFE)
> > +		flags |= MAP_FIXED;
> 
> Hooking in at this point is a nice way to solve the problem. :)
> 
> For the naming and implementation, I see a couple of things that might improve
> it slightly:
> 
> a) Change MAP_FIXED_SAFE to MAP_NO_CLOBBER (as per Kees' idea), but keep the
> new flag independent, by omitting the above two lines. Instead of forcing
> MAP_FIXED as a result of the new flag, you could simply fail to take any action
> on MAP_NO_CLOBBER *unless* MAP_FIXED is set.
> 
> This is a bit easier to explain and reason about, as compared to a flag that
> auto-sets another flag. I like this approach best.

As I've exaplained in other email I do not think we can make this a
modifier.
 
>   or
> 
> b) Change MAP_FIXED_SAFE to MAP_FIXED_NO_CLOBBER (also a variation on Kees' name
> idea, but a little longer, a bit uglier, and clearer), and leave the implementation
> the same.

I do not have a _strong_ preference on the name itself. But I think that
_SAFE reflects the behavior slightly better because _NO_CLOBBER is not
very specific _when_ and _what_ we do not clobber while _SAFE is clear
on that it doesn't perform any unsafe operations.

But if the majority think that _NO_CLOBBER is better i will change it.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-17  8:45       ` John Hubbard
@ 2017-11-20  9:05         ` Michal Hocko
  2017-11-22  1:48           ` John Hubbard
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-11-20  9:05 UTC (permalink / raw)
  To: John Hubbard
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Abdul Haleem, Joel Stanley, Kees Cook

On Fri 17-11-17 00:45:49, John Hubbard wrote:
> On 11/16/2017 04:14 AM, Michal Hocko wrote:
> > [Ups, managed to screw the subject - fix it]
> > 
> > On Thu 16-11-17 11:18:58, Michal Hocko wrote:
> >> Hi,
> >> this has started as a follow up discussion [1][2] resulting in the
> >> runtime failure caused by hardening patch [3] which removes MAP_FIXED
> >> from the elf loader because MAP_FIXED is inherently dangerous as it
> >> might silently clobber and existing underlying mapping (e.g. stack). The
> >> reason for the failure is that some architectures enforce an alignment
> >> for the given address hint without MAP_FIXED used (e.g. for shared or
> >> file backed mappings).
> >>
> >> One way around this would be excluding those archs which do alignment
> >> tricks from the hardening [4]. The patch is really trivial but it has
> >> been objected, rightfully so, that this screams for a more generic
> >> solution. We basically want a non-destructive MAP_FIXED.
> >>
> >> The first patch introduced MAP_FIXED_SAFE which enforces the given
> >> address but unlike MAP_FIXED it fails with ENOMEM if the given range
> >> conflicts with an existing one. The flag is introduced as a completely
> >> new flag rather than a MAP_FIXED extension because of the backward
> >> compatibility. We really want a never-clobber semantic even on older
> >> kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
> >> flags evaluation because we do not EINVAL on unknown flags. On those
> >> kernels we would simply use the traditional hint based semantic so the
> >> caller can still get a different address (which sucks) but at least not
> >> silently corrupt an existing mapping. I do not see a good way around
> >> that. Except we won't export expose the new semantic to the userspace at
> >> all. It seems there are users who would like to have something like that
> >> [5], though. Atomic address range probing in the multithreaded programs
> >> sounds like an interesting thing to me as well, although I do not have
> >> any specific usecase in mind.
> 
> Hi Michal,
> 
> From looking at the patchset, it seems to me that the new MAP_FIXED_SAFE
> (or whatever it ends up being named) *would* be passed through from
> user space. When you say that "we won't export expose the new semantic 
> to the userspace at all", do you mean that glibc won't add it? Or
> is there something I'm missing, that prevents that flag from getting
> from the syscall, to do_mmap()?

I meant that I could make it an internal flag outside of the map_type
space. So the userspace will not be able to use it.
 
> On the usage: there are cases in user space that could probably make
> good use of a no-clobber hint to MAP_FIXED. The user space code
> that surrounds HMM (speaking loosely there--it's really any user space
> code that manages a unified memory address space, across devices)
> often ends up using MAP_FIXED, but MAP_FIXED crams several features
> into one flag: an exact address, an "atomic" switch to the new mapping,
> and unmapping the old mappings. That's pretty overloaded, so being
> able to split it up a bit, by removing one of those features, seems
> useful.

Yes, atomic address range probing sounds useful. I cannot comment on HMM
usage but if you have any more specific I would welcome any links to add
them to the changelog.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-20  8:55     ` Michal Hocko
@ 2017-11-20  9:10       ` Florian Weimer
       [not found]         ` <37a6e9ba-e0df-b65f-d5ef-871c25b5cb87-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2017-11-20  9:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch

On 11/20/2017 09:55 AM, Michal Hocko wrote:
> On Fri 17-11-17 08:30:48, Florian Weimer wrote:
>> On 11/16/2017 11:18 AM, Michal Hocko wrote:
>>> +	if (flags & MAP_FIXED_SAFE) {
>>> +		struct vm_area_struct *vma = find_vma(mm, addr);
>>> +
>>> +		if (vma && vma->vm_start <= addr)
>>> +			return -ENOMEM;
>>> +	}
>>
>> Could you pick a different error code which cannot also be caused by a an
>> unrelated, possibly temporary condition?  Maybe EBUSY or EEXIST?
> 
> Hmm, none of those are described in the man page. I am usually very
> careful to not add new and potentially unexpected error codes but it is

I think this is a bad idea.  It leads to bizarre behavior, like open 
failing with EOVERFLOW with certain namespace configurations (which have 
nothing to do with file sizes).

Most of the manual pages are incomplete regarding error codes, and with 
seccomp filters and security modules, what error codes you actually get 
is anyone's guess.

> true that a new flag should warrant a new error code. I am not sure
> which one is more appropriate though. EBUSY suggests that retrying might
> help which is true only if some other party unmaps the range. So EEXIST
> would sound more natural.

Sure, EEXIST is completely fine.

>> This would definitely help with application-based randomization of mappings,
>> and there, actual ENOMEM and this error would have to be handled
>> differently.
> 
> I see. Could you be more specific about the usecase you have in mind? I
> would incorporate it into the patch description.

glibc ld.so currently maps DSOs without hints.  This means that the 
kernel will map right next to each other, and the offsets between them a 
completely predictable.  We would like to change that and supply a 
random address in a window of the address space.  If there is a 
conflict, we do not want the kernel to pick a non-random address. 
Instead, we would try again with a random address.

Thanks,
Florian

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
       [not found]         ` <37a6e9ba-e0df-b65f-d5ef-871c25b5cb87-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2017-11-20  9:33           ` Michal Hocko
  2017-11-20  9:45             ` Florian Weimer
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-11-20  9:33 UTC (permalink / raw)
  To: Florian Weimer
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, Khalid Aziz, Michael Ellerman,
	Andrew Morton, Russell King - ARM Linux, Andrea Arcangeli,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, LKML,
	linux-arch-u79uwXL29TY76Z2rM5mHXA

On Mon 20-11-17 10:10:32, Florian Weimer wrote:
> On 11/20/2017 09:55 AM, Michal Hocko wrote:
> > On Fri 17-11-17 08:30:48, Florian Weimer wrote:
> > > On 11/16/2017 11:18 AM, Michal Hocko wrote:
> > > > +	if (flags & MAP_FIXED_SAFE) {
> > > > +		struct vm_area_struct *vma = find_vma(mm, addr);
> > > > +
> > > > +		if (vma && vma->vm_start <= addr)
> > > > +			return -ENOMEM;
> > > > +	}
> > > 
> > > Could you pick a different error code which cannot also be caused by a an
> > > unrelated, possibly temporary condition?  Maybe EBUSY or EEXIST?
> > 
> > Hmm, none of those are described in the man page. I am usually very
> > careful to not add new and potentially unexpected error codes but it is
> 
> I think this is a bad idea.  It leads to bizarre behavior, like open failing
> with EOVERFLOW with certain namespace configurations (which have nothing to
> do with file sizes).

Ohh, I agree but breaking userspace is, you know, no-no. And an
unexpected error codes can break things terribly.

> Most of the manual pages are incomplete regarding error codes, and with
> seccomp filters and security modules, what error codes you actually get is
> anyone's guess.
> 
> > true that a new flag should warrant a new error code. I am not sure
> > which one is more appropriate though. EBUSY suggests that retrying might
> > help which is true only if some other party unmaps the range. So EEXIST
> > would sound more natural.
> 
> Sure, EEXIST is completely fine.

OK, I will use it.
 
> > > This would definitely help with application-based randomization of mappings,
> > > and there, actual ENOMEM and this error would have to be handled
> > > differently.
> > 
> > I see. Could you be more specific about the usecase you have in mind? I
> > would incorporate it into the patch description.
> 
> glibc ld.so currently maps DSOs without hints.  This means that the kernel
> will map right next to each other, and the offsets between them a completely
> predictable.  We would like to change that and supply a random address in a
> window of the address space.  If there is a conflict, we do not want the
> kernel to pick a non-random address. Instead, we would try again with a
> random address.

This makes sense to me. Thanks, I will add it to the cover letter.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE
  2017-11-20  9:33           ` Michal Hocko
@ 2017-11-20  9:45             ` Florian Weimer
  0 siblings, 0 replies; 22+ messages in thread
From: Florian Weimer @ 2017-11-20  9:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch

On 11/20/2017 10:33 AM, Michal Hocko wrote:
> On Mon 20-11-17 10:10:32, Florian Weimer wrote:
>> On 11/20/2017 09:55 AM, Michal Hocko wrote:
>>> On Fri 17-11-17 08:30:48, Florian Weimer wrote:
>>>> On 11/16/2017 11:18 AM, Michal Hocko wrote:
>>>>> +	if (flags & MAP_FIXED_SAFE) {
>>>>> +		struct vm_area_struct *vma = find_vma(mm, addr);
>>>>> +
>>>>> +		if (vma && vma->vm_start <= addr)
>>>>> +			return -ENOMEM;
>>>>> +	}
>>>>
>>>> Could you pick a different error code which cannot also be caused by a an
>>>> unrelated, possibly temporary condition?  Maybe EBUSY or EEXIST?
>>>
>>> Hmm, none of those are described in the man page. I am usually very
>>> careful to not add new and potentially unexpected error codes but it is
>>
>> I think this is a bad idea.  It leads to bizarre behavior, like open failing
>> with EOVERFLOW with certain namespace configurations (which have nothing to
>> do with file sizes).
> 
> Ohh, I agree but breaking userspace is, you know, no-no. And an
> unexpected error codes can break things terribly.

On the glibc side, we see a lot of changes in error codes depending on 
kernel version, build and run-time configuration.  It never occurred to 
me that you guys think the precise error code is part of the userspace 
ABI.  Personally, I even assume that failure itself can disappear at any 
time (evidence: the f* functions which accept O_PATH in their non-*at 
variants).

Thanks,
Florian

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-20  9:05         ` Michal Hocko
@ 2017-11-22  1:48           ` John Hubbard
  2017-11-22 13:12             ` Michal Hocko
  0 siblings, 1 reply; 22+ messages in thread
From: John Hubbard @ 2017-11-22  1:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Abdul Haleem, Joel Stanley, Kees Cook

On 11/20/2017 01:05 AM, Michal Hocko wrote:
> On Fri 17-11-17 00:45:49, John Hubbard wrote:
>> On 11/16/2017 04:14 AM, Michal Hocko wrote:
>>> [Ups, managed to screw the subject - fix it]
>>>
>>> On Thu 16-11-17 11:18:58, Michal Hocko wrote:
>>>> Hi,
>>>> this has started as a follow up discussion [1][2] resulting in the
>>>> runtime failure caused by hardening patch [3] which removes MAP_FIXED
>>>> from the elf loader because MAP_FIXED is inherently dangerous as it
>>>> might silently clobber and existing underlying mapping (e.g. stack). The
>>>> reason for the failure is that some architectures enforce an alignment
>>>> for the given address hint without MAP_FIXED used (e.g. for shared or
>>>> file backed mappings).
>>>>
>>>> One way around this would be excluding those archs which do alignment
>>>> tricks from the hardening [4]. The patch is really trivial but it has
>>>> been objected, rightfully so, that this screams for a more generic
>>>> solution. We basically want a non-destructive MAP_FIXED.
>>>>
>>>> The first patch introduced MAP_FIXED_SAFE which enforces the given
>>>> address but unlike MAP_FIXED it fails with ENOMEM if the given range
>>>> conflicts with an existing one. The flag is introduced as a completely
>>>> new flag rather than a MAP_FIXED extension because of the backward
>>>> compatibility. We really want a never-clobber semantic even on older
>>>> kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
>>>> flags evaluation because we do not EINVAL on unknown flags. On those
>>>> kernels we would simply use the traditional hint based semantic so the
>>>> caller can still get a different address (which sucks) but at least not
>>>> silently corrupt an existing mapping. I do not see a good way around
>>>> that. Except we won't export expose the new semantic to the userspace at
>>>> all. It seems there are users who would like to have something like that
>>>> [5], though. Atomic address range probing in the multithreaded programs
>>>> sounds like an interesting thing to me as well, although I do not have
>>>> any specific usecase in mind.
>>
>> Hi Michal,
>>
>> From looking at the patchset, it seems to me that the new MAP_FIXED_SAFE
>> (or whatever it ends up being named) *would* be passed through from
>> user space. When you say that "we won't export expose the new semantic 
>> to the userspace at all", do you mean that glibc won't add it? Or
>> is there something I'm missing, that prevents that flag from getting
>> from the syscall, to do_mmap()?
> 
> I meant that I could make it an internal flag outside of the map_type
> space. So the userspace will not be able to use it.
>  
>> On the usage: there are cases in user space that could probably make
>> good use of a no-clobber hint to MAP_FIXED. The user space code
>> that surrounds HMM (speaking loosely there--it's really any user space
>> code that manages a unified memory address space, across devices)
>> often ends up using MAP_FIXED, but MAP_FIXED crams several features
>> into one flag: an exact address, an "atomic" switch to the new mapping,
>> and unmapping the old mappings. That's pretty overloaded, so being
>> able to split it up a bit, by removing one of those features, seems
>> useful.
> 
> Yes, atomic address range probing sounds useful. I cannot comment on HMM
> usage but if you have any more specific I would welcome any links to add
> them to the changelog.
> 

Hi Michal,

Yes, it really is useful for user space. I'll use CUDA as an example, but I 
think anything that enforces a uniform virtual addressing scheme across CPUs
and devices, probably has to do something eerily similar. CUDA does this:

a) Searches /proc/<pid>/maps for a "suitable" region of available VA space. 
"Suitable" generally means it has to have a base address within a certain
limited range (a particular device model might have odd limitations, for 
example), it has to be large enough, and alignment has to be large enough
(again, various devices may have constraints that lead us to do this).

This is of course subject to races with other threads in the process.

Let's say it finds a region starting at va.

b) Next it does: 
    p = mmap(va, ...) 

*without* setting MAP_FIXED, of course (so va is just a hint), to attempt to
safely reserve that region. If p != va, then in most cases, this is a failure
(almost certainly due to another thread getting a mapping from that region
before we did), and so this layer now has to call munmap(), before returning
a "failure: retry" to upper layers.

    IMPROVEMENT: --> if instead, we could call this:

            p = mmap(va, ... MAP_FIXED_NO_CLOBBER ...)

        , then we could skip the munmap() call upon failure. This is a small thing, 
        but it is useful here. (Thanks to Piotr Jaroszynski and Mark Hairgrove
        for helping me get that detail exactly right, btw.)

c) After that, CUDA suballocates from p, via: 
 
     q = mmap(sub_region_start, ... MAP_FIXED ...)

Interestingly enough, "freeing" is also done via MAP_FIXED, and setting PROT_NONE
to the subregion. Anyway, I just included (c) for general interest.

I expect that as we continue working on the open source compute software stack,
this new capability will be useful there, too.

Oh, and on the naming, when I described how your implementation worked (without
naming it) to Piotr, he said, "oh, something like map-fixed-no-clobber?". So I
think my miniature sociology naming data point here can bolster the case ever so
slightly for calling it MAP_FIXED_NO_CLOBBER. haha. :)

thanks,
John Hubbard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-22  1:48           ` John Hubbard
@ 2017-11-22 13:12             ` Michal Hocko
  2017-11-22 13:20               ` Vlastimil Babka
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-11-22 13:12 UTC (permalink / raw)
  To: John Hubbard
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Abdul Haleem, Joel Stanley, Kees Cook

On Tue 21-11-17 17:48:31, John Hubbard wrote:
[...]
> Hi Michal,
> 
> Yes, it really is useful for user space. I'll use CUDA as an example, but I 
> think anything that enforces a uniform virtual addressing scheme across CPUs
> and devices, probably has to do something eerily similar. CUDA does this:
> 
> a) Searches /proc/<pid>/maps for a "suitable" region of available VA space. 
> "Suitable" generally means it has to have a base address within a certain
> limited range (a particular device model might have odd limitations, for 
> example), it has to be large enough, and alignment has to be large enough
> (again, various devices may have constraints that lead us to do this).
> 
> This is of course subject to races with other threads in the process.
> 
> Let's say it finds a region starting at va.
> 
> b) Next it does: 
>     p = mmap(va, ...) 
> 
> *without* setting MAP_FIXED, of course (so va is just a hint), to attempt to
> safely reserve that region. If p != va, then in most cases, this is a failure
> (almost certainly due to another thread getting a mapping from that region
> before we did), and so this layer now has to call munmap(), before returning
> a "failure: retry" to upper layers.
> 
>     IMPROVEMENT: --> if instead, we could call this:
> 
>             p = mmap(va, ... MAP_FIXED_NO_CLOBBER ...)
> 
>         , then we could skip the munmap() call upon failure. This is a small thing, 
>         but it is useful here. (Thanks to Piotr Jaroszynski and Mark Hairgrove
>         for helping me get that detail exactly right, btw.)
> 
> c) After that, CUDA suballocates from p, via: 
>  
>      q = mmap(sub_region_start, ... MAP_FIXED ...)
> 
> Interestingly enough, "freeing" is also done via MAP_FIXED, and setting PROT_NONE
> to the subregion. Anyway, I just included (c) for general interest.

OK, I will add this to the changelog. This is basically the "Atomic
address range probing in the multithreaded programs" I've had in the
cover letter.

> I expect that as we continue working on the open source compute software stack,
> this new capability will be useful there, too.
> 
> Oh, and on the naming, when I described how your implementation worked (without
> naming it) to Piotr, he said, "oh, something like map-fixed-no-clobber?". So I
> think my miniature sociology naming data point here can bolster the case ever so
> slightly for calling it MAP_FIXED_NO_CLOBBER. haha. :)

I will be probably stubborn and go with a shorter name I have currently.
I am not very fond-of-very-long-names.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-22 13:12             ` Michal Hocko
@ 2017-11-22 13:20               ` Vlastimil Babka
  0 siblings, 0 replies; 22+ messages in thread
From: Vlastimil Babka @ 2017-11-22 13:20 UTC (permalink / raw)
  To: Michal Hocko, John Hubbard
  Cc: linux-api, Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Abdul Haleem, Joel Stanley, Kees Cook

On 11/22/2017 02:12 PM, Michal Hocko wrote:
> I will be probably stubborn and go with a shorter name I have currently.
> I am not very fond-of-very-long-names.

The short synonym for the last word is "German"

SCNR :P

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-16 12:14   ` [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
       [not found]     ` <20171116121438.6vegs4wiahod3byl-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2017-11-24  8:54     ` Michal Hocko
  2017-11-27 15:51       ` Khalid Aziz
  1 sibling, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-11-24  8:54 UTC (permalink / raw)
  To: linux-api
  Cc: Khalid Aziz, Michael Ellerman, Andrew Morton,
	Russell King - ARM Linux, Andrea Arcangeli, linux-mm, LKML,
	linux-arch, Abdul Haleem, Joel Stanley, Kees Cook

Are there any more concerns? So far the biggest one was the name. The
other which suggests a flag as a modifier has been sorted out hopefully.
Is there anymore more before we can consider this for merging? Well
except for man page update which I will prepare of course. Can we target
this to 4.16?

On Thu 16-11-17 13:14:38, Michal Hocko wrote:
> [Ups, managed to screw the subject - fix it]
> 
> On Thu 16-11-17 11:18:58, Michal Hocko wrote:
> > Hi,
> > this has started as a follow up discussion [1][2] resulting in the
> > runtime failure caused by hardening patch [3] which removes MAP_FIXED
> > from the elf loader because MAP_FIXED is inherently dangerous as it
> > might silently clobber and existing underlying mapping (e.g. stack). The
> > reason for the failure is that some architectures enforce an alignment
> > for the given address hint without MAP_FIXED used (e.g. for shared or
> > file backed mappings).
> > 
> > One way around this would be excluding those archs which do alignment
> > tricks from the hardening [4]. The patch is really trivial but it has
> > been objected, rightfully so, that this screams for a more generic
> > solution. We basically want a non-destructive MAP_FIXED.
> > 
> > The first patch introduced MAP_FIXED_SAFE which enforces the given
> > address but unlike MAP_FIXED it fails with ENOMEM if the given range
> > conflicts with an existing one. The flag is introduced as a completely
> > new flag rather than a MAP_FIXED extension because of the backward
> > compatibility. We really want a never-clobber semantic even on older
> > kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
> > flags evaluation because we do not EINVAL on unknown flags. On those
> > kernels we would simply use the traditional hint based semantic so the
> > caller can still get a different address (which sucks) but at least not
> > silently corrupt an existing mapping. I do not see a good way around
> > that. Except we won't export expose the new semantic to the userspace at
> > all. It seems there are users who would like to have something like that
> > [5], though. Atomic address range probing in the multithreaded programs
> > sounds like an interesting thing to me as well, although I do not have
> > any specific usecase in mind.
> > 
> > The second patch simply replaces MAP_FIXED use in elf loader by
> > MAP_FIXED_SAFE. I believe other places which rely on MAP_FIXED should
> > follow. Actually real MAP_FIXED usages should be docummented properly
> > and they should be more of an exception.
> > 
> > Does anybody see any fundamental reasons why this is a wrong approach?
> > 
> > Diffstat says
> >  arch/alpha/include/uapi/asm/mman.h   |  2 ++
> >  arch/metag/kernel/process.c          |  6 +++++-
> >  arch/mips/include/uapi/asm/mman.h    |  2 ++
> >  arch/parisc/include/uapi/asm/mman.h  |  2 ++
> >  arch/powerpc/include/uapi/asm/mman.h |  1 +
> >  arch/sparc/include/uapi/asm/mman.h   |  1 +
> >  arch/tile/include/uapi/asm/mman.h    |  1 +
> >  arch/xtensa/include/uapi/asm/mman.h  |  2 ++
> >  fs/binfmt_elf.c                      | 12 ++++++++----
> >  include/uapi/asm-generic/mman.h      |  1 +
> >  mm/mmap.c                            | 11 +++++++++++
> >  11 files changed, 36 insertions(+), 5 deletions(-)
> > 
> > [1] http://lkml.kernel.org/r/20171107162217.382cd754@canb.auug.org.au
> > [2] http://lkml.kernel.org/r/1510048229.12079.7.camel@abdul.in.ibm.com
> > [3] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko@kernel.org
> > [4] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y@dhcp22.suse.cz
> > [5] http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au
> > 
> > 
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE
  2017-11-24  8:54     ` Michal Hocko
@ 2017-11-27 15:51       ` Khalid Aziz
  0 siblings, 0 replies; 22+ messages in thread
From: Khalid Aziz @ 2017-11-27 15:51 UTC (permalink / raw)
  To: Michal Hocko, linux-api
  Cc: Michael Ellerman, Andrew Morton, Russell King - ARM Linux,
	Andrea Arcangeli, linux-mm, LKML, linux-arch, Abdul Haleem,
	Joel Stanley, Kees Cook

On 11/24/2017 01:54 AM, Michal Hocko wrote:
> Are there any more concerns? So far the biggest one was the name. The
> other which suggests a flag as a modifier has been sorted out hopefully.
> Is there anymore more before we can consider this for merging? Well
> except for man page update which I will prepare of course. Can we target
> this to 4.16?

I do not have concerns with this approach. I prefer a new flag as 
opposed to a modifier and the name is ok by me.

Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-11-27 15:51 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-16 10:18 (unknown), Michal Hocko
2017-11-16 10:18 ` [RFC PATCH 1/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
2017-11-17  0:27   ` Kees Cook
     [not found]     ` <CAGXu5jKssQCcYcZujvQeFy5LTzhXSW=f-a0riB=4+caT1i38BQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-11-17 19:12       ` Matthew Wilcox
2017-11-20  8:43         ` Michal Hocko
2017-11-17  7:30   ` Florian Weimer
2017-11-20  8:55     ` Michal Hocko
2017-11-20  9:10       ` Florian Weimer
     [not found]         ` <37a6e9ba-e0df-b65f-d5ef-871c25b5cb87-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-11-20  9:33           ` Michal Hocko
2017-11-20  9:45             ` Florian Weimer
2017-11-17  8:37   ` John Hubbard
2017-11-20  9:02     ` Michal Hocko
2017-11-16 10:19 ` [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map Michal Hocko
2017-11-17  0:30   ` Kees Cook
     [not found] ` <20171116101900.13621-1-mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-11-16 12:14   ` [RFC PATCH 0/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
     [not found]     ` <20171116121438.6vegs4wiahod3byl-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2017-11-17  8:45       ` John Hubbard
2017-11-20  9:05         ` Michal Hocko
2017-11-22  1:48           ` John Hubbard
2017-11-22 13:12             ` Michal Hocko
2017-11-22 13:20               ` Vlastimil Babka
2017-11-24  8:54     ` Michal Hocko
2017-11-27 15:51       ` Khalid Aziz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).