From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>, Baoquan He <bhe@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	James Hogan <james.hogan@mips.com>, Jiri Kosina <jkosina@suse.cz>,
	Kees Cook <keescook@chromium.org>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: [PATCH] fs, elf: drop MAP_FIXED usage from elf_map
Date: Tue, 31 Oct 2017 14:38:54 +0100	[thread overview]
Message-ID: <20171031133854.3enoq3pizuttrio4@dhcp22.suse.cz> (raw)
In-Reply-To: <20171023082608.6167-1-mhocko@kernel.org>

I realize that people were at LS and OSS last week and are busy, but I
would like to ping you guys here so that this doesn't fall through the
cracks...

On Mon 23-10-17 10:26:07, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Both load_elf_interp and load_elf_binary rely on elf_map to map segments
> at a controlled address and they use MAP_FIXED to enforce that. This is,
> however, dangerous: it is prone to silent data corruption which can even
> be exploitable. Let's take CVE-2017-1000253 as an example. At the time
> (before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"))
> ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2, which is not that far away from
> the stack top in the 32b (legacy) memory layout (only 1GB away). Therefore
> we could end up mapping over the existing stack with some luck.
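
To make those numbers concrete, a rough back-of-the-envelope sketch,
assuming TASK_SIZE = 3GB for the legacy 32b split (exact values depend on
the configuration):

	ELF_ET_DYN_BASE = TASK_SIZE / 3 * 2 ~= 2GB
	stack top       ~= 3GB (just below TASK_SIZE, growing down)
	gap             ~= 1GB

so a large enough PIE load or a deep enough stack could bridge that gap,
and the MAP_FIXED mapping would silently replace part of the stack.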
> 
> The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
> fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE has been moved much
> further away from the stack (eab09532d400 and later c715b72c1ba4 ("mm:
> revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and excessive
> stack consumption early during execve has been fully stopped by da029c11e6b1
> ("exec: Limit arg stack to at most 75% of _STK_LIM"). So we should be
> safe and any attack should be impractical. On the other hand, this is
> too subtle an assumption, so it can break quite easily and would be hard
> to spot.
> 
> I believe that the MAP_FIXED usage in load_elf_binary (et al.) is still
> fundamentally dangerous. Moreover it shouldn't even be needed. We are
> at an early stage of the process setup, so there shouldn't be any unrelated
> mappings (except for the stack and the loader), and an mmap for a given
> address should succeed even without MAP_FIXED. If that is not the case,
> something is terribly wrong and we should rather fail than silently
> corrupt the underlying mapping.
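
To illustrate the failure mode outside of the ELF loader, here is a minimal
userspace sketch (illustration only, not the kernel code; error handling
omitted for brevity). MAP_FIXED silently replaces whatever is already mapped
at the requested address, while passing the address merely as a hint lets
the caller detect the conflict by comparing the returned address, which is
exactly the check the patch adds on the kernel side:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 4096;

	/* An existing mapping, standing in for e.g. the stack or the loader. */
	void *existing = mmap(NULL, len, PROT_READ | PROT_WRITE,
			      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/* MAP_FIXED clobbers the old mapping without reporting any error. */
	void *fixed = mmap(existing, len, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	printf("MAP_FIXED: requested %p, got %p (old contents silently gone)\n",
	       existing, fixed);

	/*
	 * Without MAP_FIXED the address is only a hint; a mismatch between
	 * the hint and the returned address tells the caller the range was
	 * already busy, which is what the new elf_vm_mmap helper checks for.
	 */
	void *hinted = mmap(existing, len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (hinted != existing)
		printf("hint only: requested %p, got %p -> conflict detected\n",
		       existing, hinted);

	return 0;
}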
> 
> Address this issue by adding a helper, elf_vm_mmap, used by elf_map, which
> drops MAP_FIXED when asking for the mapping and then checks whether the
> returned address really matches what the caller asked for; if it does not,
> it complains loudly and fails. Such a failure would be a kernel bug and
> should prompt us to look into what has gone wrong.
> 
> Changes since v1
> - metag duplicates elf_map to handle its tightly coupled memory (TCM)
>   segments. In case the mapping is not TCM based we still have to be
>   careful about MAP_FIXED, so elf_vm_mmap is duplicated there as well
>   (reusing the generic helper seems rather problematic due to include
>   header dependency hell).
> 
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Jiri Kosina <jkosina@suse.cz>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: James Hogan <james.hogan@mips.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
> Hi,
> I previously posted this as an RFC [1] and it seems there were no
> fundamental objections. In this version I have fixed up the metag issue
> pointed out by Kees. I have also dropped the second patch because Kees
> saw a potential danger in it [2]. I cannot say I am convinced, but the
> second patch is not really required for this one to go in.
> 
> I believe this is a preferable way to handle potential early process
> address space conflicts compared to silent corruption with potential
> security implications. I haven't marked this patch for stable because it
> doesn't fix any real issue right now, but I would recommend applying it
> as a preventive measure because PIE vs. stack randomization has seen some
> exploitable issues in the recent past.
> 
> I am not sure which tree to push this through. Andrew, would you be
> willing to take it via mmotm (once acked of course)?
> 
> [1] http://lkml.kernel.org/r/20171016134446.19910-1-mhocko@kernel.org
> [2] http://lkml.kernel.org/r/20171016184335.hj6osq7su24e75jz@dhcp22.suse.cz
> 
>  arch/metag/kernel/process.c | 27 +++++++++++++++++++++++++--
>  fs/binfmt_elf.c             | 29 ++++++++++++++++++++++++++---
>  2 files changed, 51 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/metag/kernel/process.c b/arch/metag/kernel/process.c
> index c4606ce743d2..b20596b4c4c2 100644
> --- a/arch/metag/kernel/process.c
> +++ b/arch/metag/kernel/process.c
> @@ -378,6 +378,29 @@ int dump_fpu(struct pt_regs *regs, elf_fpregset_t *fpu)
>  
>  #define BAD_ADDR(x) ((unsigned long)(x) >= TASK_SIZE)
>  
> +static unsigned long elf_vm_mmap(struct file *filep, unsigned long addr,
> +		unsigned long size, int prot, int type, unsigned long off)
> +{
> +	unsigned long map_addr;
> +
> +	/*
> +	 * If caller requests the mapping at a specific place, make sure we fail
> +	 * rather than potentially clobber an existing mapping which can have
> +	 * security consequences (e.g. smash over the stack area).
> +	 */
> +	map_addr = vm_mmap(filep, addr, size, prot, type & ~MAP_FIXED, off);
> +	if (BAD_ADDR(map_addr))
> +		return map_addr;
> +
> +	if ((type & MAP_FIXED) && map_addr != addr) {
> +		pr_info("Uhuuh, elf segment at %p requested but the memory is mapped already\n",
> +				(void *)addr);
> +		return -EAGAIN;
> +	}
> +
> +	return map_addr;
> +}
> +
>  unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
>  			      struct elf_phdr *eppnt, int prot, int type,
>  			      unsigned long total_size)
> @@ -410,11 +433,11 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
>  	*/
>  	if (total_size) {
>  		total_size = ELF_PAGEALIGN(total_size);
> -		map_addr = vm_mmap(filep, addr, total_size, prot, type, off);
> +		map_addr = elf_vm_mmap(filep, addr, total_size, prot, type, off);
>  		if (!BAD_ADDR(map_addr))
>  			vm_munmap(map_addr+size, total_size-size);
>  	} else
> -		map_addr = vm_mmap(filep, addr, size, prot, type, off);
> +		map_addr = elf_vm_mmap(filep, addr, size, prot, type, off);
>  
>  	if (!BAD_ADDR(map_addr) && tcm_tag != TCM_INVALID_TAG) {
>  		struct tcm_allocation *tcm;
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 6466153f2bf0..09456e2add18 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -341,6 +341,29 @@ create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
>  
>  #ifndef elf_map
>  
> +static unsigned long elf_vm_mmap(struct file *filep, unsigned long addr,
> +		unsigned long size, int prot, int type, unsigned long off)
> +{
> +	unsigned long map_addr;
> +
> +	/*
> +	 * If caller requests the mapping at a specific place, make sure we fail
> +	 * rather than potentially clobber an existing mapping which can have
> +	 * security consequences (e.g. smash over the stack area).
> +	 */
> +	map_addr = vm_mmap(filep, addr, size, prot, type & ~MAP_FIXED, off);
> +	if (BAD_ADDR(map_addr))
> +		return map_addr;
> +
> +	if ((type & MAP_FIXED) && map_addr != addr) {
> +		pr_info("Uhuuh, elf segment at %p requested but the memory is mapped already\n",
> +				(void *)addr);
> +		return -EAGAIN;
> +	}
> +
> +	return map_addr;
> +}
> +
>  static unsigned long elf_map(struct file *filep, unsigned long addr,
>  		struct elf_phdr *eppnt, int prot, int type,
>  		unsigned long total_size)
> @@ -366,11 +389,11 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
>  	*/
>  	if (total_size) {
>  		total_size = ELF_PAGEALIGN(total_size);
> -		map_addr = vm_mmap(filep, addr, total_size, prot, type, off);
> +		map_addr = elf_vm_mmap(filep, addr, total_size, prot, type, off);
>  		if (!BAD_ADDR(map_addr))
>  			vm_munmap(map_addr+size, total_size-size);
>  	} else
> -		map_addr = vm_mmap(filep, addr, size, prot, type, off);
> +		map_addr = elf_vm_mmap(filep, addr, size, prot, type, off);
>  
>  	return(map_addr);
>  }
> @@ -1215,7 +1238,7 @@ static int load_elf_library(struct file *file)
>  		eppnt++;
>  
>  	/* Now use mmap to map the library into memory. */
> -	error = vm_mmap(file,
> +	error = elf_vm_mmap(file,
>  			ELF_PAGESTART(eppnt->p_vaddr),
>  			(eppnt->p_filesz +
>  			 ELF_PAGEOFFSET(eppnt->p_vaddr)),
> -- 
> 2.14.2
> 

-- 
Michal Hocko
SUSE Labs
