linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS)
@ 2009-02-06 17:10 Gerald Schaefer
  2009-02-06 21:44 ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Gerald Schaefer @ 2009-02-06 17:10 UTC (permalink / raw)
  To: Roland McGrath
  Cc: linux-kernel, Martin Schwidefsky, Heiko Carstens, Andrew Morton,
	Linus Torvalds

Hi,

elf_core_dump() does a set_fs(KERNEL_DS) and then calls vma_dump_size(),
which uses get_user() to check for an ELF header at vma->vm_start in the
user mapping. This is a bug because vm_start is a user virtual address and
get_user() will fail or even read from a kernel address (KERNEL_DS).

Maybe a get_user_pages() should be used to get the user data, or a temporary
set_fs(USER_DS)?

--
Gerald



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS)
  2009-02-06 17:10 [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS) Gerald Schaefer
@ 2009-02-06 21:44 ` Andrew Morton
  2009-02-06 21:57   ` Linus Torvalds
  2009-02-06 22:07   ` Roland McGrath
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2009-02-06 21:44 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: roland, linux-kernel, schwidefsky, heiko.carstens, torvalds

On Fri, 06 Feb 2009 18:10:35 +0100
Gerald Schaefer <gerald.schaefer@de.ibm.com> wrote:

> Hi,
> 
> elf_core_dump() does a set_fs(KERNEL_DS) and then calls vma_dump_size(),
> which uses get_user() to check for an ELF header at vma->vm_start in the
> user mapping. This is a bug because vm_start is a user virtual address and
> get_user() will fail or even read from a kernel address (KERNEL_DS).
> 
> Maybe a get_user_pages() should be used to get the user data, or a temporary
> set_fs(USER_DS)?
> 

Could use __get_user() to skip the access_ok() check?

We'd need to be sure that the address isn't a kernel address or iomem
or something. 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS)
  2009-02-06 21:44 ` Andrew Morton
@ 2009-02-06 21:57   ` Linus Torvalds
  2009-02-06 22:07   ` Roland McGrath
  1 sibling, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2009-02-06 21:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Gerald Schaefer, roland, linux-kernel, schwidefsky, heiko.carstens



On Fri, 6 Feb 2009, Andrew Morton wrote:

> On Fri, 06 Feb 2009 18:10:35 +0100
> Gerald Schaefer <gerald.schaefer@de.ibm.com> wrote:
> > 
> > elf_core_dump() does a set_fs(KERNEL_DS) and then calls vma_dump_size(),
> > which uses get_user() to check for an ELF header at vma->vm_start in the
> > user mapping. This is a bug because vm_start is a user virtual address and
> > get_user() will fail or even read from a kernel address (KERNEL_DS).
> > 
> > Maybe a get_user_pages() should be used to get the user data, or a temporary
> > set_fs(USER_DS)?
> > 
> 
> Could use __get_user() to skip the access_ok() check?

That's not the problem.

The problem is that
 (a) some architectures actually use separate address spaces (sparc, at 
     least), so using "get_user()" while you are in KERNEL_DS will 
     literally access the wrong thing.

     __get_user doesn't affect this part.

 (b) the security one: KERNEL_DS will change the access checks that 
     get_user() does, and allow access to kernel memory.

     Using __get_user _does_ affect this part by removing the checks, but 
     since the problem was that KERNEL_DS already _weakened_ the checks, 
     removign the too-weak checking doesn't actually help.

> We'd need to be sure that the address isn't a kernel address or iomem
> or something. 

As mentioned, even this won't actually help. On at least some sparc 
machines (sparc64?), memory accesses really are segmented, with different 
address spaces for user and kernel mode, so a user and kernel address may 
actually have the exact same pointer value, but point to different memory!

So you can't just check the address. You really have to do the whole 
"set_fs()" thing to set the segment register.

		Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS)
  2009-02-06 21:44 ` Andrew Morton
  2009-02-06 21:57   ` Linus Torvalds
@ 2009-02-06 22:07   ` Roland McGrath
  2009-02-06 22:18     ` David Miller
  2009-02-06 22:19     ` Linus Torvalds
  1 sibling, 2 replies; 8+ messages in thread
From: Roland McGrath @ 2009-02-06 22:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Gerald Schaefer, linux-kernel, schwidefsky, heiko.carstens, torvalds

The address in question comes from a user vma's vm_start, so by definition
it has to be in the user part of the address space.  This code path is
excluded for VM_IO and the like by checks earlier in vma_dump_size.
So, is this problem purely theoretical?  I guess not on machines where
set_fs actually changes the meaning of the lower address space.

I think get_user_pages would certainly be overkill for this.  It's a check
to decide whether you need to pay the cost of get_user_pages, after all.

set_fs is quite cheap at least on most machines.  So a pair of set_fs calls
around that get_user call doesn't seem so bad.  OTOH, on the machines where
this actually matters at all (maybe just sparc, arm, s390?) it is
presumably (much?) more costly.  But it seems like the best solution, and
certainly is straightforward.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS)
  2009-02-06 22:07   ` Roland McGrath
@ 2009-02-06 22:18     ` David Miller
  2009-02-06 22:19     ` Linus Torvalds
  1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2009-02-06 22:18 UTC (permalink / raw)
  To: roland
  Cc: akpm, gerald.schaefer, linux-kernel, schwidefsky, heiko.carstens,
	torvalds

From: Roland McGrath <roland@redhat.com>
Date: Fri,  6 Feb 2009 14:07:17 -0800 (PST)

> set_fs is quite cheap at least on most machines.  So a pair of set_fs calls
> around that get_user call doesn't seem so bad.  OTOH, on the machines where
> this actually matters at all (maybe just sparc, arm, s390?) it is
> presumably (much?) more costly.  But it seems like the best solution, and
> certainly is straightforward.

On sparc set_fs() is just a privileged register write, so pretty cheap
and definitely less expensive than get_user_pages() :-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS)
  2009-02-06 22:07   ` Roland McGrath
  2009-02-06 22:18     ` David Miller
@ 2009-02-06 22:19     ` Linus Torvalds
  2009-02-07  1:55       ` [PATCH] elf core dump: fix get_user use Roland McGrath
  2009-02-07  1:55       ` [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS) Roland McGrath
  1 sibling, 2 replies; 8+ messages in thread
From: Linus Torvalds @ 2009-02-06 22:19 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Gerald Schaefer, linux-kernel, schwidefsky,
	heiko.carstens



On Fri, 6 Feb 2009, Roland McGrath wrote:
> 
> set_fs is quite cheap at least on most machines.  So a pair of set_fs calls
> around that get_user call doesn't seem so bad.  OTOH, on the machines where
> this actually matters at all (maybe just sparc, arm, s390?) it is
> presumably (much?) more costly.  But it seems like the best solution, and
> certainly is straightforward.

Yes, I suspect just surrounding the load with set_fs(USER_DS) and then 
set_fs(KERNEL_DS) to put it back is likely the right thing to do.

The address is "safe" in that it does come from the vma, but we obviously 
do play games with things like the call gate mappings etc. Should we also 
perhaps do this only if the vma is marked readable and executable? That 
way we could avoid taking unnecessary faults there.

Probably doesn't really matter. 

			Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] elf core dump: fix get_user use
  2009-02-06 22:19     ` Linus Torvalds
@ 2009-02-07  1:55       ` Roland McGrath
  2009-02-07  1:55       ` [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS) Roland McGrath
  1 sibling, 0 replies; 8+ messages in thread
From: Roland McGrath @ 2009-02-07  1:55 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton
  Cc: linux-kernel, Gerald Schaefer, Martin Schwidefsky, Heiko.Carstens

The following changes since commit 6cec50838ed04a9833fb5549f698d3756bbe7e72:
  Linus Torvalds (1):
        Merge branch 'for-linus' of git://git.kernel.org/.../tiwai/sound-2.6

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland.git to-linus

Roland McGrath (1):
      elf core dump: fix get_user use

 fs/binfmt_elf.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

Thanks,
Roland

---
[PATCH] elf core dump: fix get_user use

The elf_core_dump() code does its work with set_fs(KERNEL_DS) in force,
so vma_dump_size() needs to switch back with set_fs(USER_DS) to safely
use get_user() for a normal user-space address.

Checking for VM_READ optimizes out the case where get_user() would fail
anyway.  The vm_file check here was already superfluous given the control
flow earlier in the function, so that is a cleanup/optimization unrelated
to other changes but an obvious and trivial one.

Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
---
 fs/binfmt_elf.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index e3ff2b9..33b7235 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1208,9 +1208,11 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
 	 * check for an ELF header.  If we find one, dump the first page to
 	 * aid in determining what was mapped here.
 	 */
-	if (FILTER(ELF_HEADERS) && vma->vm_file != NULL && vma->vm_pgoff == 0) {
+	if (FILTER(ELF_HEADERS) &&
+	    vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) {
 		u32 __user *header = (u32 __user *) vma->vm_start;
 		u32 word;
+		mm_segment_t fs = get_fs();
 		/*
 		 * Doing it this way gets the constant folded by GCC.
 		 */
@@ -1223,7 +1225,15 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
 		magic.elfmag[EI_MAG1] = ELFMAG1;
 		magic.elfmag[EI_MAG2] = ELFMAG2;
 		magic.elfmag[EI_MAG3] = ELFMAG3;
-		if (get_user(word, header) == 0 && word == magic.cmp)
+		/*
+		 * Switch to the user "segment" for get_user(),
+		 * then put back what elf_core_dump() had in place.
+		 */
+		set_fs(USER_DS);
+		if (unlikely(get_user(word, header)))
+			word = 0;
+		set_fs(fs);
+		if (word == magic.cmp)
 			return PAGE_SIZE;
 	}
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS)
  2009-02-06 22:19     ` Linus Torvalds
  2009-02-07  1:55       ` [PATCH] elf core dump: fix get_user use Roland McGrath
@ 2009-02-07  1:55       ` Roland McGrath
  1 sibling, 0 replies; 8+ messages in thread
From: Roland McGrath @ 2009-02-07  1:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Gerald Schaefer, linux-kernel, schwidefsky,
	heiko.carstens

> Yes, I suspect just surrounding the load with set_fs(USER_DS) and then 
> set_fs(KERNEL_DS) to put it back is likely the right thing to do.

Agreed.

> The address is "safe" in that it does come from the vma, but we obviously 
> do play games with things like the call gate mappings etc. 

gate_vma has VM_ALWAYSDUMP so it never sees this whole path anyway.
I doubt there is any actual case.  But, point taken.

> Should we also perhaps do this only if the vma is marked readable and
> executable? That way we could avoid taking unnecessary faults there.
> 
> Probably doesn't really matter. 

I'm sure it doesn't really matter.  But, just to say it all: Requiring
VM_EXEC would actually exclude some valid cases.  Even requiring VM_READ is
less than perfect as far as pedantic semantics go--but there is no reason
not to check it since its lack rules out get_user() actually working
anyway.  I already chose that trade-off since get_user() here is so much
cheaper than get_user_pages().  The core dump from a stray
"mprotect(0,1UL<<32,PROT_NONE)" (i.e. presumably actually one with some
arithmetic error) would be useful if we made the check work without
VM_READ, but oh well.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-02-07  1:56 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-06 17:10 [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS) Gerald Schaefer
2009-02-06 21:44 ` Andrew Morton
2009-02-06 21:57   ` Linus Torvalds
2009-02-06 22:07   ` Roland McGrath
2009-02-06 22:18     ` David Miller
2009-02-06 22:19     ` Linus Torvalds
2009-02-07  1:55       ` [PATCH] elf core dump: fix get_user use Roland McGrath
2009-02-07  1:55       ` [BUG] binfmt_elf: get_user() called in vma_dump_size() after set_fs(KERNEL_DS) Roland McGrath

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).