[1/2] VM/SELinux: require CAP_SYS_RAWIO for all mmap_zero operations
diff mbox series

Message ID 20090721144157.14159.23439.stgit@paris.rdu.redhat.com
State New, archived
Headers show
Series
  • [1/2] VM/SELinux: require CAP_SYS_RAWIO for all mmap_zero operations
Related show

Commit Message

Eric Paris July 21, 2009, 2:41 p.m. UTC
Currently non-SELinux systems need CAP_SYS_RAWIO for an application to mmap
the 0 page.  On SELinux systems they need a specific SELinux permission,
but do not need CAP_SYS_RAWIO.  This has proved to be a poor decision by
the SELinux team as, by default, SELinux users are logged in unconfined and
thus a malicious non-root has nothing stopping them from mapping the 0 page
of virtual memory.

On a non-SELinux system, a malicious non-root user is unable to do this, as
they need CAP_SYS_RAWIO.

This patch checks CAP_SYS_RAWIO for all operations which attemt to map a
page below mmap_min_addr.

Signed-off-by: Eric Paris <eparis@redhat.com>
---

 include/linux/security.h |    2 --
 mm/mmap.c                |   10 ++++++++++
 mm/mremap.c              |    8 ++++++++
 mm/nommu.c               |    3 +++
 security/capability.c    |    2 --
 5 files changed, 21 insertions(+), 4 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Alan Cox July 21, 2009, 3:04 p.m. UTC | #1
On Tue, 21 Jul 2009 10:41:58 -0400
Eric Paris <eparis@redhat.com> wrote:

> Currently non-SELinux systems need CAP_SYS_RAWIO for an application to mmap
> the 0 page.  On SELinux systems they need a specific SELinux permission,
> but do not need CAP_SYS_RAWIO.  This has proved to be a poor decision by
> the SELinux team as, by default, SELinux users are logged in unconfined and
> thus a malicious non-root has nothing stopping them from mapping the 0 page
> of virtual memory.

So the poor decision is in fact that the SELinux users start as
unconfined rather than taking away a few things like the the min map
stuff which can then be given back to specific apps.

> On a non-SELinux system, a malicious non-root user is unable to do this, as
> they need CAP_SYS_RAWIO.
> 
> This patch checks CAP_SYS_RAWIO for all operations which attemt to map a
> page below mmap_min_addr.

So "wine" now needs to run with CAP_SYS_RAWIO or you turn all the
security off.

Am I missing something here, this "solution" sounds completely brain
dead ?

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Eric Paris July 21, 2009, 3:18 p.m. UTC | #2
On Tue, 2009-07-21 at 16:04 +0100, Alan Cox wrote:
> On Tue, 21 Jul 2009 10:41:58 -0400
> Eric Paris <eparis@redhat.com> wrote:
> 
> > Currently non-SELinux systems need CAP_SYS_RAWIO for an application to mmap
> > the 0 page.  On SELinux systems they need a specific SELinux permission,
> > but do not need CAP_SYS_RAWIO.  This has proved to be a poor decision by
> > the SELinux team as, by default, SELinux users are logged in unconfined and
> > thus a malicious non-root has nothing stopping them from mapping the 0 page
> > of virtual memory.
> 
> So the poor decision is in fact that the SELinux users start as
> unconfined rather than taking away a few things like the the min map
> stuff which can then be given back to specific apps.

See Fedora Core 2 of an example why confining users causes people to
just turn off SELinux.  In F10 we had around 90% SELinux enforcing of
machines the reported to smolt for more than 3 months.  We are trying to
slowly reign people in with SELinux and confining the user is one of
those near impossibilities without breaking every power user out
there....

> > On a non-SELinux system, a malicious non-root user is unable to do this, as
> > they need CAP_SYS_RAWIO.
> > 
> > This patch checks CAP_SYS_RAWIO for all operations which attemt to map a
> > page below mmap_min_addr.
> 
> So "wine" now needs to run with CAP_SYS_RAWIO or you turn all the
> security off.

Now, as in today, yes.  To run wine today it either needs CAP_SYS_RAWIO
or you disable for the whole system.  I noted that I believe ubuntu just
turns it off for the whole system when you install WINE.  This patch
doesn't change that fact.  All it does is add that requirement to
SELinux systems that already exists on non-selinux systems.

> Am I missing something here, this "solution" sounds completely brain
> dead ?

Well, with patch 2/2 you still get your SELinux protections (only for 1
page) even if you disable it for the whole system.  So in the end, you
have better protection than you have today with this series....

-Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Alan Cox July 21, 2009, 3:38 p.m. UTC | #3
> turns it off for the whole system when you install WINE.  This patch
> doesn't change that fact.  All it does is add that requirement to
> SELinux systems that already exists on non-selinux systems.

Prior to this an SELinux system could implement sensible security. The
fact Fedora didn't was paranoia and fear over a bad experience made what
five years ago ?

> > Am I missing something here, this "solution" sounds completely brain
> > dead ?
> 
> Well, with patch 2/2 you still get your SELinux protections (only for 1
> page) even if you disable it for the whole system.  So in the end, you
> have better protection than you have today with this series....

We know one page isn't sufficient. That has been seen from some exploit
cases.

So this looks to me like a regression in features, that makes the system
less secure and doesn't solve anything at all.

Whereas if you just set the default SELinux user confinement to allow
everything but mapping low pages you wouldn't actually need to mess up
the kernel ?

Currently I have low page protection and I don't have to run wine as
CAP_SYS_RAWIO (which comes in the "sucidial ideas") category. I consider
the loss of that ability a regression.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Eric Paris July 21, 2009, 3:57 p.m. UTC | #4
On Tue, 2009-07-21 at 16:38 +0100, Alan Cox wrote:
> > turns it off for the whole system when you install WINE.  This patch
> > doesn't change that fact.  All it does is add that requirement to
> > SELinux systems that already exists on non-selinux systems.
> 
> Prior to this an SELinux system could implement sensible security. 

Sadly, it couldn't reasonably  :(

> > > Am I missing something here, this "solution" sounds completely brain
> > > dead ?
> > 
> > Well, with patch 2/2 you still get your SELinux protections (only for 1
> > page) even if you disable it for the whole system.  So in the end, you
> > have better protection than you have today with this series....
> 
> We know one page isn't sufficient. That has been seen from some exploit
> cases.

Yet another tunable is easy, and what I mentioned yesterday, i'd
probably put it in /selinux since it would be an selinux only thing.
This just seemed reasonable, since the Kconfig default is 4096 that's
what most people have anyway right?

> So this looks to me like a regression in features, that makes the system
> less secure and doesn't solve anything at all.
> 
> Whereas if you just set the default SELinux user confinement to allow
> everything but mapping low pages you wouldn't actually need to mess up
> the kernel ?

Herein lies the problem.  It sounds easy to do, but isn't.  Sure I can
remove mmap_zero from unconfined_t (and actually it should be that way
in rawhide by default by now) but like I said, it's not even a speed
bump to be that broad.  

runcon -t wine_t [my exploit]
win.

So now I have to stop allowing unconfined_t to specifically run things
as wine_t.  Easy enough to get around

chcon -t wine_exec_t [my exploit]
win.

Well crap, now I have to stop letting unconfined_t label things
wine_exec_t.  Easy enough to get around if you can load it as an rpm
(ok, this step is probably harder)

and hell, how do I know I can't just get wine some windows program to
get win to map the page for me?

Finding all of the contortions that an unconfined user can do is nearly
impossible.  It's one of the reasons a lot of selinux people argued
against the unconfined domain to begin with.  There are some analysis
tools used in high security environments to prove security goals but
unconfined is such a monstrosity it's too hard to get a handle on.  Make
everyone log in as user_t (man semanage) and you will be better (but I
haven't proven it is safe...)

> Currently I have low page protection and I don't have to run wine as
> CAP_SYS_RAWIO (which comes in the "sucidial ideas") category. I consider
> the loss of that ability a regression.

and you still could.  Just set mmap_min_addr = 0 and you get SELinux
protection for confined domains.  I'll gladly add an selinux tunable if
people like it so SELinux users who don't want to enforce the uid=0 rule
can do exactly everything they can do today.

Someone on this list has to know a wine guru.  Seems to me there has to
be a way that we can give wine CAP_SYS_RAWIO just long enough to map the
page so non-SELinux users aren't left in the lurch they are today.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Alan Cox July 21, 2009, 4:09 p.m. UTC | #5
> This just seemed reasonable, since the Kconfig default is 4096 that's
> what most people have anyway right?

Yes - its rather inadequate when there are multi-page objects floating
around to exploit (remember you only need writes through data pointers for
exploits not executable code)

> runcon -t wine_t [my exploit]
> win.
> 
> So now I have to stop allowing unconfined_t to specifically run things
> as wine_t.  Easy enough to get around
> 
> chcon -t wine_exec_t [my exploit]
> win.
> 
> Well crap, now I have to stop letting unconfined_t label things
> wine_exec_t.  Easy enough to get around if you can load it as an rpm
> (ok, this step is probably harder)

If I can load it as an rpm then I'm the superuser and I already won so
that case isn't too bad really is it ?

So far you've described in SELinux the equivalent of

"user must not be able to chown and setuid their files to someone else"

which was hardly news in 1970 ;)

> and hell, how do I know I can't just get wine some windows program to
> get win to map the page for me?

Mathematical absolutes don't work here. How do you know 4K is enough, 64K
is enough - you never do that either. You can only make it a lot harder.

> unconfined is such a monstrosity it's too hard to get a handle on.  Make
> everyone log in as user_t (man semanage) and you will be better (but I
> haven't proven it is safe...)

Ok.

> and you still could.  Just set mmap_min_addr = 0 and you get SELinux
> protection for confined domains.  I'll gladly add an selinux tunable if
> people like it so SELinux users who don't want to enforce the uid=0 rule
> can do exactly everything they can do today.
> 
> Someone on this list has to know a wine guru.  Seems to me there has to
> be a way that we can give wine CAP_SYS_RAWIO just long enough to map the
> page so non-SELinux users aren't left in the lurch they are today.

I did look - but wine doesn't simply wrap itself around an application,
run it and die. The wine execution model is rather more complicated and
the need to map page zero can thus pop up later on. Some of the other
users you can do this with, and most of the other ones like LRMI of
course are CAP_SYS_RAWIO *anyway* as the main user of this stuff is vm86
bios execution.

It's a really ugly problem that almost begs for better hardware
facilities (such as the multiple independent address spaces in some
processors)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Eric Paris July 21, 2009, 4:23 p.m. UTC | #6
On Tue, 2009-07-21 at 17:09 +0100, Alan Cox wrote:

> It's a really ugly problem that almost begs for better hardware
> facilities (such as the multiple independent address spaces in some
> processors)

If only we knew someone who worked at intel.....   *smile*

Are you on board with the change I propose as long as I make the address
space controlled by SELinux tunable instead of fixed at one page?  Thus
allowing one to maintain the status quo?  Yeah, still sucks for
non-selinux systems and wine, but at least there can be hardening
against a non-root local authenticated user on a default fedora
install...

-Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Alan Cox July 21, 2009, 4:30 p.m. UTC | #7
On Tue, 21 Jul 2009 12:23:14 -0400
Eric Paris <eparis@redhat.com> wrote:

> On Tue, 2009-07-21 at 17:09 +0100, Alan Cox wrote:
> 
> > It's a really ugly problem that almost begs for better hardware
> > facilities (such as the multiple independent address spaces in some
> > processors)
> 
> If only we knew someone who worked at intel.....   *smile*
> 
> Are you on board with the change I propose as long as I make the address
> space controlled by SELinux tunable instead of fixed at one page?  Thus
> allowing one to maintain the status quo?  Yeah, still sucks for
> non-selinux systems and wine, but at least there can be hardening
> against a non-root local authenticated user on a default fedora
> install...

I don't like it much, but I have exactly zero better ideas. It's also
your area of expertise not mine, so I'm not going to object further.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Pavel Machek July 29, 2009, 3:06 p.m. UTC | #8
On Tue 2009-07-21 16:38:13, Alan Cox wrote:
> > turns it off for the whole system when you install WINE.  This patch
> > doesn't change that fact.  All it does is add that requirement to
> > SELinux systems that already exists on non-selinux systems.
> 
> Prior to this an SELinux system could implement sensible security. The
> fact Fedora didn't was paranoia and fear over a bad experience made what
> five years ago ?

OTOH enabling selinux previously only taken your rights away... which
is not the case with min_mmap_addr. Making it consistent seems useful.
								Pavel

Patch
diff mbox series

diff --git a/include/linux/security.h b/include/linux/security.h
index 1459091..f7d198a 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -2197,8 +2197,6 @@  static inline int security_file_mmap(struct file *file, unsigned long reqprot,
 				     unsigned long addr,
 				     unsigned long addr_only)
 {
-	if ((addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
-		return -EACCES;
 	return 0;
 }
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 34579b2..37fdc90 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1047,6 +1047,9 @@  unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
 		}
 	}
 
+	if ((addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
+		return -EACCES;
+
 	error = security_file_mmap(file, reqprot, prot, flags, addr, 0);
 	if (error)
 		return error;
@@ -1657,6 +1660,10 @@  static int expand_downwards(struct vm_area_struct *vma,
 		return -ENOMEM;
 
 	address &= PAGE_MASK;
+
+	if ((address < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
+		return -EACCES;
+
 	error = security_file_mmap(NULL, 0, 0, 0, address, 1);
 	if (error)
 		return error;
@@ -1998,6 +2005,9 @@  unsigned long do_brk(unsigned long addr, unsigned long len)
 	if (is_hugepage_only_range(mm, addr, len))
 		return -EINVAL;
 
+	if ((addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
+		return -EACCES;
+
 	error = security_file_mmap(NULL, 0, 0, 0, addr, 1);
 	if (error)
 		return error;
diff --git a/mm/mremap.c b/mm/mremap.c
index a39b7b9..066e73d 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -299,6 +299,10 @@  unsigned long do_mremap(unsigned long addr,
 		if ((addr <= new_addr) && (addr+old_len) > new_addr)
 			goto out;
 
+		ret = -EACCES;
+		if ((new_addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
+			goto out;
+
 		ret = security_file_mmap(NULL, 0, 0, 0, new_addr, 1);
 		if (ret)
 			goto out;
@@ -407,6 +411,10 @@  unsigned long do_mremap(unsigned long addr,
 				goto out;
 			}
 
+			ret = -EACCES;
+			if ((new_addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
+				goto out;
+
 			ret = security_file_mmap(NULL, 0, 0, 0, new_addr, 1);
 			if (ret)
 				goto out;
diff --git a/mm/nommu.c b/mm/nommu.c
index 53cab10..c1f3eff 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -995,6 +995,9 @@  static int validate_mmap_request(struct file *file,
 	}
 
 	/* allow the security API to have its say */
+	if ((addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
+		return -EACCES;
+
 	ret = security_file_mmap(file, reqprot, prot, flags, addr, 0);
 	if (ret < 0)
 		return ret;
diff --git a/security/capability.c b/security/capability.c
index f218dd3..a3a5d9b 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -334,8 +334,6 @@  static int cap_file_mmap(struct file *file, unsigned long reqprot,
 			 unsigned long prot, unsigned long flags,
 			 unsigned long addr, unsigned long addr_only)
 {
-	if ((addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
-		return -EACCES;
 	return 0;
 }