linux-kernel.vger.kernel.org archive mirror
* Network isolation with RLIMIT_NETWORK, cont'd.
@ 2009-12-13  3:19 Michael Stone
  2009-12-13  3:26 ` [PATCH] Security: Implement RLIMIT_NETWORK Michael Stone
                   ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-13  3:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Michael Stone

Dear lkml,

A few months ago [1], I asked for feedback on a new network isolation primitive
named "RLIMIT_NETWORK" designed for use with Unix sandboxing utilities like
Rainbow, Plash, and friends [2]. Thank you to all those CC'ed for your helpful
early remarks.

Here is an updated patchset with responses to the following criticisms:

  1. ptrace() 
     
     It was pointed out by Alan Cox, Andi Kleen, and others that processes
     which dropped their RLIMIT_NETWORK rlimit were still able to directly
     perform networking through a ptrace()'d victim.

     The new patchset adds an access check to __ptrace_may_access() to prevent
     this behavior.

  2. unshare(CLONE_NEWNET)

     It was pointed out by James Morris that network namespaces could be used
     to implement behavior similar to the behavior this patchset is designed to
     implement. To address this criticism, I added support for network
     namespaces to my sandboxing utility (Rainbow).

     Unfortunately, I have discovered that network namespaces in their current
     form are not appropriate for my use cases because they prevent the
     namespace'd apps from connecting to the X server, even over plain old
     AF_UNIX sockets.

     The RLIMIT_NETWORK facility I propose contains a specific exception for
     AF_UNIX filesystem sockets since those sockets are already bound by
     regular Unix discretionary access control.

  3. style

     I have made the patches more consistent with the kernel style
     guidelines.
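
To illustrate the exception described in item 2, here is a minimal userspace
sketch of which addresses would remain reachable after network access has been
dropped. It is a model of the policy, not the kernel code. The function names
addr_permitted() and is_filesystem_unix_addr() are hypothetical, and the
network_allowed flag is an assumed stand-in for a non-zero RLIMIT_NETWORK soft
limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Abstract AF_UNIX names start with a NUL byte; filesystem names do not. */
static bool is_filesystem_unix_addr(const struct sockaddr *sa)
{
	const struct sockaddr_un *sun = (const struct sockaddr_un *)sa;

	/* Only look at sun_path once the family is known to be AF_UNIX. */
	return sa->sa_family == AF_UNIX && sun->sun_path[0] != '\0';
}

/*
 * Hypothetical model of the check: returns true when a bind() or
 * connect() to 'sa' would be allowed. 'network_allowed' stands in for a
 * non-zero RLIMIT_NETWORK soft limit (an assumption of this sketch).
 */
static bool addr_permitted(bool network_allowed, const struct sockaddr *sa)
{
	assert(sa != NULL);
	return network_allowed || is_filesystem_unix_addr(sa);
}
```

Under this model an X client can still reach a socket such as
/tmp/.X11-unix/X0, while AF_INET addresses and abstract AF_UNIX names are
refused.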

Further suggestions?

Michael

[1] http://kerneltrap.org/mailarchive/linux-netdev/2009/1/7/4624864
[2] http://sandboxing.org

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH] Security: Implement RLIMIT_NETWORK.
  2009-12-13  3:19 Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
@ 2009-12-13  3:26 ` Michael Stone
  2009-12-13  3:30 ` [PATCH] Security: Document RLIMIT_NETWORK Michael Stone
  2009-12-13  3:44 ` Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
  2 siblings, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-13  3:26 UTC (permalink / raw)
  To: Michael Stone; +Cc: linux-kernel, Michael Stone

Daniel Bernstein has observed [1] that security-conscious userland processes
may benefit from the ability to irrevocably remove their ability to create,
bind, connect to, or send messages except in the case of previously connected
sockets or AF_UNIX filesystem sockets. We provide this facility by implementing
support for a new rlimit called RLIMIT_NETWORK.

This facility is particularly attractive to security platforms like OLPC
Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4].

[1]: http://cr.yp.to/unix/disablenetwork.html
[2]: http://wiki.laptop.org/go/OLPC_Bitfrost
[3]: http://wiki.laptop.org/go/Rainbow
[4]: http://plash.beasts.org/

Signed-off-by: Michael Stone <michael@laptop.org>
Tested-by: Bernie Innocenti <bernie@codewiz.org>
---
  fs/proc/base.c                 |    1 +
  include/asm-generic/resource.h |    4 ++-
  kernel/ptrace.c                |    3 ++
  net/socket.c                   |   53 ++++++++++++++++++++++++++++++----------
  net/unix/af_unix.c             |   28 +++++++++++++++++++-
  5 files changed, 73 insertions(+), 16 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index af643b5..7a153f1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -474,6 +474,7 @@ static const struct limit_names lnames[RLIM_NLIMITS] = {
  	[RLIMIT_NICE] = {"Max nice priority", NULL},
  	[RLIMIT_RTPRIO] = {"Max realtime priority", NULL},
  	[RLIMIT_RTTIME] = {"Max realtime timeout", "us"},
+	[RLIMIT_NETWORK] = {"Network access permitted", "boolean"},
  };
  
  /* Display limits for a process */
diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
index 587566f..2bed565 100644
--- a/include/asm-generic/resource.h
+++ b/include/asm-generic/resource.h
@@ -45,7 +45,8 @@
  					   0-39 for nice level 19 .. -20 */
  #define RLIMIT_RTPRIO		14	/* maximum realtime priority */
  #define RLIMIT_RTTIME		15	/* timeout for RT tasks in us */
-#define RLIM_NLIMITS		16
+#define RLIMIT_NETWORK		16	/* permit network access */
+#define RLIM_NLIMITS		17
  
  /*
   * SuS says limits have to be unsigned.
@@ -87,6 +88,7 @@
  	[RLIMIT_NICE]		= { 0, 0 },				\
  	[RLIMIT_RTPRIO]		= { 0, 0 },				\
  	[RLIMIT_RTTIME]		= {  RLIM_INFINITY,  RLIM_INFINITY },	\
+	[RLIMIT_NETWORK]	= {  RLIM_INFINITY,  RLIM_INFINITY },	\
  }
  
  #endif	/* __KERNEL__ */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 23bd09c..e3d2c63 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -22,6 +22,7 @@
  #include <linux/pid_namespace.h>
  #include <linux/syscalls.h>
  #include <linux/uaccess.h>
+#include <asm/resource.h>
  
  
  /*
@@ -151,6 +152,8 @@ int __ptrace_may_access(struct task_struct *task, unsigned int mode)
  		dumpable = get_dumpable(task->mm);
  	if (!dumpable && !capable(CAP_SYS_PTRACE))
  		return -EPERM;
+	if (!current->signal->rlim[RLIMIT_NETWORK].rlim_cur)
+		return -EPERM;
  
  	return security_ptrace_access_check(task, mode);
  }
diff --git a/net/socket.c b/net/socket.c
index b94c3dd..a2e2873 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -90,6 +90,7 @@
  
  #include <asm/uaccess.h>
  #include <asm/unistd.h>
+#include <asm/resource.h>
  
  #include <net/compat.h>
  #include <net/wext.h>
@@ -576,6 +577,12 @@ static inline int __sock_sendmsg(struct kiocb *iocb, struct socket *sock,
  	if (err)
  		return err;
  
+	err = -EPERM;
+	if (sock->sk->sk_family != AF_UNIX &&
+		!current->signal->rlim[RLIMIT_NETWORK].rlim_cur &&
+		(msg->msg_name != NULL || msg->msg_namelen != 0))
+		return err;
+
  	return sock->ops->sendmsg(iocb, sock, msg, size);
  }
  
@@ -1227,6 +1234,11 @@ static int __sock_create(struct net *net, int family, int type, int protocol,
  	if (err)
  		return err;
  
+	err = (family == AF_UNIX ||
+		current->signal->rlim[RLIMIT_NETWORK].rlim_cur) ? 0 : -EPERM;
+	if (err)
+		return err;
+
  	/*
  	 *	Allocate the socket and allow the family to set things up. if
  	 *	the protocol is 0, the family is instructed to select an appropriate
@@ -1465,19 +1477,29 @@ SYSCALL_DEFINE3(bind, int, fd, struct sockaddr __user *, umyaddr, int, addrlen)
  	int err, fput_needed;
  
  	sock = sockfd_lookup_light(fd, &err, &fput_needed);
-	if (sock) {
-		err = move_addr_to_kernel(umyaddr, addrlen, (struct sockaddr *)&address);
-		if (err >= 0) {
-			err = security_socket_bind(sock,
-						   (struct sockaddr *)&address,
-						   addrlen);
-			if (!err)
-				err = sock->ops->bind(sock,
-						      (struct sockaddr *)
-						      &address, addrlen);
-		}
-		fput_light(sock->file, fput_needed);
-	}
+	if (!sock)
+		goto out;
+
+	err = move_addr_to_kernel(umyaddr, addrlen, (struct sockaddr *)&address);
+	if (err < 0)
+		goto out_fput;
+
+	err = security_socket_bind(sock,
+				   (struct sockaddr *)&address,
+				   addrlen);
+	if (err)
+		goto out_fput;
+
+	err = (((struct sockaddr *)&address)->sa_family == AF_UNIX ||
+		current->signal->rlim[RLIMIT_NETWORK].rlim_cur) ? 0 : -EPERM;
+	if (err)
+		goto out_fput;
+
+	err = sock->ops->bind(sock, (struct sockaddr *) &address, addrlen);
+
+out_fput:
+	fput_light(sock->file, fput_needed);
+out:
  	return err;
  }
  
@@ -1639,6 +1661,11 @@ SYSCALL_DEFINE3(connect, int, fd, struct sockaddr __user *, uservaddr,
  	if (err)
  		goto out_put;
  
+	err = (((struct sockaddr *)&address)->sa_family == AF_UNIX ||
+		current->signal->rlim[RLIMIT_NETWORK].rlim_cur) ? 0 : -EPERM;
+	if (err)
+		goto out_put;
+
  	err = sock->ops->connect(sock, (struct sockaddr *)&address, addrlen,
  				 sock->file->f_flags);
  out_put:
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f255119..3bbc945 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -99,6 +99,7 @@
  #include <linux/fs.h>
  #include <linux/slab.h>
  #include <asm/uaccess.h>
+#include <asm/resource.h>
  #include <linux/skbuff.h>
  #include <linux/netdevice.h>
  #include <net/net_namespace.h>
@@ -797,6 +798,11 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
  		goto out;
  	addr_len = err;
  
+	err = (current->signal->rlim[RLIMIT_NETWORK].rlim_cur ||
+		sunaddr->sun_path[0]) ? 0 : -EPERM;
+	if (err)
+		goto out;
+
  	mutex_lock(&u->readlock);
  
  	err = -EINVAL;
@@ -934,6 +940,11 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
  			goto out;
  		alen = err;
  
+		err = (current->signal->rlim[RLIMIT_NETWORK].rlim_cur ||
+		       sunaddr->sun_path[0]) ? 0 : -EPERM;
+		if (err)
+			goto out;
+
  		if (test_bit(SOCK_PASSCRED, &sock->flags) &&
  		    !unix_sk(sk)->addr && (err = unix_autobind(sock)) != 0)
  			goto out;
@@ -1033,8 +1044,13 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
  		goto out;
  	addr_len = err;
  
-	if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr &&
-	    (err = unix_autobind(sock)) != 0)
+	err = (current->signal->rlim[RLIMIT_NETWORK].rlim_cur ||
+		sunaddr->sun_path[0]) ? 0 : -EPERM;
+	if (err)
+		goto out;
+
+	if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr
+		&& (err = unix_autobind(sock)) != 0)
  		goto out;
  
  	timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
@@ -1370,6 +1386,11 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
  		if (err < 0)
  			goto out;
  		namelen = err;
+
+		err = -EPERM;
+		if (!current->signal->rlim[RLIMIT_NETWORK].rlim_cur &&
+			!sunaddr->sun_path[0])
+			goto out;
  	} else {
  		sunaddr = NULL;
  		err = -ENOTCONN;
@@ -1520,6 +1541,9 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
  	if (msg->msg_namelen) {
  		err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
  		goto out_err;
+		/* RLIMIT_NETWORK requires no change here since connection-less
+		 * unix stream sockets are not supported.
+		 * See Documentation/rlimit_network.txt for details. */
  	} else {
  		sunaddr = NULL;
  		err = -ENOTCONN;
-- 
1.5.6.5


* [PATCH] Security: Document RLIMIT_NETWORK.
  2009-12-13  3:19 Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
  2009-12-13  3:26 ` [PATCH] Security: Implement RLIMIT_NETWORK Michael Stone
@ 2009-12-13  3:30 ` Michael Stone
  2009-12-13  3:44 ` Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
  2 siblings, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-13  3:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Michael Stone

Signed-off-by: Michael Stone <michael@laptop.org>
---
  Documentation/rlimit_network.txt |   55 ++++++++++++++++++++++++++++++++++++++
  1 files changed, 55 insertions(+), 0 deletions(-)
  create mode 100644 Documentation/rlimit_network.txt

diff --git a/Documentation/rlimit_network.txt b/Documentation/rlimit_network.txt
new file mode 100644
index 0000000..3307866
--- /dev/null
+++ b/Documentation/rlimit_network.txt
@@ -0,0 +1,55 @@
+Purpose
+-------
+
+Daniel Bernstein has observed [1] that security-conscious userland processes
+may benefit from the ability to irrevocably remove their ability to create,
+bind, connect to, or send messages except in the case of previously connected
+sockets or AF_UNIX filesystem sockets.
+
+This facility is particularly attractive to security platforms like OLPC
+Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4] because:
+
+  * it integrates well with standard techniques for writing privilege-separated
+    Unix programs
+
+  * it integrates well with the need to perform limited socket I/O, e.g., when
+    running X clients
+
+  * it's available to unprivileged programs
+
+  * it's a discretionary feature available to all of distributors,
+    administrators, authors, and users
+
+  * its effect is entirely local, rather than global (like netfilter)
+
+  * it's simple enough to have some hope of being used correctly
+
+Implementation
+--------------
+
+After considering implementations based on the Linux Security Module (LSM)
+framework, on SELinux in particular, on network namespaces (CLONE_NEWNET), and
+on direct modification of the kernel syscall and task_struct APIs, we came to
+the conclusion that the best way to implement this feature was to extend the
+resource limits framework with a new RLIMIT_NETWORK field and to modify the
+implementations of the relevant socket calls to return -EPERM when
+
+  current->signal->rlim[RLIMIT_NETWORK].rlim_cur == 0
+
+unless we are manipulating an AF_UNIX socket whose name does not begin with \0
+or, in the case of sendmsg(), unless we are manipulating a previously connected
+socket, i.e. one with
+
+  msg.msg_name == NULL && msg.msg_namelen == 0
+
+Finally, in response to criticism from Alan Cox, we insert a similar access
+check into __ptrace_may_access() to prevent processes which have dropped their
+networking privileges from performing network I/O by ptracing other processes.
+
+References
+----------
+
+[1]: http://cr.yp.to/unix/disablenetwork.html
+[2]: http://wiki.laptop.org/go/OLPC_Bitfrost
+[3]: http://wiki.laptop.org/go/Rainbow
+[4]: http://plash.beasts.org/
-- 
1.5.6.5


* Re: Network isolation with RLIMIT_NETWORK, cont'd.
  2009-12-13  3:19 Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
  2009-12-13  3:26 ` [PATCH] Security: Implement RLIMIT_NETWORK Michael Stone
  2009-12-13  3:30 ` [PATCH] Security: Document RLIMIT_NETWORK Michael Stone
@ 2009-12-13  3:44 ` Michael Stone
  2009-12-13  5:09   ` setrlimit(RLIMIT_NETWORK) vs. prctl(???) Michael Stone
                     ` (2 more replies)
  2 siblings, 3 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-13  3:44 UTC (permalink / raw)
  To: linux-kernel, netdev, linux-security-module
  Cc: Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Rémi Denis-Courmont,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Eric W. Biederman, Bernie Innocenti, Mark Seaborn

Gentlefolks,

You were all meant to be included on the CC-list for the letter and patches
which I just sent to lkml:

   http://lkml.org/lkml/2009/12/12/149

Apologies for the typo in my previous mail headers.

Regards,

Michael

* setrlimit(RLIMIT_NETWORK) vs. prctl(???)
  2009-12-13  3:44 ` Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
@ 2009-12-13  5:09   ` Michael Stone
  2009-12-13  5:20     ` Ulrich Drepper
  2009-12-13  8:32   ` Network isolation with RLIMIT_NETWORK, cont'd Rémi Denis-Courmont
  2009-12-13 10:05   ` Eric W. Biederman
  2 siblings, 1 reply; 54+ messages in thread
From: Michael Stone @ 2009-12-13  5:09 UTC (permalink / raw)
  To: linux-kernel, netdev, linux-security-module
  Cc: Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Rémi Denis-Courmont,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Eric W. Biederman, Bernie Innocenti, Mark Seaborn, Michael Stone

Folks,

A colleague just asked me an excellent question about my approach which I'd
like to share with you. Paraphrasing, he wrote:

> rlimits seem very heavy for a simple inherited boolean flag. Also, creating
> a new one will require modifying a lot of delicate userland software.
> Wouldn't some new prctl() flags be a better choice?

Here's my response:

> You're absolutely right that choosing to expose this functionality as an
> rlimit (as opposed to as a new syscall or as a flag to an old syscall like
> prctl()) is a decision with complex consequences.
> 
> I picked rlimits for this patch (after trying the "new syscall" approach
> privately) because doing so provides exactly the interface, semantics, and
> userland integration that I want:
>
> interface: "unprivileged", "temporarily drop", "permanently drop", "get
> current state", "persist current state across exec()", and some room for
> future expansion of semantics by defining new state values between 0 and
> RLIM_INFINITY.
> 
> integration: lots of sandboxing code already contains logic to drop rlimits
> when starting up an isolated process. Furthermore, I think it would be really
> great to be able to limit networking from the shell via ulimit and on a
> per-user basis via /etc/security/limits.conf.
> 
> That being said, I'm not wedded to the decision. Could you give me some more
> specific examples of the kinds of changes in low-level userspace code that
> you're worried about?
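
As a rough illustration of the "temporarily drop", "permanently drop", and "get
current state" operations listed above, the sketch below exercises the same
soft/hard limit mechanics with an existing rlimit. RLIMIT_CORE is only a
stand-in, since RLIMIT_NETWORK exists solely in the proposed patch; the helper
names soft_drop_roundtrip() and hard_drop() are hypothetical.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * "Temporarily drop": lower only the soft limit, then raise it back up to
 * its previous value, which is still within the unchanged hard limit.
 * Returns 0 on success, -1 on any failure.
 */
static int soft_drop_roundtrip(void)
{
	struct rlimit saved, now;

	if (getrlimit(RLIMIT_CORE, &saved) != 0)	/* "get current state" */
		return -1;

	now = saved;
	now.rlim_cur = 0;				/* soft limit only */
	if (setrlimit(RLIMIT_CORE, &now) != 0)
		return -1;

	if (getrlimit(RLIMIT_CORE, &now) != 0 || now.rlim_cur != 0)
		return -1;
	assert(now.rlim_max == saved.rlim_max);		/* hard limit untouched */

	/* An unprivileged process may restore the old soft value. */
	return setrlimit(RLIMIT_CORE, &saved);
}

/*
 * "Permanently drop": lower the hard limit as well. A process without
 * CAP_SYS_RESOURCE cannot raise a hard limit again afterwards.
 */
static int hard_drop(int resource)
{
	struct rlimit zero = { .rlim_cur = 0, .rlim_max = 0 };

	return setrlimit(resource, &zero);
}
```

Because rlimits are inherited across fork() and preserved across exec(), a
limit dropped this way also applies to every program the process later starts.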

Regards,

Michael

* Re: setrlimit(RLIMIT_NETWORK) vs. prctl(???)
  2009-12-13  5:09   ` setrlimit(RLIMIT_NETWORK) vs. prctl(???) Michael Stone
@ 2009-12-13  5:20     ` Ulrich Drepper
  2009-12-15  5:33       ` Michael Stone
  0 siblings, 1 reply; 54+ messages in thread
From: Ulrich Drepper @ 2009-12-13  5:20 UTC (permalink / raw)
  To: Michael Stone
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Rémi Denis-Courmont,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Eric W. Biederman, Bernie Innocenti, Mark Seaborn

On Sat, Dec 12, 2009 at 21:09, Michael Stone <michael@laptop.org> wrote:
>> That being said, I'm not wedded to the decision. Could you give me some
>> more
>> specific examples of the kinds of changes in low-level userspace code that
>> you're worried about?

It was an accident that I sent the email privately.

As summarized in the paraphrased comment, it's a pain to deal with
rlimit extensions.  It's easy enough to do all this using prctl() with
the same semantics and without forcing any other code to be modified.
I'll let others more competent judge the usefulness.  But using rlimit
as the interface is just plain wrong.
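
For readers unfamiliar with the prctl() style of interface, here is a minimal
sketch of a one-way, inherited boolean set through prctl(). It uses
PR_SET_NO_NEW_PRIVS, an unrelated flag that was added to Linux well after this
thread (3.5 and later), purely as an existing example of the pattern. It is not
the network flag under discussion, and no prctl() option for network access is
assumed to exist. The helper name set_one_way_flag() is hypothetical.

```c
#include <assert.h>
#include <sys/prctl.h>

/*
 * Set a one-way process flag and read it back.
 * Returns the flag's value after setting it (1 on success), or -1 if the
 * kernel rejected the request.
 */
static int set_one_way_flag(void)
{
	int flag;

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	flag = prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
	assert(flag != 0);	/* once set, the flag never reads back as 0 */
	return flag;
}
```

The flag is inherited by children and cannot be cleared again, which is the
"permanently drop" half of the semantics; there is no counterpart to an rlimit's
soft limit in this style of interface.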

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
  2009-12-13  3:44 ` Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
  2009-12-13  5:09   ` setrlimit(RLIMIT_NETWORK) vs. prctl(???) Michael Stone
@ 2009-12-13  8:32   ` Rémi Denis-Courmont
  2009-12-13 13:44     ` Michael Stone
  2009-12-13 10:05   ` Eric W. Biederman
  2 siblings, 1 reply; 54+ messages in thread
From: Rémi Denis-Courmont @ 2009-12-13  8:32 UTC (permalink / raw)
  To: Michael Stone
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

	Hello,

Le dimanche 13 décembre 2009 05:44:18 Michael Stone, vous avez écrit :
> You were all meant to be included on the CC-list for the letter and patches
> which I just sent to lkml:
> 
>    http://lkml.org/lkml/2009/12/12/149

You explicitly mention the need to connect to the X server over local sockets.  
But won't that allow the sandboxed application to send synthetic events to any 
other X11 applications? Hence unless the whole X server has restricted network 
access, this seems a bit broken? D-Bus, which also uses local sockets, will 
exhibit similar issues, as will any unrestricted IPC mechanism in fact.

I am not sure if restricting network access but not other file descriptors 
makes that much sense... ? Then again, I'm not entirely clear what you are 
trying to solve.

If I had to sandbox something, I'd drop the process file limit to 0. That will 
effectively cut off network, file system, and POSIX IPCs. Unfortunately, the 
process can still use SysV IPC, ptrace(), and send signals to others. So those 
are the gaps I would first try to contain.
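
The file-limit approach described above can be sketched as follows. This is a
minimal experiment, not a sandbox: a child process drops RLIMIT_NOFILE to 0 and
then checks that it can no longer obtain a new socket descriptor. The helper
name socket_blocked_by_nofile_zero() is hypothetical, and AF_UNIX is used only
because it is available almost everywhere; descriptor allocation is the same
step for every socket family.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child, drop its RLIMIT_NOFILE to 0, and try to create a socket.
 * Returns 1 if socket() failed with EMFILE in the child, 0 if it did not,
 * and -1 if the experiment itself could not be run.
 */
static int socket_blocked_by_nofile_zero(void)
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return -1;

	if (pid == 0) {
		struct rlimit zero = { .rlim_cur = 0, .rlim_max = 0 };

		if (setrlimit(RLIMIT_NOFILE, &zero) != 0)
			_exit(2);
		/* Descriptors that are already open keep working;
		 * only the allocation of new ones is refused. */
		if (socket(AF_UNIX, SOCK_STREAM, 0) < 0 && errno == EMFILE)
			_exit(0);
		_exit(1);
	}

	assert(pid > 0);
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	if (WEXITSTATUS(status) == 2)
		return -1;
	return WEXITSTATUS(status) == 0;
}
```

The same limit also prevents the child from opening files or pipes, which is
why this technique blocks far more than networking.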

-- 
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
  2009-12-13  3:44 ` Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
  2009-12-13  5:09   ` setrlimit(RLIMIT_NETWORK) vs. prctl(???) Michael Stone
  2009-12-13  8:32   ` Network isolation with RLIMIT_NETWORK, cont'd Rémi Denis-Courmont
@ 2009-12-13 10:05   ` Eric W. Biederman
  2009-12-13 14:21     ` Michael Stone
  2009-12-17 17:52     ` Andi Kleen
  2 siblings, 2 replies; 54+ messages in thread
From: Eric W. Biederman @ 2009-12-13 10:05 UTC (permalink / raw)
  To: Michael Stone
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Rémi Denis-Courmont,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Bernie Innocenti, Mark Seaborn, Linux Containers


I have added the containers list to the cc as there is some overlap.

Michael Stone <michael@laptop.org> writes:

> Dear lkml,
> 
> A few months ago [1], I asked for feedback on a new network isolation primitive
> named "RLIMIT_NETWORK" designed for use with Unix sandboxing utilities like
> Rainbow, Plash, and friends [2]. Thank you to all those CC'ed for your helpful
> early remarks.
> 
> Here is an updated patchset with responses to the following criticisms:

Overall what you have looks ad hoc and very special-case, which is
likely to impair maintenance in the future.

Furthermore you have not addressed the primary issue that keeps
unshare(CLONE_NEWNET) requiring root privileges.  You can in theory
confuse a suid root application and cause it to take action with its
elevated privileges that violates the security policy.  The network
namespace has more potential to confuse existing applications than
your mechanism, but the problem seems to remain.

>   1. ptrace() 
>      
>      It was pointed out by Alan Cox, Andi Kleen, and others that processes
>      which dropped their RLIMIT_NETWORK rlimit were still able to directly
>      perform networking through a ptrace()'d victim.
> 
>      The new patchset adds an access check to __ptrace_may_access() to prevent
>      this behavior.

Solve that with an unused uid.  That ptrace_may_access check is
completely non-intuitive, and a problem if we ever remove the current
== task security module bug avoidance.

>   2. unshare(CLONE_NEWNET)
> 
>      It was pointed out by James Morris that network namespaces could be used
>      to implement behavior similar to the behavior this patchset is designed to
>      implement. To address this criticism, I added support for network
>      namespaces to my sandboxing utility (Rainbow).
> 
>      Unfortunately, I have discovered that network namespaces in their current
>      form are not appropriate for my use cases because they prevent the
>      namespace'd apps from connecting to the X server, even over plain old
>      AF_UNIX sockets.

We discussed that a while ago, and there is no fundamental reason to
disallow opening unix domain sockets from another network namespace.
The reason this has not been done, is that no one has taken a good
hard look at the packet transmit path and said there are no technical
problems for packets traversing between two network namespaces.

It is probably time to revisit that.


>      The RLIMIT_NETWORK facility I propose contains a specific exception for
>      AF_UNIX filesystem sockets since those sockets are already bound by
>      regular Unix discretionary access control.

What is more significant than unix discretionary access control is the
fact that the set of available af_unix sockets you can bind to is filtered
by the mount namespace.


With respect to the problem of handling suid root applications, my long
term plan is to finish the security credentials namespace, aka
unshare(NEWUSER).  That means making capabilities namespace-local and
changing all uid-based checks from uid1 == uid2 to (ns1, uid1) ==
(ns2, uid2).  At that point suid root applications will not be a
problem, because the problematic root capabilities will not be available
for them to acquire.

Eric

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
  2009-12-13  8:32   ` Network isolation with RLIMIT_NETWORK, cont'd Rémi Denis-Courmont
@ 2009-12-13 13:44     ` Michael Stone
  0 siblings, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-13 13:44 UTC (permalink / raw)
  To: Rémi Denis-Courmont
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

Rémi,
> You explicitly mention the need to connect to the X server over local sockets.
> But won't that allow the sandboxed application to send synthetic events to any
> other X11 applications? 

X11 cookie authentication and socket ownership+permissions effectively control
access to the X server by local processes. Thus, as an isolation author, I may
easily grant my isolated process any of:

   a) full access to the main X server 
   b) some access to a nested X server (like a Xephyr) which I'm using to do
      some event filtering
   c) no access to any X server by withholding the cookies or by changing the
      permissions on the X socket to be more restrictive

with existing techniques.

> Hence unless the whole X server has restricted network access, this seems a
> bit broken? 

Not broken for the reasons I mentioned above. However, using this rlimit to
disable fresh network access for the whole X server actually sounds like a
rather nice idea; thanks for suggesting it.

> D-Bus, which also uses local sockets, will exhibit similar issues, 

Absolutely. However, D-Bus, like X, already has strong authentication
mechanisms in place that permit me to use pre-existing Unix discretionary 
access control to limit what communication takes place. More specifically, I can 

   a) tell D-Bus to use a file-system socket and change the credentials on that
      socket

   b) use cookies to authenticate incoming connections

   c) explicitly tell D-Bus what users and groups may connect via configuration
      files

   d) explicitly tell D-Bus what users and groups may send and receive which
      messages via configuration files

> as will any unrestricted IPC mechanism in fact. I am not sure if restricting
> network access but not other file descriptors makes that much sense...? Then
> again, I'm not entirely clear what you are trying to solve.

Inadequately access-controlled IPC mechanisms are the specific problem that I
am trying to address. Fortunately, these mechanisms seem to be rare: the only
two that I know of are non-AF_UNIX sockets and ptrace(). All the other IPC
mechanisms that I have seen may be adequately restricted by changing file
permissions and ownership.

> If I had to sandbox something, I'd drop the process file limit to 0. 

That is a technique that is commonly used by many people in this space. It
works well for some limited use cases but, like SECCOMP, is too restrictive for
the kinds of general-purpose applications that I'm sandboxing.

If you're interested,

   http://cr.yp.to/unix/disablenetwork.html

lists several specific problems. To see more, just try dropping RLIMIT_NOFILE
to 0 before launching all your favorite apps. I'd be curious to hear how far
you get.

Regards,

Michael

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
  2009-12-13 10:05   ` Eric W. Biederman
@ 2009-12-13 14:21     ` Michael Stone
       [not found]       ` <fb69ef3c0912170931l5cbf0e3dh81c88e6502651042@mail.gmail.com>
  2009-12-17 17:52     ` Andi Kleen
  1 sibling, 1 reply; 54+ messages in thread
From: Michael Stone @ 2009-12-13 14:21 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Rémi Denis-Courmont,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Bernie Innocenti, Mark Seaborn, Linux Containers

Eric Biederman wrote:

> I have added the containers list to the cc as there is some overlap.

Good idea; thanks.

> Overall what you have looks ad-hoc, and very special case which is
> likely to impair maintenance in the future.

Unfortunately, these are the semantics which are necessary to make further
progress on sandboxing real Linux apps with the discretionary access control
facilities which are available today.

> You can in theory confuse a suid root application and cause it to take action
> with its elevated privileges that violates the security policy.

You're right, in theory. In practice, the setuid-root facility is a rather
special escape hatch which *everyone* in this field knows must be carefully
audited and maintained when building or updating trustworthy systems.

Also, in practice, I'm not expecting perfection today. Nor was I last year, nor
will I be next year. What I am expecting is that the kernel will supply me
(perhaps with my assistance along the way) with the access control facilities
that I need to do my job in userland. This is one of them.

> The network namespace has more potential to confuse existing applications
> than your mechanism, but the problem seems to remain.

I'm glad to hear that you find this mechanism to be comparatively less
confusing.

>>   1. ptrace() 
>>      
>>      It was pointed out by Alan Cox, Andi Kleen, and others that processes
>>      which dropped their RLIMIT_NETWORK rlimit were still able to directly
>>      perform networking through a ptrace()'d victim.
>> 
>>      The new patchset adds an access check to __ptrace_may_access() to prevent
>>      this behavior.
> 
> Solve that with an unused uid.  

I already do, in general. (As do the other people requesting this facility.)

The reason for the __ptrace_may_access() check is that the logical way for
*application authors* whose code is *already* running in a fresh uid to further
improve system security is to separate their network I/O from their parsing
code across a process boundary and to drop networking privileges in the parser.
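
The privilege-separation pattern described above can be sketched roughly as
follows. The parent plays the network-facing half and hands untrusted bytes to
a child over a pipe; the child only parses. The helper names count_lines() and
parse_in_child() are hypothetical, the "parser" is a trivial line counter, and
the point at which the child would drop its networking privileges is marked
with a comment only, because RLIMIT_NETWORK is not defined outside the proposed
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical "parser": counts newline characters read from fd. */
static int count_lines(int fd)
{
	char buf[256];
	ssize_t n;
	int lines = 0;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		for (ssize_t i = 0; i < n; i++)
			if (buf[i] == '\n')
				lines++;
	return n < 0 ? -1 : lines;
}

/*
 * Feed 'len' bytes of untrusted data to a parser running in a child.
 * Returns the child's line count (0..254) or -1 on error.
 */
static int parse_in_child(const char *data, size_t len)
{
	int fds[2], status;
	pid_t pid;

	assert(data != NULL);
	if (pipe(fds) != 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		int lines;

		close(fds[1]);
		/*
		 * Under the proposed patch, the parser would drop its
		 * networking privileges here, before touching any input,
		 * by setting the RLIMIT_NETWORK limits to 0.
		 */
		lines = count_lines(fds[0]);
		_exit(lines < 0 || lines > 254 ? 255 : lines);
	}

	close(fds[0]);
	while (len > 0) {
		ssize_t w = write(fds[1], data, len);

		if (w < 0)
			break;
		data += w;
		len -= (size_t)w;
	}
	close(fds[1]);		/* EOF tells the parser the input is complete */

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	if (len > 0 || WEXITSTATUS(status) == 255)
		return -1;
	return WEXITSTATUS(status);
}
```

With the ptrace() check in place, a compromised parser in this arrangement
could not regain network access by attaching to its network-facing parent.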

>>   2. unshare(CLONE_NEWNET)
>> 
>>      It was pointed out by James Morris that network namespaces could be used
>>      to implement behavior similar to the behavior this patchset is designed to
>>      implement. To address this criticism, I added support for network
>>      namespaces to my sandboxing utility (Rainbow).
>> 
>>      Unfortunately, I have discovered that network namespaces in their current
>>      form are not appropriate for my use cases because they prevent the
>>      namespace'd apps from connecting to the X server, even over plain old
>>      AF_UNIX sockets.
>
> We discussed that a while ago, and there is no fundamental reason to
> disallow opening unix domain sockets from another network namespace.

I disagree. I like that the network namespaces have (fairly) clear semantics.
They are excellent semantics for some of my other use cases, like testing
networked software [1]. They're probably quite nice for full-blown
containerization. They're just not right for the kind of lightweight sandboxing
of complicated legacy apps that I'm doing.

[1]: http://dev.laptop.org/git/users/mstone/dnshash/tree/docs/unit_testing.txt

>> The RLIMIT_NETWORK facility I propose contains a specific exception for
>> AF_UNIX filesystem sockets since those sockets are already bound by
>> regular Unix discretionary access control.
> 
> What is more significant than unix discretionary access control is the
> fact that the set of available af_unix sockets you can bind to is filtered
> by the mount namespace.

Actually, the Unix DAC is far more important for my purposes. The reason is
that it's unprivileged, already understood by literally *everyone* involved in
Unix security, and it has the best tools support of any access control
mechanism.

For comparison, I do use CLONE_NEWNS mount namespaces and they've been a real
pain because

   a) unlike in Plan 9, they're privileged,

   b) they greatly complicate debugging the isolated app because you see
      different things inside and outside the namespace,

   c) there's no good way to manipulate them from userland, and

   d) they're poorly documented outside of the mount man page.

Regards,

Michael


* Re: setrlimit(RLIMIT_NETWORK) vs. prctl(???)
  2009-12-13  5:20     ` Ulrich Drepper
@ 2009-12-15  5:33       ` Michael Stone
  2009-12-16 15:30         ` Michael Stone
  0 siblings, 1 reply; 54+ messages in thread
From: Michael Stone @ 2009-12-15  5:33 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: linux-kernel, netdev, linux-security-module

Ulrich Drepper wrote:
> On Sat, Dec 12, 2009 at 21:09, Michael Stone <michael@laptop.org> wrote:
>> That being said, I'm not wedded to the decision. Could you give me some
>> more specific examples of the kinds of changes in low-level userspace code
>> that you're worried about?
> 
> As summarized in the paraphrased comment, it's a pain to deal with
> rlimit extensions.  It's easy enough to do all this using prctl() with
> the same semantics and without forcing any other code to be modified.
> I let others more competent to judge the usefulness.  But using rlimit
> as the interface is just plain wrong.

I still like the rlimit-based interface because I think it gives good intuition
about how to use the facility and about how it ought to be exposed to high-level
parts of userland. Still, it certainly can't hurt to cook up a version based on
prctl() so that we can make a fair comparison of the two.

I'll see what I can come up with.

Regards,

Michael


* Re: setrlimit(RLIMIT_NETWORK) vs. prctl(???)
  2009-12-15  5:33       ` Michael Stone
@ 2009-12-16 15:30         ` Michael Stone
  2009-12-16 15:32           ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Michael Stone
                             ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-16 15:30 UTC (permalink / raw)
  To: Ulrich Drepper, Ulrich Drepper
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

Ulrich,

As promised, here's a draft based on prctl() for comparison with the
rlimit()-based approach presented in the first attempt.

It behaves as I expect in simple testing with busybox "nc" and I'll do a more
thorough test shortly. I'm sending it now because I think that it's good enough
to give a decent overview of what the end result of this implementation
strategy might look like.

Regards,

Michael

------

Michael Stone (3):
  Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics.
  Security: Document prctl(PR_{GET,SET}_NETWORK).

 Documentation/prctl_network.txt |   69 +++++++++++++++++++++++++++++++++++++++
 include/linux/prctl.h           |    7 ++++
 include/linux/prctl_network.h   |    7 ++++
 include/linux/sched.h           |    2 +
 kernel/Makefile                 |    2 +-
 kernel/fork.c                   |    2 +
 kernel/prctl_network.c          |   37 +++++++++++++++++++++
 kernel/ptrace.c                 |    2 +
 kernel/sys.c                    |    7 ++++
 net/socket.c                    |   51 +++++++++++++++++++++-------
 net/unix/af_unix.c              |   19 +++++++++++
 11 files changed, 191 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/prctl_network.txt
 create mode 100644 include/linux/prctl_network.h
 create mode 100644 kernel/prctl_network.c



* [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-16 15:30         ` Michael Stone
@ 2009-12-16 15:32           ` Michael Stone
  2009-12-16 15:59             ` Andi Kleen
                               ` (2 more replies)
  2009-12-16 15:32           ` [PATCH] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics Michael Stone
  2009-12-16 15:32           ` [PATCH] Security: Document prctl(PR_{GET,SET}_NETWORK) Michael Stone
  2 siblings, 3 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-16 15:32 UTC (permalink / raw)
  To: Ulrich Drepper
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

Daniel Bernstein has observed [1] that security-conscious userland processes
may benefit from the ability to irrevocably remove their ability to create,
bind, connect to, or send messages except in the case of previously connected
sockets or AF_UNIX filesystem sockets. We provide this facility by implementing
support for a new prctl(PR_SET_NETWORK) flag named PR_NETWORK_OFF.

This facility is particularly attractive to security platforms like OLPC
Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4].

[1]: http://cr.yp.to/unix/disablenetwork.html
[2]: http://wiki.laptop.org/go/OLPC_Bitfrost
[3]: http://wiki.laptop.org/go/Rainbow
[4]: http://plash.beasts.org/

Signed-off-by: Michael Stone <michael@laptop.org>
---
 include/linux/prctl.h         |    7 +++++++
 include/linux/prctl_network.h |    7 +++++++
 include/linux/sched.h         |    2 ++
 kernel/Makefile               |    2 +-
 kernel/prctl_network.c        |   37 +++++++++++++++++++++++++++++++++++++
 kernel/sys.c                  |    7 +++++++
 6 files changed, 61 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/prctl_network.h
 create mode 100644 kernel/prctl_network.c

diff --git a/include/linux/prctl.h b/include/linux/prctl.h
index a3baeb2..4eb4110 100644
--- a/include/linux/prctl.h
+++ b/include/linux/prctl.h
@@ -102,4 +102,11 @@
 
 #define PR_MCE_KILL_GET 34
 
+/* Get/set process disable-network flags */
+#define PR_SET_NETWORK	35
+#define PR_GET_NETWORK	36
+# define PR_NETWORK_ON        0
+# define PR_NETWORK_OFF       1
+# define PR_NETWORK_ALL_FLAGS 1
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/include/linux/prctl_network.h b/include/linux/prctl_network.h
new file mode 100644
index 0000000..2db83eb
--- /dev/null
+++ b/include/linux/prctl_network.h
@@ -0,0 +1,7 @@
+#ifndef _LINUX_PRCTL_NETWORK_H
+#define _LINUX_PRCTL_NETWORK_H
+
+extern long prctl_get_network(void);
+extern long prctl_set_network(unsigned long);
+
+#endif /* _LINUX_PRCTL_NETWORK_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c858f3..751d372 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1395,6 +1395,8 @@ struct task_struct {
 	unsigned int sessionid;
 #endif
 	seccomp_t seccomp;
+/* Flags for limiting networking via prctl(PR_SET_NETWORK). */
+	unsigned long network;
 
 /* Thread group tracking */
    	u32 parent_exec_id;
diff --git a/kernel/Makefile b/kernel/Makefile
index 864ff75..cafbff2 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -10,7 +10,7 @@ obj-y     = sched.o fork.o exec_domain.o panic.o printk.o \
 	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
 	    hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
 	    notifier.o ksysfs.o pm_qos_params.o sched_clock.o cred.o \
-	    async.o
+	    async.o prctl_network.o
 obj-y += groups.o
 
 ifdef CONFIG_FUNCTION_TRACER
diff --git a/kernel/prctl_network.c b/kernel/prctl_network.c
new file mode 100644
index 0000000..d173716
--- /dev/null
+++ b/kernel/prctl_network.c
@@ -0,0 +1,37 @@
+/*
+ * linux/kernel/prctl_network.c
+ *
+ * Copyright 2009  Michael Stone <michael@laptop.org>
+ *
+ * Turn off a process's ability to access new networks.
+ * See Documentation/prctl_network.txt for details.
+ */
+
+#include <linux/prctl_network.h>
+#include <linux/sched.h>
+#include <linux/prctl.h>
+
+long prctl_get_network(void)
+{
+	return current->network;
+}
+
+long prctl_set_network(unsigned long network_flags)
+{
+	long ret;
+
+	/* only dropping access is permitted */
+	ret = -EPERM;
+	if (current->network & ~network_flags)
+		goto out;
+
+	ret = -EINVAL;
+	if (network_flags & ~PR_NETWORK_ALL_FLAGS)
+		goto out;
+
+	current->network = network_flags;
+	ret = 0;
+
+out:
+	return ret;
+}
diff --git a/kernel/sys.c b/kernel/sys.c
index 20ccfb5..4eccc66 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -35,6 +35,7 @@
 #include <linux/cpu.h>
 #include <linux/ptrace.h>
 #include <linux/fs_struct.h>
+#include <linux/prctl_network.h>
 
 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -1576,6 +1577,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			else
 				error = PR_MCE_KILL_DEFAULT;
 			break;
+		case PR_SET_NETWORK:
+			error = prctl_set_network(arg2);
+			break;
+		case PR_GET_NETWORK:
+			error = prctl_get_network();
+			break;
 		default:
 			error = -EINVAL;
 			break;
-- 
1.5.6.5
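
To make the one-way behaviour of prctl_set_network() concrete, here is a small
userspace model of the checks in the patch above, in the same order. The
function name model_set_network() and the explicit flags pointer are
illustrative stand-ins for current->network; only the PR_NETWORK_* values are
taken from the patch. It is a sketch of the semantics, not the kernel code.

```c
#include <errno.h>

/* Values copied from the include/linux/prctl.h hunk in the patch. */
#define PR_NETWORK_ON        0UL
#define PR_NETWORK_OFF       1UL
#define PR_NETWORK_ALL_FLAGS 1UL

/* Userspace model of prctl_set_network(): *flags plays the role of
 * current->network. Returns 0 on success or a negative errno value. */
static long model_set_network(unsigned long *flags, unsigned long new_flags)
{
	/* A bit that is already set may never be cleared. */
	if (*flags & ~new_flags)
		return -EPERM;

	/* Bits outside the defined set are rejected. */
	if (new_flags & ~PR_NETWORK_ALL_FLAGS)
		return -EINVAL;

	*flags = new_flags;
	return 0;
}
```

Because the -EPERM check comes first, a task that already has PR_NETWORK_OFF
set and asks for an undefined value that omits that bit (for example 2) gets
-EPERM rather than -EINVAL.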



* [PATCH] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics.
  2009-12-16 15:30         ` Michael Stone
  2009-12-16 15:32           ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Michael Stone
@ 2009-12-16 15:32           ` Michael Stone
  2009-12-17 19:18             ` Eric W. Biederman
  2009-12-16 15:32           ` [PATCH] Security: Document prctl(PR_{GET,SET}_NETWORK) Michael Stone
  2 siblings, 1 reply; 54+ messages in thread
From: Michael Stone @ 2009-12-16 15:32 UTC (permalink / raw)
  To: Ulrich Drepper
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

Return -EPERM any time we try to __sock_create(), sys_connect(), sys_bind(),
sys_sendmsg(), or __ptrace_may_access() from a process with PR_NETWORK_OFF set
in current->network unless we're working on a socket which is already connected
or on a non-abstract AF_UNIX socket.

Signed-off-by: Michael Stone <michael@laptop.org>
---
 kernel/fork.c      |    2 ++
 kernel/ptrace.c    |    2 ++
 net/socket.c       |   51 ++++++++++++++++++++++++++++++++++++++-------------
 net/unix/af_unix.c |   19 +++++++++++++++++++
 4 files changed, 61 insertions(+), 13 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 9bd9144..01a7644 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1130,6 +1130,8 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 
 	p->bts = NULL;
 
+	p->network = current->network;
+
 	p->stack_start = stack_start;
 
 	/* Perform scheduler related setup. Assign this task to a CPU. */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 23bd09c..5b38db0 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -151,6 +151,8 @@ int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 		dumpable = get_dumpable(task->mm);
 	if (!dumpable && !capable(CAP_SYS_PTRACE))
 		return -EPERM;
+	if (current->network)
+		return -EPERM;
 
 	return security_ptrace_access_check(task, mode);
 }
diff --git a/net/socket.c b/net/socket.c
index b94c3dd..e59f906 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -87,6 +87,7 @@
 #include <linux/wireless.h>
 #include <linux/nsproxy.h>
 #include <linux/magic.h>
+#include <linux/sched.h>
 
 #include <asm/uaccess.h>
 #include <asm/unistd.h>
@@ -576,6 +577,12 @@ static inline int __sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	if (err)
 		return err;
 
+	err = -EPERM;
+	if (sock->sk->sk_family != AF_UNIX &&
+		current->network &&
+		(msg->msg_name != NULL || msg->msg_namelen != 0))
+		return err;
+
 	return sock->ops->sendmsg(iocb, sock, msg, size);
 }
 
@@ -1227,6 +1234,9 @@ static int __sock_create(struct net *net, int family, int type, int protocol,
 	if (err)
 		return err;
 
+	if (family != AF_UNIX && current->network)
+		return -EPERM;
+
 	/*
 	 *	Allocate the socket and allow the family to set things up. if
 	 *	the protocol is 0, the family is instructed to select an appropriate
@@ -1465,19 +1475,29 @@ SYSCALL_DEFINE3(bind, int, fd, struct sockaddr __user *, umyaddr, int, addrlen)
 	int err, fput_needed;
 
 	sock = sockfd_lookup_light(fd, &err, &fput_needed);
-	if (sock) {
-		err = move_addr_to_kernel(umyaddr, addrlen, (struct sockaddr *)&address);
-		if (err >= 0) {
-			err = security_socket_bind(sock,
-						   (struct sockaddr *)&address,
-						   addrlen);
-			if (!err)
-				err = sock->ops->bind(sock,
-						      (struct sockaddr *)
-						      &address, addrlen);
-		}
-		fput_light(sock->file, fput_needed);
-	}
+	if (!sock)
+		goto out;
+
+	err = move_addr_to_kernel(umyaddr, addrlen, (struct sockaddr *)&address);
+	if (err < 0)
+		goto out_fput;
+
+	err = security_socket_bind(sock,
+				   (struct sockaddr *)&address,
+				   addrlen);
+	if (err)
+		goto out_fput;
+
+	err = (((struct sockaddr *)&address)->sa_family != AF_UNIX &&
+		current->network) ? -EPERM : 0;
+	if (err)
+		goto out_fput;
+
+	err = sock->ops->bind(sock, (struct sockaddr *) &address, addrlen);
+
+out_fput:
+	fput_light(sock->file, fput_needed);
+out:
 	return err;
 }
 
@@ -1639,6 +1659,11 @@ SYSCALL_DEFINE3(connect, int, fd, struct sockaddr __user *, uservaddr,
 	if (err)
 		goto out_put;
 
+	err = (((struct sockaddr *)&address)->sa_family != AF_UNIX &&
+		current->network) ? -EPERM : 0;
+	if (err)
+		goto out_put;
+
 	err = sock->ops->connect(sock, (struct sockaddr *)&address, addrlen,
 				 sock->file->f_flags);
 out_put:
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f255119..2d984a9 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -797,6 +797,10 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 		goto out;
 	addr_len = err;
 
+	err = (current->network && !sunaddr->sun_path[0]) ? -EPERM : 0;
+	if (err)
+		goto out;
+
 	mutex_lock(&u->readlock);
 
 	err = -EINVAL;
@@ -934,6 +938,10 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
 			goto out;
 		alen = err;
 
+		err = (current->network && !sunaddr->sun_path[0]) ? -EPERM : 0;
+		if (err)
+			goto out;
+
 		if (test_bit(SOCK_PASSCRED, &sock->flags) &&
 		    !unix_sk(sk)->addr && (err = unix_autobind(sock)) != 0)
 			goto out;
@@ -1033,6 +1041,10 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
 		goto out;
 	addr_len = err;
 
+	err = (current->network && !sunaddr->sun_path[0]) ? -EPERM : 0;
+	if (err)
+		goto out;
+
 	if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr &&
 	    (err = unix_autobind(sock)) != 0)
 		goto out;
@@ -1370,6 +1382,10 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		if (err < 0)
 			goto out;
 		namelen = err;
+
+		err = -EPERM;
+		if (current->network && !sunaddr->sun_path[0])
+			goto out;
 	} else {
 		sunaddr = NULL;
 		err = -ENOTCONN;
@@ -1520,6 +1536,9 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	if (msg->msg_namelen) {
 		err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
 		goto out_err;
+		/* prctl(PR_SET_NETWORK) requires no change here since
+		 * connection-less unix stream sockets are not supported.
+		 * See Documentation/prctl_network.txt for details. */
 	} else {
 		sunaddr = NULL;
 		err = -ENOTCONN;
-- 
1.5.6.5
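
The address checks that this patch adds to bind() and connect() can be
summarised as a single predicate. The userspace model below is only a sketch:
model_check_addr() and model_unix_addr() are made-up names, and it ignores
autobind, the LSM hooks, and everything else the real code paths do.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Model of the bind()/connect() checks: returns 0 if the operation may
 * proceed and -EPERM if it is refused. network_flags stands in for
 * current->network. */
static int model_check_addr(unsigned long network_flags,
			    const struct sockaddr *addr)
{
	const struct sockaddr_un *sunaddr;

	if (!network_flags)
		return 0;	/* networking has not been disabled */

	if (addr->sa_family != AF_UNIX)
		return -EPERM;	/* mirrors the net/socket.c hunks */

	sunaddr = (const struct sockaddr_un *)addr;
	if (!sunaddr->sun_path[0])
		return -EPERM;	/* abstract name: net/unix/af_unix.c hunks */

	return 0;		/* filesystem socket, left to Unix DAC */
}

/* Illustrative helper: build an AF_UNIX address. Abstract names are
 * marked by a leading NUL byte in sun_path. */
static void model_unix_addr(struct sockaddr_un *sunaddr, const char *path,
			    int abstract)
{
	memset(sunaddr, 0, sizeof(*sunaddr));
	sunaddr->sun_family = AF_UNIX;
	strncpy(sunaddr->sun_path + (abstract ? 1 : 0), path,
		sizeof(sunaddr->sun_path) - 2);
}
```

With PR_NETWORK_OFF set, an address such as /tmp/.X11-unix/X0 is still
accepted, which is what keeps X clients working, while an abstract name or
any non-AF_UNIX address is refused.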



* [PATCH] Security: Document prctl(PR_{GET,SET}_NETWORK).
  2009-12-16 15:30         ` Michael Stone
  2009-12-16 15:32           ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Michael Stone
  2009-12-16 15:32           ` [PATCH] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics Michael Stone
@ 2009-12-16 15:32           ` Michael Stone
  2 siblings, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-16 15:32 UTC (permalink / raw)
  To: Ulrich Drepper
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

Explain the purpose, interface, and semantics of the
prctl(PR_{GET,SET}_NETWORK) facility.

Also reference some example userland clients.

Signed-off-by: Michael Stone <michael@laptop.org>
---
 Documentation/prctl_network.txt |   69 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 69 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/prctl_network.txt

diff --git a/Documentation/prctl_network.txt b/Documentation/prctl_network.txt
new file mode 100644
index 0000000..bcf2f72
--- /dev/null
+++ b/Documentation/prctl_network.txt
@@ -0,0 +1,69 @@
+Purpose
+-------
+
+Daniel Bernstein has observed [1] that security-conscious userland processes
+may benefit from the ability to irrevocably remove their ability to create,
+bind, connect to, or send messages except in the case of previously connected
+sockets or AF_UNIX filesystem sockets.
+
+This facility is particularly attractive to security platforms like OLPC
+Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4] because:
+
+  * it integrates well with standard techniques for writing privilege-separated
+    Unix programs
+
+  * it integrates well with the need to perform limited socket I/O, e.g., when
+    running X clients
+
+  * it's available to unprivileged programs
+
+  * it's a discretionary feature available to all of distributors,
+    administrators, authors, and users
+
+  * its effect is entirely local, rather than global (like netfilter)
+
+  * it's simple enough to have some hope of being used correctly
+
+Implementation
+--------------
+
+After considering implementations based on the Linux Security Module (LSM)
+framework, on SELinux in particular, on network namespaces (CLONE_NEWNET), and
+on direct modification of the kernel syscall and task_struct APIs, we came to
+the conclusion that the best way to implement this feature was to extend the
+prctl() framework with a new pair of options named PR_{GET,SET}_NETWORK. These
+options cause prctl() to read or modify "current->network".
+
+Semantics
+---------
+
+current->network is a flags field which is preserved across all variants of
+fork() and exec().
+
+Writes which attempt to clear bits in current->network return -EPERM.
+
+The default value for current->network is named PR_NETWORK_ON and is defined
+to be 0.
+
+Presently, only one flag is defined: PR_NETWORK_OFF.
+
+More flags may be defined in the future if they become needed.
+
+Attempts to set undefined flags result in -EINVAL.
+
+When PR_NETWORK_OFF is set, implementations of syscalls which may be used by
+the current process to perform autonomous networking will return -EPERM. For
+example, calls to socket(), bind(), connect(), sendmsg(), and ptrace() will
+return -EPERM except when we are manipulating an AF_UNIX socket whose name
+does not begin with \0 or, in the case of sendmsg(), unless we are manipulating
+a previously connected socket, i.e. one with
+
+  msg.msg_name == NULL && msg.msg_namelen == 0.
+
+References
+----------
+
+[1]: http://cr.yp.to/unix/disablenetwork.html
+[2]: http://wiki.laptop.org/go/OLPC_Bitfrost
+[3]: http://wiki.laptop.org/go/Rainbow
+[4]: http://plash.beasts.org/
-- 
1.5.6.5
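
The sendmsg() rule described in the Semantics section can be written down as
a short predicate. This is a userspace model of the check the semantics patch
adds to __sock_sendmsg(), under the assumption that the caller already knows
the socket's family; model_check_sendmsg() is an illustrative name, and the
extra abstract-name check in unix_dgram_sendmsg() is not modelled.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Model of the generic sendmsg() check: with networking disabled, a
 * non-AF_UNIX send is refused as soon as the message carries an
 * explicit destination. network_flags stands in for current->network. */
static int model_check_sendmsg(unsigned long network_flags, int family,
			       const struct msghdr *msg)
{
	if (family != AF_UNIX && network_flags &&
	    (msg->msg_name != NULL || msg->msg_namelen != 0))
		return -EPERM;

	return 0;
}
```

A message with msg_name == NULL and msg_namelen == 0 can only travel over an
association the socket already has, so sockets connected before the flag was
set keep working.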



* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-16 15:32           ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Michael Stone
@ 2009-12-16 15:59             ` Andi Kleen
  2009-12-17  1:25               ` Michael Stone
  2009-12-17  9:25             ` Américo Wang
  2009-12-17 17:23             ` Randy Dunlap
  2 siblings, 1 reply; 54+ messages in thread
From: Andi Kleen @ 2009-12-16 15:59 UTC (permalink / raw)
  To: Michael Stone
  Cc: Ulrich Drepper, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

On Wed, Dec 16, 2009 at 10:32:43AM -0500, Michael Stone wrote:
> Daniel Bernstein has observed [1] that security-conscious userland processes
> may benefit from the ability to irrevocably remove their ability to create,
> bind, connect to, or send messages except in the case of previously connected
> sockets or AF_UNIX filesystem sockets. We provide this facility by implementing
> support for a new prctl(PR_SET_NETWORK) flag named PR_NETWORK_OFF.
> 
> This facility is particularly attractive to security platforms like OLPC
> Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4].

What would stop them from ptracing someone else running under the same
uid who still has the network access? If you ptrace you can do
arbitrary system calls.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-16 15:59             ` Andi Kleen
@ 2009-12-17  1:25               ` Michael Stone
  2009-12-17  8:52                 ` Andi Kleen
       [not found]                 ` <fb69ef3c0912170906t291a37c4r6c4758ddc7dd300b@mail.gmail.com>
  0 siblings, 2 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-17  1:25 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Michael Stone, Ulrich Drepper, linux-kernel, netdev,
	linux-security-module, Andi Kleen, David Lang, Oliver Hartkopp,
	Alan Cox, Herbert Xu, Valdis Kletnieks, Bryan Donlan,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Eric W. Biederman, Bernie Innocenti, Mark Seaborn

Andi Kleen wrote:
> On Wed, Dec 16, 2009 at 10:32:43AM -0500, Michael Stone wrote:
>> Daniel Bernstein has observed [1] that security-conscious userland processes
>> may benefit from the ability to irrevocably remove their ability to create,
>> bind, connect to, or send messages except in the case of previously
>> connected sockets or AF_UNIX filesystem sockets. We provide this facility by
>> implementing support for a new prctl(PR_SET_NETWORK) flag named
>> PR_NETWORK_OFF.
>> 
>> This facility is particularly attractive to security platforms like OLPC
>> Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4].
> 
> What would stop them from ptracing someone else running under the same
> uid who still has the network access? 

Just like in the (revised from last year) rlimits version, there's a hunk in
the prctl_network semantics patch which disables networking-via-ptrace() like
so:

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 23bd09c..5b38db0 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -151,6 +151,8 @@ int __ptrace_may_access(struct task_struct *task, unsigned int mode)
                dumpable = get_dumpable(task->mm);
        if (!dumpable && !capable(CAP_SYS_PTRACE))
                return -EPERM;
+       if (current->network)
+               return -EPERM;

        return security_ptrace_access_check(task, mode);
  }

More questions?

Regards, and thanks for your interest,

Michael


* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-17  1:25               ` Michael Stone
@ 2009-12-17  8:52                 ` Andi Kleen
       [not found]                 ` <fb69ef3c0912170906t291a37c4r6c4758ddc7dd300b@mail.gmail.com>
  1 sibling, 0 replies; 54+ messages in thread
From: Andi Kleen @ 2009-12-17  8:52 UTC (permalink / raw)
  To: Michael Stone
  Cc: Andi Kleen, Ulrich Drepper, linux-kernel, netdev,
	linux-security-module, David Lang, Oliver Hartkopp, Alan Cox,
	Herbert Xu, Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

On Wed, Dec 16, 2009 at 08:25:40PM -0500, Michael Stone wrote:
> Andi Kleen wrote:
>> On Wed, Dec 16, 2009 at 10:32:43AM -0500, Michael Stone wrote:
>>> Daniel Bernstein has observed [1] that security-conscious userland processes
>>> may benefit from the ability to irrevocably remove their ability to create,
>>> bind, connect to, or send messages except in the case of previously
>>> connected sockets or AF_UNIX filesystem sockets. We provide this facility by
>>> implementing support for a new prctl(PR_SET_NETWORK) flag named
>>> PR_NETWORK_OFF.
>>>
>>> This facility is particularly attractive to security platforms like OLPC
>>> Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4].
>>
>> What would stop them from ptracing someone else running under the same
>> uid who still has the network access? 
>
> Just like in the (revised from last year) rlimits version, there's a hunk in
> the prctl_network semantics patch which disables networking-via-ptrace() like
> so:

Hmm, ok. Missed that. I hope there are not more big holes. Obviously
can't allow to change other executables, but I guess that's ok.

It's still some overlap with network name spaces, but there are also
some not directly mappable semantic differences.

I haven't reviewed the patches in detail btw.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-16 15:32           ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Michael Stone
  2009-12-16 15:59             ` Andi Kleen
@ 2009-12-17  9:25             ` Américo Wang
  2009-12-17 16:28               ` Michael Stone
  2009-12-17 17:23             ` Randy Dunlap
  2 siblings, 1 reply; 54+ messages in thread
From: Américo Wang @ 2009-12-17  9:25 UTC (permalink / raw)
  To: Michael Stone
  Cc: Ulrich Drepper, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

On Wed, Dec 16, 2009 at 11:32 PM, Michael Stone <michael@laptop.org> wrote:
> Daniel Bernstein has observed [1] that security-conscious userland processes
> may benefit from the ability to irrevocably remove their ability to create,
> bind, connect to, or send messages except in the case of previously connected
> sockets or AF_UNIX filesystem sockets. We provide this facility by implementing
> support for a new prctl(PR_SET_NETWORK) flag named PR_NETWORK_OFF.
>
> This facility is particularly attractive to security platforms like OLPC
> Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4].
>
> [1]: http://cr.yp.to/unix/disablenetwork.html
> [2]: http://wiki.laptop.org/go/OLPC_Bitfrost
> [3]: http://wiki.laptop.org/go/Rainbow
> [4]: http://plash.beasts.org/
>
> Signed-off-by: Michael Stone <michael@laptop.org>
> ---
>  include/linux/prctl.h         |    7 +++++++
>  include/linux/prctl_network.h |    7 +++++++
>  include/linux/sched.h         |    2 ++
>  kernel/Makefile               |    2 +-
>  kernel/prctl_network.c        |   37 +++++++++++++++++++++++++++++++++++++
>  kernel/sys.c                  |    7 +++++++
>  6 files changed, 61 insertions(+), 1 deletions(-)
>  create mode 100644 include/linux/prctl_network.h
>  create mode 100644 kernel/prctl_network.c
>
> diff --git a/include/linux/prctl.h b/include/linux/prctl.h
> index a3baeb2..4eb4110 100644
> --- a/include/linux/prctl.h
> +++ b/include/linux/prctl.h
> @@ -102,4 +102,11 @@
>
>  #define PR_MCE_KILL_GET 34
>
> +/* Get/set process disable-network flags */
> +#define PR_SET_NETWORK 35
> +#define PR_GET_NETWORK 36
> +# define PR_NETWORK_ON        0
> +# define PR_NETWORK_OFF       1
> +# define PR_NETWORK_ALL_FLAGS 1
> +
>  #endif /* _LINUX_PRCTL_H */
> diff --git a/include/linux/prctl_network.h b/include/linux/prctl_network.h
> new file mode 100644
> index 0000000..2db83eb
> --- /dev/null
> +++ b/include/linux/prctl_network.h
> @@ -0,0 +1,7 @@
> +#ifndef _LINUX_PRCTL_NETWORK_H
> +#define _LINUX_PRCTL_NETWORK_H
> +
> +extern long prctl_get_network(void);
> +extern long prctl_set_network(unsigned long);
> +
> +#endif /* _LINUX_PRCTL_NETWORK_H */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 5c858f3..751d372 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1395,6 +1395,8 @@ struct task_struct {
>        unsigned int sessionid;
>  #endif
>        seccomp_t seccomp;
> +/* Flags for limiting networking via prctl(PR_SET_NETWORK). */
> +  unsigned long network;
>
>  /* Thread group tracking */
>        u32 parent_exec_id;
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 864ff75..cafbff2 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -10,7 +10,7 @@ obj-y     = sched.o fork.o exec_domain.o panic.o printk.o \
>            kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
>            hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
>            notifier.o ksysfs.o pm_qos_params.o sched_clock.o cred.o \
> -           async.o
> +           async.o prctl_network.o
>  obj-y += groups.o
>
>  ifdef CONFIG_FUNCTION_TRACER
> diff --git a/kernel/prctl_network.c b/kernel/prctl_network.c
> new file mode 100644
> index 0000000..d173716
> --- /dev/null
> +++ b/kernel/prctl_network.c
> @@ -0,0 +1,37 @@
> +/*
> + * linux/kernel/prctl_network.c
> + *
> + * Copyright 2009  Michael Stone <michael@laptop.org>
> + *
> + * Turn off a process's ability to access new networks.
> + * See Documentation/prctl_network.txt for details.
> + */
> +
> +#include <linux/prctl_network.h>
> +#include <linux/sched.h>
> +#include <linux/prctl.h>
> +
> +long prctl_get_network(void)
> +{
> +       return current->network;
> +}
> +
> +long prctl_set_network(unsigned long network_flags)
> +{
> +       long ret;
> +
> +       /* only dropping access is permitted */
> +       ret = -EPERM;
> +        if (current->network & ~network_flags)
> +               goto out;
> +
> +       ret = -EINVAL;
> +       if (network_flags & ~PR_NETWORK_ALL_FLAGS)
> +               goto out;
> +
> +       current->network = network_flags;
> +       ret = 0;
> +
> +out:
> +       return ret;
> +}


Sorry, I didn't follow the original discussion.
Any reason why you introduce a new source file?
Why not just add them to kernel/sys.c?


> diff --git a/kernel/sys.c b/kernel/sys.c
> index 20ccfb5..4eccc66 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -35,6 +35,7 @@
>  #include <linux/cpu.h>
>  #include <linux/ptrace.h>
>  #include <linux/fs_struct.h>
> +#include <linux/prctl_network.h>
>
>  #include <linux/compat.h>
>  #include <linux/syscalls.h>
> @@ -1576,6 +1577,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>                        else
>                                error = PR_MCE_KILL_DEFAULT;
>                        break;
> +               case PR_SET_NETWORK:
> +                       error = prctl_set_network(arg2);
> +                       break;
> +               case PR_GET_NETWORK:
> +                       error = prctl_get_network();
> +                       break;
>                default:
>                        error = -EINVAL;
>                        break;
> --
> 1.5.6.5
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-17  9:25             ` Américo Wang
@ 2009-12-17 16:28               ` Michael Stone
  0 siblings, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-17 16:28 UTC (permalink / raw)
  To: Américo Wang
  Cc: Michael Stone, Ulrich Drepper, linux-kernel, netdev,
	linux-security-module, Andi Kleen, David Lang, Oliver Hartkopp,
	Alan Cox, Herbert Xu, Valdis Kletnieks, Bryan Donlan,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Eric W. Biederman, Bernie Innocenti, Mark Seaborn

Américo Wang wrote:

> Any reason why you introduce a new source file?
> Why not just add them to kernel/sys.c?

Modifying kernel/sys.c would be fine too, I think. I'll incorporate this
suggestion into the next revision of the prctl()-based patches.

Thanks!

Michael

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
       [not found]                 ` <fb69ef3c0912170906t291a37c4r6c4758ddc7dd300b@mail.gmail.com>
@ 2009-12-17 17:14                   ` Andi Kleen
  2009-12-17 22:58                     ` Mark Seaborn
  0 siblings, 1 reply; 54+ messages in thread
From: Andi Kleen @ 2009-12-17 17:14 UTC (permalink / raw)
  To: Mark Seaborn
  Cc: Michael Stone, Andi Kleen, Ulrich Drepper, linux-kernel, netdev,
	linux-security-module, David Lang, Oliver Hartkopp, Alan Cox,
	Herbert Xu, Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti

> This is not very good because in some situations it is useful to disable
> connect() and bind() while still allowing ptracing of other processes.  For
> example, Plash creates a new UID for each sandbox and it is possible to use
> strace and gdb inside a sandbox.  Currently Plash is not able to block
> network access or allow only limited network access.  If you treat ptrace()
> this way we won't have the ability to use strace and gdb while limiting
> network access.

No, that's not what the hunk does. I first thought the same. But it actually
just prevents these processes from initiating ptracing themselves. You can still
attach gdb/strace to them.

Now I'm not sure if that's closing all holes, but at least I can't come
up with any obvious ones currently. Overall, I think I would still prefer a
more general security container.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-16 15:32           ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Michael Stone
  2009-12-16 15:59             ` Andi Kleen
  2009-12-17  9:25             ` Américo Wang
@ 2009-12-17 17:23             ` Randy Dunlap
  2009-12-17 17:25               ` Randy Dunlap
  2 siblings, 1 reply; 54+ messages in thread
From: Randy Dunlap @ 2009-12-17 17:23 UTC (permalink / raw)
  To: Michael Stone
  Cc: Ulrich Drepper, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

On Wed, 16 Dec 2009 10:32:43 -0500 Michael Stone wrote:


> ---
>  include/linux/prctl.h         |    7 +++++++
>  include/linux/prctl_network.h |    7 +++++++
>  include/linux/sched.h         |    2 ++
>  kernel/Makefile               |    2 +-
>  kernel/prctl_network.c        |   37 +++++++++++++++++++++++++++++++++++++
>  kernel/sys.c                  |    7 +++++++
>  6 files changed, 61 insertions(+), 1 deletions(-)
>  create mode 100644 include/linux/prctl_network.h
>  create mode 100644 kernel/prctl_network.c
> 

> diff --git a/kernel/prctl_network.c b/kernel/prctl_network.c
> new file mode 100644
> index 0000000..d173716
> --- /dev/null
> +++ b/kernel/prctl_network.c
> @@ -0,0 +1,37 @@
> +/*
> + * linux/kernel/prctl_network.c
> + *
> + * Copyright 2009  Michael Stone <michael@laptop.org>
> + *
> + * Turn off a process's ability to access new networks.
> + * See Documentation/prctl_network.txt for details.
> + */

Where is Documentation/prctl_network.txt ?
and it should probably be Documentation/prctl/network.txt .

thanks,
---
~Randy

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-17 17:23             ` Randy Dunlap
@ 2009-12-17 17:25               ` Randy Dunlap
  0 siblings, 0 replies; 54+ messages in thread
From: Randy Dunlap @ 2009-12-17 17:25 UTC (permalink / raw)
  To: Michael Stone
  Cc: Ulrich Drepper, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn

On Thu, 17 Dec 2009 09:23:26 -0800 Randy Dunlap wrote:

> On Wed, 16 Dec 2009 10:32:43 -0500 Michael Stone wrote:
> 
> 
> > ---
> >  include/linux/prctl.h         |    7 +++++++
> >  include/linux/prctl_network.h |    7 +++++++
> >  include/linux/sched.h         |    2 ++
> >  kernel/Makefile               |    2 +-
> >  kernel/prctl_network.c        |   37 +++++++++++++++++++++++++++++++++++++
> >  kernel/sys.c                  |    7 +++++++
> >  6 files changed, 61 insertions(+), 1 deletions(-)
> >  create mode 100644 include/linux/prctl_network.h
> >  create mode 100644 kernel/prctl_network.c
> > 
> 
> > diff --git a/kernel/prctl_network.c b/kernel/prctl_network.c
> > new file mode 100644
> > index 0000000..d173716
> > --- /dev/null
> > +++ b/kernel/prctl_network.c
> > @@ -0,0 +1,37 @@
> > +/*
> > + * linux/kernel/prctl_network.c
> > + *
> > + * Copyright 2009  Michael Stone <michael@laptop.org>
> > + *
> > + * Turn off a process's ability to access new networks.
> > + * See Documentation/prctl_network.txt for details.
> > + */
> 
> Where is Documentation/prctl_network.txt ?
> and it should probably be Documentation/prctl/network.txt .

gag, I see it.  Sorry about that.
I think that the file name still needs to be changed.

---
~Randy

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
  2009-12-13 10:05   ` Eric W. Biederman
  2009-12-13 14:21     ` Michael Stone
@ 2009-12-17 17:52     ` Andi Kleen
  1 sibling, 0 replies; 54+ messages in thread
From: Andi Kleen @ 2009-12-17 17:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Rémi Denis-Courmont,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Bernie Innocenti, Mark Seaborn, Linux Containers

> Solve that with an unused uid.  That ptrace_may_access check is
> completely non-intuitive, and a problem if we ever remove the current
> == task security module bug avoidance.

I thought he wanted to do that without suid?

If he can change uids he can as well just use full network namespaces.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
       [not found]       ` <fb69ef3c0912170931l5cbf0e3dh81c88e6502651042@mail.gmail.com>
@ 2009-12-17 18:24         ` Bryan Donlan
  2009-12-17 19:35           ` Bernie Innocenti
  2009-12-17 19:23         ` Bernie Innocenti
  1 sibling, 1 reply; 54+ messages in thread
From: Bryan Donlan @ 2009-12-17 18:24 UTC (permalink / raw)
  To: Mark Seaborn
  Cc: Michael Stone, Eric W. Biederman, linux-kernel, netdev,
	linux-security-module, Andi Kleen, David Lang, Oliver Hartkopp,
	Alan Cox, Herbert Xu, Valdis Kletnieks, Rémi Denis-Courmont,
	Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Bernie Innocenti, Linux Containers

On Thu, Dec 17, 2009 at 12:31 PM, Mark Seaborn <mrs@mythic-beasts.com> wrote:

> Maybe we could fix (b) by making mount namespaces into first class objects
> that can be named through a file descriptor, so that one process can
> manipulate another process's namespace without itself being subject to the
> namespace.

Can this be done using openat() and friends currently? It would seem
the natural way to implement this; open /proc/(pid)/root, then
openat() things from there (or even chdir to it and see the mounts
that it sees from there...)
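
A minimal sketch of the approach described above, assuming a Linux system
with /proc mounted and a caller that is permitted to open the target's
/proc/<pid>/root. The helper names open_root_of() and open_in_root_of() are
hypothetical. Whether the resulting descriptor reflects the target's mount
namespace, rather than only its chroot, depends on the kernel and is exactly
the open question here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Open the root directory as seen by process `pid`.
 * Returns a directory file descriptor, or -1 on error. */
int open_root_of(pid_t pid)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%ld/root", (long)pid);
	return open(path, O_RDONLY | O_DIRECTORY);
}

/* Open `relpath` (a relative path, no leading '/') under the root
 * directory of process `pid`. Returns a file descriptor, or -1. */
int open_in_root_of(pid_t pid, const char *relpath, int flags)
{
	int rootfd, fd;

	assert(relpath != NULL);

	rootfd = open_root_of(pid);
	if (rootfd < 0)
		return -1;

	/* The lookup starts at rootfd rather than at our own cwd. */
	fd = openat(rootfd, relpath, flags);
	close(rootfd);
	return fd;
}
```

Note that absolute symlinks met during the openat() lookup are still resolved
against the caller's own root, so this is only an inspection aid and not a
way to "enter" the other process's view.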

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics.
  2009-12-16 15:32           ` [PATCH] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics Michael Stone
@ 2009-12-17 19:18             ` Eric W. Biederman
  0 siblings, 0 replies; 54+ messages in thread
From: Eric W. Biederman @ 2009-12-17 19:18 UTC (permalink / raw)
  To: Michael Stone
  Cc: Ulrich Drepper, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Bernie Innocenti, Mark Seaborn

Michael Stone <michael@laptop.org> writes:

> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 23bd09c..5b38db0 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -151,6 +151,8 @@ int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>  		dumpable = get_dumpable(task->mm);
>  	if (!dumpable && !capable(CAP_SYS_PTRACE))
>  		return -EPERM;
> +	if (current->network)
> +		return -EPERM;

The principle should be: you gain no privileges by ptracing.
Therefore this check should be:

	if (current->network && !task->network)
		return -EPERM;

Which keeps the ptrace logic from being a larger hammer than it needs
to be.

Eric

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
       [not found]       ` <fb69ef3c0912170931l5cbf0e3dh81c88e6502651042@mail.gmail.com>
  2009-12-17 18:24         ` Bryan Donlan
@ 2009-12-17 19:23         ` Bernie Innocenti
  1 sibling, 0 replies; 54+ messages in thread
From: Bernie Innocenti @ 2009-12-17 19:23 UTC (permalink / raw)
  To: Mark Seaborn
  Cc: Michael Stone, Eric W. Biederman, linux-kernel, netdev,
	linux-security-module, Andi Kleen, David Lang, Oliver Hartkopp,
	Alan Cox, Herbert Xu, Valdis Kletnieks, Bryan Donlan,
	Rémi Denis-Courmont, Evgeniy Polyakov, C. Scott Ananian,
	James Morris, Linux Containers

On Thu, 2009-12-17 at 17:31 +0000, Mark Seaborn wrote:


> The reason chroot() and clone()/CLONE_NEWNS are privileged is that
> they provide a way to violate the assumptions of setuid/setgid
> executables.  If we add a per-process flag that prevents a process
> from exec'ing setuid executables, we could allow chroot() and
> CLONE_NEWNS when that flag is set.  That fixes (a).

I think this would be great.

> 
> Maybe we could fix (b) by making mount namespaces into first class
> objects that can be named through a file descriptor, so that one
> process can manipulate another process's namespace without itself
> being subject to the namespace.

I think Michael's problem with debugging is much more fundamental:
application programmers get confused when some filesystem operations
fail in the debugged process while they work fine from the shell.

It would help if the kernel provided a way for a process to switch to
another process' namespace. Even better, it would be great if existing
namespaces could be mounted at an arbitrary position within another
namespace. Then one could use traditional shell tools to inspect it, or
even chroot into it.

</delirium>

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
  2009-12-17 18:24         ` Bryan Donlan
@ 2009-12-17 19:35           ` Bernie Innocenti
  2009-12-17 19:53             ` Bryan Donlan
  0 siblings, 1 reply; 54+ messages in thread
From: Bernie Innocenti @ 2009-12-17 19:35 UTC (permalink / raw)
  To: Bryan Donlan
  Cc: Mark Seaborn, Michael Stone, Eric W. Biederman, linux-kernel,
	netdev, linux-security-module, Andi Kleen, David Lang,
	Oliver Hartkopp, Alan Cox, Herbert Xu, Valdis Kletnieks,
	Rémi Denis-Courmont, Evgeniy Polyakov, C. Scott Ananian,
	James Morris, Linux Containers

On Thu, 2009-12-17 at 13:24 -0500, Bryan Donlan wrote:
> Can this be done using openat() and friends currently? It would seem
> the natural way to implement this; open /proc/(pid)/root, then
> openat() things from there (or even chdir to it and see the mounts
> that it sees from there...)

Yeah, but /proc/<pid>/root is just a symlink. It's correct for chroots,
but I doubt it can be meaningful for per-process namespaces.

If we were to implement Mark Seaborn's idea of naming
namespaces, /proc/<pid>/rootfd would be a file descriptor providing
access to the namespace through some fancy ioctls.

Or maybe not. Could such a file-descriptor be used as the source
argument to mount(), perhaps along with a new MS_NS flag?

Alternatively, perhaps one could come up with a userspace solution:
read /proc/<pid>/mounts and repeat all mounts, perhaps with a prefix.
The downsides are that it would require superuser privs and wouldn't
automatically stay synchronized with the real namespace.

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Network isolation with RLIMIT_NETWORK, cont'd.
  2009-12-17 19:35           ` Bernie Innocenti
@ 2009-12-17 19:53             ` Bryan Donlan
  0 siblings, 0 replies; 54+ messages in thread
From: Bryan Donlan @ 2009-12-17 19:53 UTC (permalink / raw)
  To: Bernie Innocenti
  Cc: Mark Seaborn, Michael Stone, Eric W. Biederman, linux-kernel,
	netdev, linux-security-module, Andi Kleen, David Lang,
	Oliver Hartkopp, Alan Cox, Herbert Xu, Valdis Kletnieks,
	Rémi Denis-Courmont, Evgeniy Polyakov, C. Scott Ananian,
	James Morris, Linux Containers

On Thu, Dec 17, 2009 at 2:35 PM, Bernie Innocenti <bernie@codewiz.org> wrote:
> On Thu, 2009-12-17 at 13:24 -0500, Bryan Donlan wrote:
>> Can this be done using openat() and friends currently? It would seem
>> the natural way to implement this; open /proc/(pid)/root, then
>> openat() things from there (or even chdir to it and see the mounts
>> that it sees from there...)
>
> Yeah, but /proc/<pid>/root is just a symlink. It's correct for chroots,
> but I doubt it can be meaningful for per-process namespaces.

The files in /proc/<pid>/fd are 'just symlinks', but opening them can
provide access to objects (eg, deleted files) not accessible through
the normal filesystem namespace. I see no reason, API-wise, why
/proc/<pid>/root couldn't be extended similarly - but I've not looked
at the namespaces implementation, so maybe there's some reason it'd be
difficult to implement...

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-17 17:14                   ` Andi Kleen
@ 2009-12-17 22:58                     ` Mark Seaborn
  2009-12-18  3:00                       ` Michael Stone
  0 siblings, 1 reply; 54+ messages in thread
From: Mark Seaborn @ 2009-12-17 22:58 UTC (permalink / raw)
  To: andi
  Cc: michael, drepper, linux-kernel, netdev, linux-security-module,
	david, socketcan, alan, herbert, Valdis.Kletnieks, bdonlan, zbr,
	cscott, jmorris, ebiederm, bernie

Andi Kleen <andi@firstfloor.org> wrote:

> > This is not very good because in some situations it is useful to disable
> > connect() and bind() while still allowing ptracing of other processes.  For
> > example, Plash creates a new UID for each sandbox and it is possible to use
> > strace and gdb inside a sandbox.  Currently Plash is not able to block
> > network access or allow only limited network access.  If you treat ptrace()
> > this way we won't have the ability to use strace and gdb while limiting
> > network access.
> 
> No, that's not what the hunk does. I first thought the same. But it actually
> just prevents these processes from initiating ptracing themselves. You can still
> attach gdb/strace to them.

No, I specifically mean running gdb/strace *inside* the sandbox so
that sandboxed processes can initiate ptrace on other processes inside
the same sandbox.  At the moment you can create a Plash sandbox and
run strace inside it with a command like the following, and strace
will successfully ptrace the subprocess that it spawns:

pola-run -B -e strace echo foo

I wouldn't want this to stop working just because the "disable
networking" flag has been set for all processes in the sandbox.

In practice Plash would set "disable networking" for all sandboxed
processes and then selectively enable limited network access by
passing file descriptors into the sandboxed processes via a socket.
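
The descriptor-passing step can be sketched with the standard SCM_RIGHTS
ancillary message on a connected AF_UNIX socket. This is a generic
illustration, not Plash's actual code; send_fd() and recv_fd() are
hypothetical helper names. Because sendmsg() is called here without a
destination address (msg_name stays NULL), it operates on an
already-connected socket, which is the case the proposed restriction leaves
open.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor `fd` over the connected AF_UNIX socket `sock`.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
	char byte = 'F';	/* at least one byte of real data must go along */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment of buf */
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(sock >= 0 && fd >= 0);

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`. Returns the new fd, or -1. */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

A supervisor outside the sandbox would open or connect the socket it is
willing to grant, call send_fd() on a socketpair() shared with the sandboxed
process, and the sandboxed process would pick it up with recv_fd().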


> Now I'm not sure if that's closing all holes, but at least I can't come
> up with any obvious ones currently. Overall, I think I would still prefer a
> more general security container.

Well, yeah, adding a boolean just for network access seems pretty ad-hoc.

It sets alarm bells ringing if "disable networking" also functions as
"disable initiating ptrace()".  Isn't there a way of doing the latter
independently?

Cheers,
Mark

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-17 22:58                     ` Mark Seaborn
@ 2009-12-18  3:00                       ` Michael Stone
  2009-12-18  3:29                         ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2) Michael Stone
                                           ` (4 more replies)
  0 siblings, 5 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-18  3:00 UTC (permalink / raw)
  To: Mark Seaborn
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang,
	Michael Stone

Mark, Andi, Eric, and Randy,

First, thanks for all the comments, questions, and suggestions. They're very
much appreciated.

@Randy: In the revised patches that follow, I moved the documentation to

   Documentation/prctl/network.txt 

as you requested.

@Américo: In the revised patches that follow, I moved prctl_{get,set}_network()
into kernel/sys.c as you suggested.

@Eric, Mark: regarding ptrace()-ing from network-disabled processes: I agree
that this functionality is critical and I have altered the
__ptrace_may_access() check to support it. 

The new rule I propose is equivalent to the rule I used in prctl_set_network()
and is similar to the rule that Eric proposed earlier this afternoon. I now
propose:

   "You may ptrace() any process that has all the network restrictions you do."

This should take care of your use of strace without bending anything else into
an unnatural shape.
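
As a userspace model only (not kernel code), the shared rule can be sketched
as a single bitmask test. The flag values mirror the patch; the function
names has_all_restrictions(), model_set_network(), and
model_ptrace_may_access() are hypothetical and exist just to make the
comparison explicit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag values as defined in the proposed include/linux/prctl.h hunk. */
#define PR_NETWORK_OFF       1UL
#define PR_NETWORK_ALL_FLAGS 1UL

/* True when `theirs` carries every restriction bit that `mine` has. */
int has_all_restrictions(unsigned long mine, unsigned long theirs)
{
	return (mine & ~theirs) == 0;
}

/* Model of prctl_set_network(): restrictions may be added, never dropped.
 * Checks run in the same order as in the patch. */
long model_set_network(unsigned long *flags, unsigned long requested)
{
	assert(flags != NULL);

	if (!has_all_restrictions(*flags, requested))
		return -EPERM;	/* would clear a bit that is already set */
	if (requested & ~PR_NETWORK_ALL_FLAGS)
		return -EINVAL;	/* unknown flag bit */

	*flags = requested;
	return 0;
}

/* Model of the check added to __ptrace_may_access(): the tracer may not
 * hold a restriction that the target lacks. */
int model_ptrace_may_access(unsigned long tracer, unsigned long target)
{
	return has_all_restrictions(tracer, target) ? 0 : -EPERM;
}
```

Under this model a network-disabled strace can trace a network-disabled
child, an unrestricted debugger can trace anything, and only the
restricted-tracer, unrestricted-target combination is refused.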

------------------------

@Andi, Mark

Next, let's talk about the "ad-hoc"-ness of the current patches. There seem to
be three issues:

   1. What is "network access"?

   2. How should "network access" be access-controlled?

   3. Should we add a per-process boolean flag enabling and disabling some
      kinds of network access?

Here are my thoughts:

   1. "Network access" refers to the ability of a security principal to send
      messages to or to receive messages from a different principal. For our
      purposes, principals may be thought of as processes.

   2. Messages are sent and received over "channels". Common channels include
      open file descriptors, memory segments, message queues, file systems,
      process signalling trees, and ptrace attachments.
      
   3. The creation of new channels between principals is a security-sensitive
      operation. 

   4. The decision about whether or not to authorize opening a new channel
      between security principals should be based on five inputs:

            a) the general system policy, if any, of the sysadmin
            b) the personal policies, if any, of the human operator(s)
            c) the authors' policies, if any, in security principal(s)
            d) the channel being requested
            e) security labels like pids, uids, and acls labeling the principals

   5. Linux today has pretty good support for controlling the creation of
      channels involving the filesystem and involving shared daemons. It has
      mediocre support for access control involving sysv-ipc mechanisms. It has
      terrible support for access control involving non-local principals like
      "the collection of people and programs receiving packets sent to
      destination 18.0.0.1:80 from source 192.168.0.3:34661".

   6. We can make it easier and safer to write and to run software by improving
      the access control mechanisms available for deciding whether or not to
      open new channels.

   7. The best way to improve said access control mechanisms today is to add a
      facility that permits any process to drop the (heretofore not formally
      recognized) privilege that causes the kernel to open new *insufficiently
      access controlled* network channels.

   8. Anything that has to pass a regular Unix uid/gid/world discretionary
      access check *and* which the partner principal(s) have the opportunity to
      turn down is sufficiently access controlled. Everything else is not.

      (For example, filesystem Unix sockets are sufficiently controlled but
      ptrace is not because the process being traced has no opportunity to say
      "don't open this channel".)

   9. My patch implements the simplest and most usable improvement available in
      this area.
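
To make point 8 concrete, here is a userspace sketch of how an address could
be classified under that criterion. It relies on the Linux convention that an
abstract AF_UNIX address starts with a NUL byte in sun_path; the function
names is_abstract_unix_addr() and model_check_addr() are hypothetical, and the
model ignores details such as autobind and already-connected sockets.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Abstract AF_UNIX addresses have no filesystem entry, so they are not
 * covered by ordinary file permission checks. */
int is_abstract_unix_addr(const struct sockaddr_un *sun)
{
	return sun->sun_family == AF_UNIX && sun->sun_path[0] == '\0';
}

/* Model of the decision for bind()/connect() to `addr` made by a task
 * whose restriction flags are `network_flags`.
 * Returns 0 if allowed, -EPERM if refused. */
int model_check_addr(unsigned long network_flags, const struct sockaddr *addr)
{
	assert(addr != NULL);

	if (!network_flags)
		return 0;		/* unrestricted task */
	if (addr->sa_family != AF_UNIX)
		return -EPERM;		/* e.g. AF_INET, AF_INET6 */
	if (is_abstract_unix_addr((const struct sockaddr_un *)addr))
		return -EPERM;		/* no filesystem DAC applies */
	return 0;			/* filesystem socket: DAC decides */
}
```

With this classification a restricted client can still reach an X server
listening on a filesystem socket, while TCP connections and abstract sockets
are refused.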

Critiques?

Alternatively, you've both expressed some interest in a more general facility
for restricting network access. Does either of you have specific ideas on what
you'd like to see?

Michael

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2)
  2009-12-18  3:00                       ` Michael Stone
@ 2009-12-18  3:29                         ` Michael Stone
  2009-12-18  4:43                           ` Valdis.Kletnieks
  2009-12-18 15:46                           ` Alan Cox
  2009-12-18  3:31                         ` [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v2) Michael Stone
                                           ` (3 subsequent siblings)
  4 siblings, 2 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-18  3:29 UTC (permalink / raw)
  To: Mark Seaborn
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang,
	Michael Stone

Daniel Bernstein has observed [1] that security-conscious userland processes
may benefit from the ability to irrevocably give up their ability to create
sockets, to bind or connect them, or to send messages, except over previously
connected sockets or AF_UNIX filesystem sockets. We provide this facility by
implementing support for a new prctl(PR_SET_NETWORK) flag named PR_NETWORK_OFF.

This facility is particularly attractive to security platforms like OLPC
Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4].

[1]: http://cr.yp.to/unix/disablenetwork.html
[2]: http://wiki.laptop.org/go/OLPC_Bitfrost
[3]: http://wiki.laptop.org/go/Rainbow
[4]: http://plash.beasts.org/

Signed-off-by: Michael Stone <michael@laptop.org>
---
  include/linux/prctl.h         |    7 +++++++
  include/linux/prctl_network.h |    7 +++++++
  include/linux/sched.h         |    2 ++
  kernel/sys.c                  |   32 ++++++++++++++++++++++++++++++++
  4 files changed, 48 insertions(+), 0 deletions(-)
  create mode 100644 include/linux/prctl_network.h

diff --git a/include/linux/prctl.h b/include/linux/prctl.h
index a3baeb2..4eb4110 100644
--- a/include/linux/prctl.h
+++ b/include/linux/prctl.h
@@ -102,4 +102,11 @@
  
  #define PR_MCE_KILL_GET 34
  
+/* Get/set process disable-network flags */
+#define PR_SET_NETWORK	35
+#define PR_GET_NETWORK	36
+# define PR_NETWORK_ON        0
+# define PR_NETWORK_OFF       1
+# define PR_NETWORK_ALL_FLAGS 1
+
  #endif /* _LINUX_PRCTL_H */
diff --git a/include/linux/prctl_network.h b/include/linux/prctl_network.h
new file mode 100644
index 0000000..2db83eb
--- /dev/null
+++ b/include/linux/prctl_network.h
@@ -0,0 +1,7 @@
+#ifndef _LINUX_PRCTL_NETWORK_H
+#define _LINUX_PRCTL_NETWORK_H
+
+extern long prctl_get_network(void);
+extern long prctl_set_network(unsigned long);
+
+#endif /* _LINUX_PRCTL_NETWORK_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c858f3..751d372 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1395,6 +1395,8 @@ struct task_struct {
  	unsigned int sessionid;
  #endif
  	seccomp_t seccomp;
+/* Flags for limiting networking via prctl(PR_SET_NETWORK). */
+  unsigned long network;
  
  /* Thread group tracking */
     	u32 parent_exec_id;
diff --git a/kernel/sys.c b/kernel/sys.c
index 20ccfb5..411a2ff 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -35,6 +35,7 @@
  #include <linux/cpu.h>
  #include <linux/ptrace.h>
  #include <linux/fs_struct.h>
+#include <linux/prctl_network.h>
  
  #include <linux/compat.h>
  #include <linux/syscalls.h>
@@ -1576,6 +1577,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
  			else
  				error = PR_MCE_KILL_DEFAULT;
  			break;
+		case PR_SET_NETWORK:
+			error = prctl_set_network(arg2);
+			break;
+		case PR_GET_NETWORK:
+			error = prctl_get_network();
+			break;
  		default:
  			error = -EINVAL;
  			break;
@@ -1583,6 +1590,31 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
  	return error;
  }
  
+long prctl_get_network(void)
+{
+	return current->network;
+}
+
+long prctl_set_network(unsigned long network_flags)
+{
+	long ret;
+
+	/* only dropping access is permitted */
+	ret = -EPERM;
+        if (current->network & ~network_flags)
+		goto out;
+
+	ret = -EINVAL;
+	if (network_flags & ~PR_NETWORK_ALL_FLAGS)
+		goto out;
+
+	current->network = network_flags;
+	ret = 0;
+
+out:
+	return ret;
+}
+
  SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep,
  		struct getcpu_cache __user *, unused)
  {
-- 
1.6.6.rc1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v2)
  2009-12-18  3:00                       ` Michael Stone
  2009-12-18  3:29                         ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2) Michael Stone
@ 2009-12-18  3:31                         ` Michael Stone
  2009-12-18  3:57                           ` Eric W. Biederman
  2009-12-18  3:32                         ` [PATCH 3/3] Security: Document prctl(PR_{GET,SET}_NETWORK). (v2) Michael Stone
                                           ` (2 subsequent siblings)
  4 siblings, 1 reply; 54+ messages in thread
From: Michael Stone @ 2009-12-18  3:31 UTC (permalink / raw)
  To: Mark Seaborn
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang,
	Michael Stone

Return -EPERM any time we try to __sock_create(), sys_connect(), sys_bind(),
sys_sendmsg(), or __ptrace_may_access() from a process with PR_NETWORK_OFF set
in current->network unless we're working on a socket which is already connected
or on a non-abstract AF_UNIX socket.

Signed-off-by: Michael Stone <michael@laptop.org>
---
  kernel/fork.c      |    2 ++
  kernel/ptrace.c    |    3 +++
  kernel/sys.c       |    2 +-
  net/socket.c       |   51 ++++++++++++++++++++++++++++++++++++++-------------
  net/unix/af_unix.c |   19 +++++++++++++++++++
  5 files changed, 63 insertions(+), 14 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 9bd9144..01a7644 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1130,6 +1130,8 @@ static struct task_struct *copy_process(unsigned long clone_flags,
  
  	p->bts = NULL;
  
+	p->network = current->network;
+
  	p->stack_start = stack_start;
  
  	/* Perform scheduler related setup. Assign this task to a CPU. */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 23bd09c..bcf87ba 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -151,6 +151,9 @@ int __ptrace_may_access(struct task_struct *task, unsigned int mode)
  		dumpable = get_dumpable(task->mm);
  	if (!dumpable && !capable(CAP_SYS_PTRACE))
  		return -EPERM;
+	/* does current have networking restrictions not shared by task? */
+	if (current->network & ~task->network)
+		return -EPERM;
  
  	return security_ptrace_access_check(task, mode);
  }
diff --git a/kernel/sys.c b/kernel/sys.c
index 411a2ff..481fa9c 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1601,7 +1601,7 @@ long prctl_set_network(unsigned long network_flags)
  
  	/* only dropping access is permitted */
  	ret = -EPERM;
-        if (current->network & ~network_flags)
+	if (current->network & ~network_flags)
  		goto out;
  
  	ret = -EINVAL;
diff --git a/net/socket.c b/net/socket.c
index b94c3dd..e59f906 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -87,6 +87,7 @@
  #include <linux/wireless.h>
  #include <linux/nsproxy.h>
  #include <linux/magic.h>
+#include <linux/sched.h>
  
  #include <asm/uaccess.h>
  #include <asm/unistd.h>
@@ -576,6 +577,12 @@ static inline int __sock_sendmsg(struct kiocb *iocb, struct socket *sock,
  	if (err)
  		return err;
  
+	err = -EPERM;
+	if (sock->sk->sk_family != AF_UNIX &&
+		current->network &&
+		(msg->msg_name != NULL || msg->msg_namelen != 0))
+		return err;
+
  	return sock->ops->sendmsg(iocb, sock, msg, size);
  }
  
@@ -1227,6 +1234,9 @@ static int __sock_create(struct net *net, int family, int type, int protocol,
  	if (err)
  		return err;
  
+	if (family != AF_UNIX && current->network)
+		return -EPERM;
+
  	/*
  	 *	Allocate the socket and allow the family to set things up. if
  	 *	the protocol is 0, the family is instructed to select an appropriate
@@ -1465,19 +1475,29 @@ SYSCALL_DEFINE3(bind, int, fd, struct sockaddr __user *, umyaddr, int, addrlen)
  	int err, fput_needed;
  
  	sock = sockfd_lookup_light(fd, &err, &fput_needed);
-	if (sock) {
-		err = move_addr_to_kernel(umyaddr, addrlen, (struct sockaddr *)&address);
-		if (err >= 0) {
-			err = security_socket_bind(sock,
-						   (struct sockaddr *)&address,
-						   addrlen);
-			if (!err)
-				err = sock->ops->bind(sock,
-						      (struct sockaddr *)
-						      &address, addrlen);
-		}
-		fput_light(sock->file, fput_needed);
-	}
+	if (!sock)
+		goto out;
+
+	err = move_addr_to_kernel(umyaddr, addrlen, (struct sockaddr *)&address);
+	if (err < 0)
+		goto out_fput;
+
+	err = security_socket_bind(sock,
+				   (struct sockaddr *)&address,
+				   addrlen);
+	if (err)
+		goto out_fput;
+
+	err = (((struct sockaddr *)&address)->sa_family != AF_UNIX &&
+		current->network) ? -EPERM : 0;
+	if (err)
+		goto out_fput;
+
+	err = sock->ops->bind(sock, (struct sockaddr *) &address, addrlen);
+
+out_fput:
+	fput_light(sock->file, fput_needed);
+out:
  	return err;
  }
  
@@ -1639,6 +1659,11 @@ SYSCALL_DEFINE3(connect, int, fd, struct sockaddr __user *, uservaddr,
  	if (err)
  		goto out_put;
  
+	err = (((struct sockaddr *)&address)->sa_family != AF_UNIX &&
+		current->network) ? -EPERM : 0;
+	if (err)
+		goto out_put;
+
  	err = sock->ops->connect(sock, (struct sockaddr *)&address, addrlen,
  				 sock->file->f_flags);
  out_put:
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f255119..5087ae3 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -797,6 +797,10 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
  		goto out;
  	addr_len = err;
  
+	err = (current->network && !sunaddr->sun_path[0]) ? -EPERM : 0;
+	if (err)
+		goto out;
+
  	mutex_lock(&u->readlock);
  
  	err = -EINVAL;
@@ -934,6 +938,10 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
  			goto out;
  		alen = err;
  
+		err = (current->network && !sunaddr->sun_path[0]) ? -EPERM : 0;
+		if (err)
+			goto out;
+
  		if (test_bit(SOCK_PASSCRED, &sock->flags) &&
  		    !unix_sk(sk)->addr && (err = unix_autobind(sock)) != 0)
  			goto out;
@@ -1033,6 +1041,10 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
  		goto out;
  	addr_len = err;
  
+	err = (current->network && !sunaddr->sun_path[0]) ? -EPERM : 0;
+	if (err)
+		goto out;
+
  	if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr &&
  	    (err = unix_autobind(sock)) != 0)
  		goto out;
@@ -1370,6 +1382,10 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
  		if (err < 0)
  			goto out;
  		namelen = err;
+
+		err = -EPERM;
+		if (current->network && !sunaddr->sun_path[0])
+			goto out;
  	} else {
  		sunaddr = NULL;
  		err = -ENOTCONN;
@@ -1520,6 +1536,9 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
  	if (msg->msg_namelen) {
  		err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
  		goto out_err;
+		/* prctl(PR_SET_NETWORK) requires no change here since
+		 * connection-less unix stream sockets are not supported.
+		 * See Documentation/prctl/network.txt for details. */
  	} else {
  		sunaddr = NULL;
  		err = -ENOTCONN;
-- 
1.6.6.rc1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 3/3] Security: Document prctl(PR_{GET,SET}_NETWORK). (v2)
  2009-12-18  3:00                       ` Michael Stone
  2009-12-18  3:29                         ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2) Michael Stone
  2009-12-18  3:31                         ` [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v2) Michael Stone
@ 2009-12-18  3:32                         ` Michael Stone
  2009-12-18 17:49                         ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Stephen Hemminger
  2009-12-20 17:53                         ` Mark Seaborn
  4 siblings, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-18  3:32 UTC (permalink / raw)
  To: Mark Seaborn
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang,
	Michael Stone

Explain the purpose, interface, and semantics of the
prctl(PR_{GET,SET}_NETWORK) facility.

Also reference some example userland clients.

Signed-off-by: Michael Stone <michael@laptop.org>
---
  Documentation/prctl/network.txt |   72 +++++++++++++++++++++++++++++++++++++++
  1 files changed, 72 insertions(+), 0 deletions(-)
  create mode 100644 Documentation/prctl/network.txt

diff --git a/Documentation/prctl/network.txt b/Documentation/prctl/network.txt
new file mode 100644
index 0000000..b337722
--- /dev/null
+++ b/Documentation/prctl/network.txt
@@ -0,0 +1,72 @@
+Purpose
+-------
+
+Daniel Bernstein has observed [1] that security-conscious userland processes
+may benefit from the ability to irrevocably remove their ability to create,
+bind, connect to, or send messages except in the case of previously connected
+sockets or AF_UNIX filesystem sockets.
+
+This facility is particularly attractive to security platforms like OLPC
+Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4] because:
+
+  * it integrates well with standard techniques for writing privilege-separated
+    Unix programs
+
+  * it integrates well with the need to perform limited socket I/O, e.g., when
+    running X clients
+
+  * it's available to unprivileged programs
+
+  * it's a discretionary feature available to all of distributors,
+    administrators, authors, and users
+
+  * its effect is entirely local, rather than global (like netfilter)
+
+  * it's simple enough to have some hope of being used correctly
+
+Implementation
+--------------
+
+After considering implementations based on the Linux Security Module (LSM)
+framework, on SELinux in particular, on network namespaces (CLONE_NEWNET), and
+on direct modification of the kernel syscall and task_struct APIs, we came to
+the conclusion that the best way to implement this feature was to extend the
+prctl() framework with a new pair of options named PR_{GET,SET}_NETWORK. These
+options cause prctl() to read or modify "current->network".
+
+Semantics
+---------
+
+current->network is a flags field which is preserved across all variants of
+fork() and exec().
+
+Writes which attempt to clear bits in current->network return -EPERM.
+
+The default value for current->network is named PR_NETWORK_ON and is defined
+to be 0.
+
+Presently, only one flag is defined: PR_NETWORK_OFF.
+
+More flags may be defined in the future if they become needed.
+
+Attempts to set undefined flags result in -EINVAL.
+
+When PR_NETWORK_OFF is set, implementations of syscalls which may be used by
+the current process to perform autonomous networking will return -EPERM. For
+example, calls to socket(), bind(), connect(), sendmsg(), and ptrace() will
+return -EPERM except when we are manipulating an AF_UNIX socket whose name
+does not begin with \0 or, in the case of sendmsg(), when we are manipulating
+a previously connected socket, i.e. one with
+
+  msg.msg_name == NULL && msg.msg_namelen == 0
+
+or, in the case of ptrace(), when we are ptrace()ing a process which has all
+of our own networking restriction flags set.
+
+References
+----------
+
+[1]: http://cr.yp.to/unix/disablenetwork.html
+[2]: http://wiki.laptop.org/go/OLPC_Bitfrost
+[3]: http://wiki.laptop.org/go/Rainbow
+[4]: http://plash.beasts.org/
-- 
1.6.6.rc1
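
The flag-update rule described under "Semantics" above can be modelled in
userland. What follows is a minimal sketch, not the kernel code:
model_set_network() is a hypothetical helper that applies the documented
checks to a plain variable, and the MODEL_* constants are assumed values
(one defined flag, equal to 1) chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed values for illustration; the real ones live in linux/prctl.h. */
#define MODEL_NETWORK_OFF       1UL
#define MODEL_NETWORK_ALL_FLAGS 1UL

/*
 * Hypothetical userland model of the PR_SET_NETWORK rule: restriction
 * flags may be added but never cleared, and undefined flags are rejected.
 */
long model_set_network(unsigned long *cur, unsigned long new_flags)
{
	assert(cur != NULL);

	/* Clearing an already-set bit would restore access: refuse. */
	if (*cur & ~new_flags)
		return -EPERM;

	/* Reject bits for which no flag is defined. */
	if (new_flags & ~MODEL_NETWORK_ALL_FLAGS)
		return -EINVAL;

	*cur = new_flags;
	return 0;
}
```

Under this model, setting MODEL_NETWORK_OFF succeeds once and is idempotent,
while a later attempt to write 0 fails with -EPERM.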

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v2)
  2009-12-18  3:31                         ` [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v2) Michael Stone
@ 2009-12-18  3:57                           ` Eric W. Biederman
  0 siblings, 0 replies; 54+ messages in thread
From: Eric W. Biederman @ 2009-12-18  3:57 UTC (permalink / raw)
  To: Michael Stone
  Cc: Mark Seaborn, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Bernie Innocenti, Randy Dunlap,
	Américo Wang

Michael Stone <michael@laptop.org> writes:

> Return -EPERM any time we try to __sock_create(), sys_connect(), sys_bind(),
> sys_sendmsg(), or __ptrace_may_access() from a process with PR_NETWORK_OFF set
> in current->network unless we're working on a socket which is already connected
> or on a non-abstract AF_UNIX socket.

It appears to me that the current security hooks are sufficient for what
you are doing.

The one true security module business prevents you from actually using the
security hooks, but could you create wrappers for the network security
hooks so the logic of the network stack does not need to change?

At the very least, the separation between the test for AF_UNIX and
the test to see if it is an anonymous AF_UNIX socket is pretty
large.  Structuring the code in such a way as to keep that together would
be nice.

Eric
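
One way to read the suggestion above is to fold the address-family test and
the abstract-socket test into a single predicate that every hook calls. The
following is a rough userland sketch of that idea, not code from the patch:
net_restricted() and its parameters are hypothetical, and it assumes that a
restricted process may still use AF_UNIX sockets whose sun_path does not
begin with a NUL byte.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical predicate: returns -EPERM if a process whose
 * disable-network flags are `flags` may not use address `addr`,
 * and 0 otherwise.
 */
int net_restricted(unsigned long flags, const struct sockaddr *addr)
{
	const struct sockaddr_un *sun;

	assert(addr != NULL);

	/* Unrestricted processes may use any address. */
	if (!flags)
		return 0;

	/* Everything except AF_UNIX is off limits. */
	if (addr->sa_family != AF_UNIX)
		return -EPERM;

	/* AF_UNIX names starting with NUL are not filesystem sockets. */
	sun = (const struct sockaddr_un *)addr;
	if (sun->sun_path[0] == '\0')
		return -EPERM;

	return 0;
}
```

Keeping both tests in one function means the bind, connect, and sendmsg
paths cannot drift apart in how they treat abstract names.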

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2)
  2009-12-18  3:29                         ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2) Michael Stone
@ 2009-12-18  4:43                           ` Valdis.Kletnieks
  2009-12-18 15:46                           ` Alan Cox
  1 sibling, 0 replies; 54+ messages in thread
From: Valdis.Kletnieks @ 2009-12-18  4:43 UTC (permalink / raw)
  To: Michael Stone
  Cc: Mark Seaborn, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Bryan Donlan, Evgeniy Polyakov, C. Scott Ananian, James Morris,
	Eric W. Biederman, Bernie Innocenti, Randy Dunlap,
	Américo Wang

[-- Attachment #1: Type: text/plain, Size: 850 bytes --]

On Thu, 17 Dec 2009 22:29:57 EST, Michael Stone said:
> Daniel Bernstein has observed [1] that security-conscious userland processes
> may benefit from the ability to irrevocably remove their ability to create,
> bind, connect to, or send messages except in the case of previously connected
> sockets or AF_UNIX filesystem sockets. We provide this facility by implementing
> support for a new prctl(PR_SET_NETWORK) flag named PR_NETWORK_OFF.

Dan does indeed have a point - but is this better achieved via either
the already-existing LSM interfaces (opening the stacking-LSM can of worms
again), or the SECCOMP framework?  We already have 2 other ways to turn off
stuff, do we really want a third way?

Alternatively, could a more generalized prctl interface be leveraged to handle
SECCOMP, and/or other targeted things that want to stack with LSM?


[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2)
  2009-12-18  3:29                         ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2) Michael Stone
  2009-12-18  4:43                           ` Valdis.Kletnieks
@ 2009-12-18 15:46                           ` Alan Cox
  2009-12-18 16:33                             ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) Michael Stone
  1 sibling, 1 reply; 54+ messages in thread
From: Alan Cox @ 2009-12-18 15:46 UTC (permalink / raw)
  To: Michael Stone
  Cc: Mark Seaborn, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Randy Dunlap, Américo Wang, Michael Stone

On Thu, 17 Dec 2009 22:29:57 -0500
Michael Stone <michael@laptop.org> wrote:

> Daniel Bernstein has observed [1] that security-conscious userland processes

Dan Bernstein has observed many things .. ;)

> may benefit from the ability to irrevocably remove their ability to create,
> bind, connect to, or send messages except in the case of previously connected
> sockets or AF_UNIX filesystem sockets. We provide this facility by implementing
> support for a new prctl(PR_SET_NETWORK) flag named PR_NETWORK_OFF.

This is a security model, it belongs as a security model using LSM. You
can already do it with SELinux and the like as far as I can see but
that's not to say you shouldn't submit it also as a small handy
standalone security module for people who don't want to load the big
security modules.

Otherwise you end up putting crap in fast paths that nobody needs but
everyone pays for and weird tests and hacks for address family and like
into core network code.

The fact the patches look utterly ugly should be telling you something -
which is that you are using the wrong hammer

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK)
  2009-12-18 15:46                           ` Alan Cox
@ 2009-12-18 16:33                             ` Michael Stone
  2009-12-18 17:20                               ` Alan Cox
                                                 ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-18 16:33 UTC (permalink / raw)
  To: Alan Cox
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

Alan Cox wrote:

> This is a security model, it belongs as a security model using LSM. 

I'll see what I can cook up for you.

However, please don't be surprised when the resulting cover letter states that
the LSM-based version *does not* resolve the situation to my satisfaction as a
userland hacker due to the well-known and long-standing adoption and
compositionality problems facing small LSMs. ;)

Regards,

Michael

P.S. - Dan is cited in my patch because I wish to honor him for anticipating my
desires early, clearly, and in writing. However, if you know of an earlier
citation, then I'll be happy to include that one too.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK)
  2009-12-18 16:33                             ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) Michael Stone
@ 2009-12-18 17:20                               ` Alan Cox
  2009-12-18 17:47                                 ` Eric W. Biederman
  2009-12-24  1:42                               ` [PATCH 0/3] Discarding networking privilege via LSM Michael Stone
  2009-12-25 17:09                               ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) Pavel Machek
  2 siblings, 1 reply; 54+ messages in thread
From: Alan Cox @ 2009-12-18 17:20 UTC (permalink / raw)
  To: Michael Stone
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

> the LSM-based version *does not* resolve the situation to my satisfaction as a
> userland hacker due to the well-known and long-standing adoption and
> compositionality problems facing small LSMs. ;)

For things like Fedora it's probably an "interesting idea, perhaps we
should do it using SELinux" sort of problem, but a config option for a
magic network prctl is also going to be hard to adopt without producing a
good use case - and avoiding that by dumping crap into everyone's kernel
fast paths isn't a good idea either.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK)
  2009-12-18 17:20                               ` Alan Cox
@ 2009-12-18 17:47                                 ` Eric W. Biederman
  2009-12-24  6:13                                   ` Michael Stone
  0 siblings, 1 reply; 54+ messages in thread
From: Eric W. Biederman @ 2009-12-18 17:47 UTC (permalink / raw)
  To: Michael Stone
  Cc: linux-kernel, Alan Cox, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Bernie Innocenti, Mark Seaborn,
	Randy Dunlap, Américo Wang

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> the LSM-based version *does not* resolve the situation to my satisfaction as a
>> userland hacker due to the well-known and long-standing adoption and
>> compositionality problems facing small LSMs. ;)
>
> For things like Fedora it's probably an "interesting idea, perhaps we
> should do it using SELinux" sort of problem, but a config option for a
> magic network prctl is also going to be hard to adopt without producing a
> good use case - and avoiding that by dumping crap into everyone's kernel
> fast paths isn't a good idea either.

If I understand the problem, the goal is to disable access to ipc
mechanisms that don't have the usual unix permissions, to get
something that is usable for non-root processes, and to get something
that is widely deployed so you don't have to jump through hoops in
end user applications to use it.

We have widely deployed mechanisms that are what you want or nearly
what you want already in the form of the various namespaces built for
containers.

I propose you introduce a permanent disable of executing suid 
applications. 

After which point it is another trivial patch to allow unsharing of
the network namespace if executing suid applications is disabled.

Eric

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-18  3:00                       ` Michael Stone
                                           ` (2 preceding siblings ...)
  2009-12-18  3:32                         ` [PATCH 3/3] Security: Document prctl(PR_{GET,SET}_NETWORK). (v2) Michael Stone
@ 2009-12-18 17:49                         ` Stephen Hemminger
  2009-12-19 12:02                           ` David Wagner
  2009-12-20 17:53                         ` Mark Seaborn
  4 siblings, 1 reply; 54+ messages in thread
From: Stephen Hemminger @ 2009-12-18 17:49 UTC (permalink / raw)
  To: Michael Stone
  Cc: Mark Seaborn, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Randy Dunlap, Américo Wang, Michael Stone

On Thu, 17 Dec 2009 22:00:57 -0500
Michael Stone <michael@laptop.org> wrote:

>    5. Linux today has pretty good support for controlling the creation of
>       channels involving the filesystem and involving shared daemons. It has
>       mediocre support for access control involving sysv-ipc mechanisms. It has
>       terrible support for access control involving non-local principals like
>       "the collection of people and programs receiving packets sent to
>       destination 18.0.0.1:80 from source 192.168.0.3:34661".

The policy control for this is done today on linux via the firewalling infrastructure.
It is not clear to me that moving over to the security infrastructure is an overall
gain from the security or user interface perspective.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-18 17:49                         ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Stephen Hemminger
@ 2009-12-19 12:02                           ` David Wagner
  2009-12-19 12:29                             ` Alan Cox
  0 siblings, 1 reply; 54+ messages in thread
From: David Wagner @ 2009-12-19 12:02 UTC (permalink / raw)
  To: linux-kernel

Stephen Hemminger  wrote:
>Michael Stone <michael@laptop.org> wrote:
>>    5. Linux today has pretty good support for controlling the creation of
>>       channels involving the filesystem and involving shared daemons. It has
>>       mediocre support for access control involving sysv-ipc mechanisms. It has
>>       terrible support for access control involving non-local principals like
>>       "the collection of people and programs receiving packets sent to
>>       destination 18.0.0.1:80 from source 192.168.0.3:34661".
>
>The policy control for this is done today on linux via the firewalling
>infrastructure.

I don't know of any reasonable way to introduce firewall rules
that apply only to a specific process; nor do I know of any way
for a user-level (non-root) process to specify and apply such
rules.  So it doesn't sound to me like the firewalling infrastructure
meets the requirements for which this patch was introduced.  Or did
I miss something?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-19 12:02                           ` David Wagner
@ 2009-12-19 12:29                             ` Alan Cox
  0 siblings, 0 replies; 54+ messages in thread
From: Alan Cox @ 2009-12-19 12:29 UTC (permalink / raw)
  To: David Wagner; +Cc: linux-kernel

> I don't know of any reasonable way to introduce firewall rules
> that apply only to a specific process; nor do I know of any way
> for a user-level (non-root) process to specify and apply such
> rules.  So it doesn't sound to me like the firewalling infrastructure
> meets the requirements for which this patch was introduced.  Or did
> I miss something?

You can push BPF style filters onto sockets in Linux. They are not just
tied to some arbitrary capture device. A process imposing its own isn't
too hard - imposing them on another process (or user) gets more complex
but containers can probably do what is needed nowadays.

(We also have other weirdness like AX.25 where the mac address depends on
the user id of course)

Alan

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface.
  2009-12-18  3:00                       ` Michael Stone
                                           ` (3 preceding siblings ...)
  2009-12-18 17:49                         ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Stephen Hemminger
@ 2009-12-20 17:53                         ` Mark Seaborn
  4 siblings, 0 replies; 54+ messages in thread
From: Mark Seaborn @ 2009-12-20 17:53 UTC (permalink / raw)
  To: Michael Stone
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Randy Dunlap, Américo Wang

On Fri, Dec 18, 2009 at 3:00 AM, Michael Stone <michael@laptop.org> wrote:

> @Eric, Mark: regarding ptrace()-ing from network-disabled processes: I agree
> that this functionality is critical and I have altered the
> __ptrace_may_access() check to support it.
> The new rule I propose is equivalent to the rule I used in ptrace_set_network()
> and is similar to the rule that Eric proposed earlier this afternoon. I now
> propose:
>
>  "You may ptrace() any process that has all the network restrictions you do."
>
> This should take care of your use of strace without bending anything else into
> an unnatural shape.

I am in two minds about this.  On the one hand, it adds the
flexibility that I asked for.  On the other hand, it is a more
complicated rule to have fixed in the kernel.

It still seems wrong to me that the disable-networking flag should
affect ptrace() at all.

The reason is that the disable-networking flag is not useful on its
own.  Anyone who uses it will use it in combination with some other
authority-limiting mechanism.  They will already have a story for how
to prevent sandboxed processes from interfering with other processes
via ptrace(), kill(), writing to ~/.bashrc, etc.  There's no point in
disabling network access for a process if it has full access to your
home directory and can cause programs to be run with your full
authority as a user.

So if there is already a way to control access to ptrace(), we
shouldn't add another check to the kernel's access control rules.
They are complicated enough already.
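
For concreteness, the quoted rule ("You may ptrace() any process that has
all the network restrictions you do") reduces to a subset test on the two
flag words. A minimal sketch, with may_ptrace_net() as a hypothetical name
rather than the function in the patch:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True when every restriction bit set on the tracer is also set on the
 * tracee, so tracing cannot be used to regain network access.
 */
bool may_ptrace_net(unsigned long tracer_flags, unsigned long tracee_flags)
{
	return (tracer_flags & ~tracee_flags) == 0;
}
```

An unrestricted tracer passes this test for any tracee, which is why strace
run from an ordinary shell is unaffected.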


On ad-hocness: I am very much in favour of providing unprivileged
mechanisms for switching off sources of ambient authority.  But it
does not seem very useful to provide an unprivileged mechanism to
switch off network access if there is no unprivileged mechanism for
switching off access to the filesystem namespace, which is usually a
more important source of authority.  Maybe we should solve both
problems?

Cheers,
Mark

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 0/3] Discarding networking privilege via LSM
  2009-12-18 16:33                             ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) Michael Stone
  2009-12-18 17:20                               ` Alan Cox
@ 2009-12-24  1:42                               ` Michael Stone
  2009-12-24  1:44                                 ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3) Michael Stone
                                                   ` (2 more replies)
  2009-12-25 17:09                               ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) Pavel Machek
  2 siblings, 3 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-24  1:42 UTC (permalink / raw)
  To: Alan Cox
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

Alan,

As you requested, here's a (rough) draft of my patch series which uses the
security_* hooks instead of direct modification of the networking functions. 

Have you further suggestions for improvement?

Regards,

Michael

P.S. - The most notable behavioral difference between this patch and the
previous one is that abstract unix sockets are exempted from control in this
patch but are restricted by the previous one. We can revisit this detail in
subsequent patches if this approach seems viable.

Michael Stone (3):
   Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3)
   Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v3)
   Security: Document prctl(PR_{GET,SET}_NETWORK). (v3)

  Documentation/prctl/network.txt |   74 ++++++++++++++++++++++++++
  include/linux/prctl.h           |    7 +++
  include/linux/prctl_network.h   |    7 +++
  include/linux/sched.h           |    2 +
  kernel/sys.c                    |   32 +++++++++++
  security/Kconfig                |   13 +++++
  security/Makefile               |    1 +
  security/prctl_network.c        |  110 +++++++++++++++++++++++++++++++++++++++
  8 files changed, 246 insertions(+), 0 deletions(-)
  create mode 100644 Documentation/prctl/network.txt
  create mode 100644 include/linux/prctl_network.h
  create mode 100644 security/prctl_network.c

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3)
  2009-12-24  1:42                               ` [PATCH 0/3] Discarding networking privilege via LSM Michael Stone
@ 2009-12-24  1:44                                 ` Michael Stone
  2009-12-24  4:38                                   ` Samir Bellabes
  2009-12-24  1:45                                 ` [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v3) Michael Stone
  2009-12-24  1:45                                 ` [PATCH 3/3] Security: Document prctl(PR_{GET,SET}_NETWORK). (v3) Michael Stone
  2 siblings, 1 reply; 54+ messages in thread
From: Michael Stone @ 2009-12-24  1:44 UTC (permalink / raw)
  To: Alan Cox
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

Daniel Bernstein has observed [1] that security-conscious userland processes
may benefit from the ability to irrevocably remove their ability to create,
bind, connect to, or send messages except in the case of previously connected
sockets or AF_UNIX filesystem sockets. We provide this facility via a new
prctl option-pair (PR_SET_NETWORK, PR_GET_NETWORK) and a new
prctl(PR_SET_NETWORK) flag named PR_NETWORK_OFF.

This facility is particularly attractive to security platforms like OLPC
Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4].

[1]: http://cr.yp.to/unix/disablenetwork.html
[2]: http://wiki.laptop.org/go/OLPC_Bitfrost
[3]: http://wiki.laptop.org/go/Rainbow
[4]: http://plash.beasts.org/

Signed-off-by: Michael Stone <michael@laptop.org>
---
  include/linux/prctl.h         |    7 +++++++
  include/linux/prctl_network.h |    7 +++++++
  include/linux/sched.h         |    2 ++
  kernel/sys.c                  |   32 ++++++++++++++++++++++++++++++++
  4 files changed, 48 insertions(+), 0 deletions(-)
  create mode 100644 include/linux/prctl_network.h

diff --git a/include/linux/prctl.h b/include/linux/prctl.h
index a3baeb2..4eb4110 100644
--- a/include/linux/prctl.h
+++ b/include/linux/prctl.h
@@ -102,4 +102,11 @@
  
  #define PR_MCE_KILL_GET 34
  
+/* Get/set process disable-network flags */
+#define PR_SET_NETWORK	35
+#define PR_GET_NETWORK	36
+# define PR_NETWORK_ON        0
+# define PR_NETWORK_OFF       1
+# define PR_NETWORK_ALL_FLAGS 1
+
  #endif /* _LINUX_PRCTL_H */
diff --git a/include/linux/prctl_network.h b/include/linux/prctl_network.h
new file mode 100644
index 0000000..2db83eb
--- /dev/null
+++ b/include/linux/prctl_network.h
@@ -0,0 +1,7 @@
+#ifndef _LINUX_PRCTL_NETWORK_H
+#define _LINUX_PRCTL_NETWORK_H
+
+extern long prctl_get_network(void);
+extern long prctl_set_network(unsigned long);
+
+#endif /* _LINUX_PRCTL_NETWORK_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f2f842d..0c65c55 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1402,6 +1402,8 @@ struct task_struct {
  	unsigned int sessionid;
  #endif
  	seccomp_t seccomp;
+/* Flags for limiting networking via prctl(PR_SET_NETWORK). */
+	unsigned long network;
  
  /* Thread group tracking */
     	u32 parent_exec_id;
diff --git a/kernel/sys.c b/kernel/sys.c
index 26a6b73..e7d345c 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -35,6 +35,7 @@
  #include <linux/cpu.h>
  #include <linux/ptrace.h>
  #include <linux/fs_struct.h>
+#include <linux/prctl_network.h>
  
  #include <linux/compat.h>
  #include <linux/syscalls.h>
@@ -1578,6 +1579,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
  			else
  				error = PR_MCE_KILL_DEFAULT;
  			break;
+		case PR_SET_NETWORK:
+			error = prctl_set_network(arg2);
+			break;
+		case PR_GET_NETWORK:
+			error = prctl_get_network();
+			break;
  		default:
  			error = -EINVAL;
  			break;
@@ -1585,6 +1592,31 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
  	return error;
  }
  
+long prctl_get_network(void)
+{
+	return current->network;
+}
+
+long prctl_set_network(unsigned long network_flags)
+{
+	long ret;
+
+	/* only dropping access is permitted */
+	ret = -EPERM;
+	if (current->network & ~network_flags)
+		goto out;
+
+	ret = -EINVAL;
+	if (network_flags & ~PR_NETWORK_ALL_FLAGS)
+		goto out;
+
+	current->network = network_flags;
+	ret = 0;
+
+out:
+	return ret;
+}
+
  SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep,
  		struct getcpu_cache __user *, unused)
  {
-- 
1.6.6.rc1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v3)
  2009-12-24  1:42                               ` [PATCH 0/3] Discarding networking privilege via LSM Michael Stone
  2009-12-24  1:44                                 ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3) Michael Stone
@ 2009-12-24  1:45                                 ` Michael Stone
  2009-12-24  1:45                                 ` [PATCH 3/3] Security: Document prctl(PR_{GET,SET}_NETWORK). (v3) Michael Stone
  2 siblings, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-24  1:45 UTC (permalink / raw)
  To: Alan Cox
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

Implement security_* hooks for socket_create, socket_bind, socket_connect,
socket_sendmsg, and ptrace_access_check which return -EPERM when called from a
process with networking restrictions. Exempt AF_UNIX sockets.

Signed-off-by: Michael Stone <michael@laptop.org>
---
  security/Kconfig         |   13 +++++
  security/Makefile        |    1 +
  security/prctl_network.c |  110 ++++++++++++++++++++++++++++++++++++++++++++++
  3 files changed, 124 insertions(+), 0 deletions(-)
  create mode 100644 security/prctl_network.c

diff --git a/security/Kconfig b/security/Kconfig
index 226b955..740a7fe 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -137,6 +137,19 @@ config LSM_MMAP_MIN_ADDR
  	  this low address space will need the permission specific to the
  	  systems running LSM.
  
+config SECURITY_PRCTL_NETWORK
+	tristate "prctl(PR_{GET,SET}_NETWORK) support"
+	depends on SECURITY_NETWORK
+	help
+	  This enables processes to drop networking privileges via
+	  prctl(PR_SET_NETWORK, PR_NETWORK_OFF), which is used by OLPC's isolation
+	  shell <http://wiki.laptop.org/go/Rainbow> to implement discretionary
+	  network isolation.
+
+	  See Documentation/prctl/network.txt for more information about this LSM.
+
+	  If you are unsure how to answer this question, answer N.
+
  source security/selinux/Kconfig
  source security/smack/Kconfig
  source security/tomoyo/Kconfig
diff --git a/security/Makefile b/security/Makefile
index da20a19..92ce65d 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_SECURITY_SMACK)		+= smack/built-in.o
  obj-$(CONFIG_AUDIT)			+= lsm_audit.o
  obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/built-in.o
  obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
+obj-$(CONFIG_SECURITY_PRCTL_NETWORK)	+= prctl_network.o
  
  # Object integrity file lists
  subdir-$(CONFIG_IMA)			+= integrity/ima
diff --git a/security/prctl_network.c b/security/prctl_network.c
new file mode 100644
index 0000000..2da6051
--- /dev/null
+++ b/security/prctl_network.c
@@ -0,0 +1,110 @@
+/*
+ * prctl_network LSM.
+ *
+ * Copyright (C) 2008-2009 Michael Stone <michael@laptop.org>
+ * Based on sample code from security/root_plug.c, (C) 2002 Greg Kroah-Hartman.
+ *
+ * Implements the prctl(PR_SET_NETWORK, PR_NETWORK_OFF) syscall.
+ *
+ * See Documentation/prctl/network.txt for more information about this code.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/sched.h>
+#include <net/sock.h>
+#include <linux/socket.h>
+#include <linux/security.h>
+
+static inline int maybe_allow(void)
+{
+	if (current->network)
+		return -EPERM;
+	return 0;
+}
+
+static inline int prctl_network_socket_create_hook(int family, int type,
+						int protocol, int kern)
+{
+	if (family == AF_UNIX)
+		return 0;
+	return maybe_allow();
+}
+
+static inline int prctl_network_socket_bind_hook(struct socket *sock,
+					     struct sockaddr *address,
+					     int addrlen)
+{
+	if (address->sa_family == AF_UNIX)
+		return 0;
+	return maybe_allow();
+}
+
+static inline int prctl_network_socket_connect_hook(struct socket *sock,
+						struct sockaddr *address,
+						int addrlen)
+{
+	if (address->sa_family == AF_UNIX)
+		return 0;
+	return maybe_allow();
+}
+
+static inline int prctl_network_socket_sendmsg_hook(struct socket *sock,
+						struct msghdr *msg, int size)
+{
+	if (sock->sk->sk_family != AF_UNIX &&
+	    current->network &&
+	    (msg->msg_name != NULL || msg->msg_namelen != 0))
+		return -EPERM;
+	return 0;
+}
+
+static inline int prctl_network_ptrace_access_check_hook(struct task_struct *child, unsigned int mode)
+{
+	/* does current have networking restrictions not shared by child? */
+	if (current->network & ~child->network)
+		return -EPERM;
+	return 0;
+}
+
+/* static inline int prctl_network_ptrace_traceme(struct task_struct *parent) ? */
+
+static struct security_operations prctl_network_security_ops = {
+	.name                   = "prctl_net",
+	.socket_create		= prctl_network_socket_create_hook,
+	.socket_bind		= prctl_network_socket_bind_hook,
+	.socket_connect		= prctl_network_socket_connect_hook,
+	.socket_sendmsg		= prctl_network_socket_sendmsg_hook,
+	.ptrace_access_check	= prctl_network_ptrace_access_check_hook,
+};
+
+static int __init prctl_network_security_init(void)
+{
+	if (!security_module_enable(&prctl_network_security_ops)) {
+		printk(KERN_INFO
+			"Failure enabling prctl_network_lsm.\n");
+		return 0;
+	}
+
+	/* register ourselves with the security framework */
+	if (register_security(&prctl_network_security_ops)) {
+		printk(KERN_INFO
+			"Failure registering prctl_network_lsm with the kernel\n");
+		return 0;
+	}
+
+	printk(KERN_INFO "prctl_network_lsm initialized\n");
+
+	return 0;
+}
+
+security_initcall(prctl_network_security_init);
+
+MODULE_DESCRIPTION("prctl_network LSM; implementing prctl(PR_SET_NETWORK, PR_NETWORK_OFF).");
+MODULE_LICENSE("GPL");
-- 
1.6.6.rc1
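
For readers tracing the hook logic in the patch above, the decisions it makes can be modelled in plain userspace C. This is only a sketch under stated assumptions, not kernel code: the model_* names are hypothetical, PR_NETWORK_OFF is assumed to have the value 1, and an unsigned long argument stands in for current->network.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Assumed value; the real constant is defined by patch 1/3. */
#define PR_NETWORK_OFF 1UL

/* socket(): AF_UNIX is always allowed; otherwise any restriction denies. */
static int model_socket_create(unsigned long task_flags, int family)
{
	if (family == AF_UNIX)
		return 0;
	return task_flags ? -EPERM : 0;
}

/* sendmsg(): only messages carrying an explicit destination are refused. */
static int model_sendmsg(unsigned long task_flags, int sk_family,
			 const void *msg_name, size_t msg_namelen)
{
	if (sk_family != AF_UNIX && task_flags &&
	    (msg_name != NULL || msg_namelen != 0))
		return -EPERM;
	return 0;
}

/* ptrace(): refuse if the tracer has a restriction the tracee lacks. */
static int model_ptrace_access(unsigned long tracer_flags,
			       unsigned long tracee_flags)
{
	return (tracer_flags & ~tracee_flags) ? -EPERM : 0;
}

/* Hypothetical walk-through of the cases described in the commit message. */
static void model_hooks_demo(void)
{
	/* A restricted task cannot create an AF_INET socket... */
	assert(model_socket_create(PR_NETWORK_OFF, AF_INET) == -EPERM);
	/* ...but can still create an AF_UNIX one. */
	assert(model_socket_create(PR_NETWORK_OFF, AF_UNIX) == 0);
	/* Sending on an already connected socket (no address) is allowed. */
	assert(model_sendmsg(PR_NETWORK_OFF, AF_INET, NULL, 0) == 0);
	/* A restricted tracer may not attach to an unrestricted tracee. */
	assert(model_ptrace_access(PR_NETWORK_OFF, 0) == -EPERM);
}
```

Note that in this model, as in the patch, a restricted tracer may still trace a tracee that carries the same restriction, and an unrestricted tracer is never refused by this check.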

* [PATCH 3/3] Security: Document prctl(PR_{GET,SET}_NETWORK). (v3)
  2009-12-24  1:42                               ` [PATCH 0/3] Discarding networking privilege via LSM Michael Stone
  2009-12-24  1:44                                 ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3) Michael Stone
  2009-12-24  1:45                                 ` [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v3) Michael Stone
@ 2009-12-24  1:45                                 ` Michael Stone
  2 siblings, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-24  1:45 UTC (permalink / raw)
  To: Alan Cox
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

Explain the purpose, interface, and semantics of the
prctl(PR_{GET,SET}_NETWORK) facility and LSM.

Also reference some example userland clients.

Signed-off-by: Michael Stone <michael@laptop.org>
---
  Documentation/prctl/network.txt |   74 +++++++++++++++++++++++++++++++++++++++
  1 files changed, 74 insertions(+), 0 deletions(-)
  create mode 100644 Documentation/prctl/network.txt

diff --git a/Documentation/prctl/network.txt b/Documentation/prctl/network.txt
new file mode 100644
index 0000000..8b45d23
--- /dev/null
+++ b/Documentation/prctl/network.txt
@@ -0,0 +1,74 @@
+Purpose
+-------
+
+Daniel Bernstein has observed [1] that security-conscious userland processes
+may benefit from being able to irrevocably give up the ability to create, bind,
+or connect sockets and to send messages, except on previously connected
+sockets or AF_UNIX filesystem sockets.
+
+This facility is particularly attractive to security platforms like OLPC
+Bitfrost [2] and to isolation programs like Rainbow [3] and Plash [4] because:
+
+  * it integrates well with standard techniques for writing privilege-separated
+    Unix programs
+
+  * it integrates well with the need to perform limited socket I/O, e.g., when
+    running X clients
+
+  * it's available to unprivileged programs
+
+  * it's a discretionary feature available to distributors, administrators,
+    authors, and users alike
+
+  * its effect is entirely local, rather than global (like netfilter)
+
+  * it's simple enough to have some hope of being used correctly
+
+Implementation
+--------------
+
+After considering implementations based on the Linux Security Module (LSM)
+framework, on SELinux, on network namespaces (CLONE_NEWNET), on direct
+modification of the kernel syscall and task_struct APIs and after seeking
+advice from members of LKML, we came to the conclusion that the best way to
+implement this feature was to extend the prctl() framework with a new pair of
+options named PR_{GET,SET}_NETWORK and to write an LSM to implement the
+resulting PR_NETWORK_OFF semantics. These options cause prctl() to read or
+modify "current->network".
+
+Semantics
+---------
+
+current->network is a flags field which is preserved across all variants of
+fork() and exec().
+
+Writes which attempt to clear bits in current->network return -EPERM.
+
+The default value for current->network is named PR_NETWORK_ON and is defined
+to be 0.
+
+Presently, only one flag is defined: PR_NETWORK_OFF.
+
+More flags may be defined in the future if they become needed.
+
+Attempts to set undefined flags result in -EINVAL.
+
+When PR_NETWORK_OFF is set, implementations of syscalls which may be used by
+the current process to perform autonomous networking will return -EPERM. For
+example, calls to socket(), bind(), connect(), sendmsg(), and ptrace() will
+return -EPERM except when we are manipulating an AF_UNIX socket or, in the
+case of sendmsg(), when we are manipulating a previously connected socket,
+i.e. one with
+
+  msg.msg_name == NULL && msg.msg_namelen == 0
+
+or, in the case of ptrace(), when we are tracing a process which has all
+of our own networking restriction flags set.
+
+References
+----------
+
+[1]: http://cr.yp.to/unix/disablenetwork.html
+[2]: http://wiki.laptop.org/go/OLPC_Bitfrost
+[3]: http://wiki.laptop.org/go/Rainbow
+[4]: http://plash.beasts.org/
-- 
1.6.6.rc1
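
The one-way flag semantics documented above can be illustrated with a small userspace model. This is a sketch only, assuming PR_NETWORK_OFF is 1 and PR_NETWORK_ALL_FLAGS equals PR_NETWORK_OFF (the real values come from patch 1/3); model_set_network is a hypothetical name that mirrors the checks in prctl_set_network(), with a caller-supplied variable standing in for current->network.

```c
#include <assert.h>
#include <errno.h>

#define PR_NETWORK_ON        0UL
#define PR_NETWORK_OFF       1UL            /* assumed value */
#define PR_NETWORK_ALL_FLAGS PR_NETWORK_OFF /* assumed: only one flag */

/* Mirrors the checks in prctl_set_network(): flags can be added, never cleared. */
static long model_set_network(unsigned long *task_flags,
			      unsigned long new_flags)
{
	/* Clearing a bit that is currently set is refused. */
	if (*task_flags & ~new_flags)
		return -EPERM;
	/* Undefined flags are rejected. */
	if (new_flags & ~PR_NETWORK_ALL_FLAGS)
		return -EINVAL;
	*task_flags = new_flags;
	return 0;
}

/* Hypothetical walk-through of the documented semantics. */
static void model_set_network_demo(void)
{
	unsigned long flags = PR_NETWORK_ON;

	assert(model_set_network(&flags, 2UL) == -EINVAL);           /* unknown flag */
	assert(model_set_network(&flags, PR_NETWORK_OFF) == 0);      /* drop access */
	assert(model_set_network(&flags, PR_NETWORK_OFF) == 0);      /* idempotent */
	assert(model_set_network(&flags, PR_NETWORK_ON) == -EPERM);  /* no way back */
	assert(flags == PR_NETWORK_OFF);
}
```

Because the -EPERM check runs first, a restricted task that passes an undefined flag without PR_NETWORK_OFF gets -EPERM rather than -EINVAL in this model.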

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3)
  2009-12-24  1:44                                 ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3) Michael Stone
@ 2009-12-24  4:38                                   ` Samir Bellabes
  2009-12-24  5:44                                     ` Michael Stone
  2009-12-24  5:51                                     ` Tetsuo Handa
  0 siblings, 2 replies; 54+ messages in thread
From: Samir Bellabes @ 2009-12-24  4:38 UTC (permalink / raw)
  To: Michael Stone
  Cc: Alan Cox, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

Michael Stone <michael@laptop.org> writes:

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index f2f842d..0c65c55 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1402,6 +1402,8 @@ struct task_struct {
>   	unsigned int sessionid;
>   #endif
>   	seccomp_t seccomp;
> +/* Flags for limiting networking via prctl(PR_SET_NETWORK). */
> +  unsigned long network;
>   
>   /* Thread group tracking */
>      	u32 parent_exec_id;

I think this is unnecessary. As an LSM module, you should use the
void *security member of struct cred.

This member allows you to mark a task_struct as you wish; it is a kind of
abstraction provided to all security modules.

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3)
  2009-12-24  4:38                                   ` Samir Bellabes
@ 2009-12-24  5:44                                     ` Michael Stone
  2009-12-24  5:51                                     ` Tetsuo Handa
  1 sibling, 0 replies; 54+ messages in thread
From: Michael Stone @ 2009-12-24  5:44 UTC (permalink / raw)
  To: Samir Bellabes
  Cc: Michael Stone, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

> I think this is unnecessary. As an LSM module, you should use the
> void *security member of struct cred.

The change you propose is easily made but I'm having trouble seeing how making
it would help my purpose: the field you name is already in use by other parts
of the kernel which my functionality is intended to complement.

That being said, I'd be very happy to prepare a version of the patch using the
strategy you suggest if it would be directly useful to you or if you can show
me how it would contribute to my goals.

Regards, and thanks for your comment,

Michael

P.S. - Perhaps a reasonable alternative would be to make the definition of the
field conditional on CONFIG_SECURITY_PRCTL_NETWORK?

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3)
  2009-12-24  4:38                                   ` Samir Bellabes
  2009-12-24  5:44                                     ` Michael Stone
@ 2009-12-24  5:51                                     ` Tetsuo Handa
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2009-12-24  5:51 UTC (permalink / raw)
  To: sam
  Cc: alan, linux-kernel, netdev, linux-security-module, andi, david,
	socketcan, herbert, Valdis.Kletnieks, bdonlan, zbr, cscott,
	jmorris, ebiederm, bernie, mrs, randy.dunlap, xiyou.wangcong,
	michael

Samir Bellabes wrote:
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index f2f842d..0c65c55 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1402,6 +1402,8 @@ struct task_struct {
> >   	unsigned int sessionid;
> >   #endif
> >   	seccomp_t seccomp;
> > +/* Flags for limiting networking via prctl(PR_SET_NETWORK). */
> > +  unsigned long network;
> >   
> >   /* Thread group tracking */
> >      	u32 parent_exec_id;
> 
> I think this is unnecessary. As an LSM module, you should use the
> void *security member of struct cred.
> 
> This member allows you to mark a task_struct as you wish; it is a kind of
> abstraction provided to all security modules.

I want to use a per-task_struct variable. Since cred is copy-on-write, we have
to use kmalloc()/kfree() whenever we modify a variable in cred. That introduces
unwanted error paths (i.e. memory allocation failure) and overhead.

An old version of TOMOYO had a similar mechanism that allowed userland programs
to disable specific operations (disable chroot(), disable execve(), disable
mount(), etc.; this is different from POSIX capabilities).
I added "unsigned int dropped_capability;" to task_struct to implement it.
Adding variables to task_struct makes it possible to avoid error paths.
I prefer adding "void *security;" to task_struct, which is duplicated upon
fork() and released upon exit().

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK)
  2009-12-18 17:47                                 ` Eric W. Biederman
@ 2009-12-24  6:13                                   ` Michael Stone
  2009-12-24 12:37                                     ` Eric W. Biederman
  0 siblings, 1 reply; 54+ messages in thread
From: Michael Stone @ 2009-12-24  6:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang,
	Michael Stone

> Eric Biederman writes:
>> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
>>> Michael Stone writes:
>>>> the LSM-based version *does not* resolve the situation to my satisfaction as a
>>>> userland hacker due to the well-known and long-standing adoption and
>>>> compositionality problems facing small LSMs. ;)
>>>
>>> For things like Fedora it's probably an "interesting idea, perhaps we
>>> should do it using SELinux" sort of problem, but a config option for a
>>> magic network prctl is also going to be hard to adopt without producing a
>>> good use case - and avoiding that by dumping crap into everyones kernel
>>> fast paths isn't a good idea either.
>
>If I understand the problem, the goal is to disable access to ipc
>mechanisms that don't have the usual unix permissions.  To get
>something that is usable for non-root processes, and to get something
>that is widely deployed so you don't have to jump through hoops in
>end user applications to use it.

Eric,

You understand correctly. Thank you for this cogent restatement.

>We have widely deployed mechanisms that are what you want or nearly
>what you want already in the form of the various namespaces built for
>containers.

It's true that your work is closer to what I want than anything else that I've
seen so far...

>I propose you introduce a permanent disable of executing suid 
>applications. 

I'm open to the idea but I don't understand the need that motivates it yet.

Could you please explain further? (or point me to an existing explanation?)

>After which point it is another trivial patch to allow unsharing of
>the network namespace if executing suid applications are disabled.

How do you propose to address the problem with the Unix sockets?

Regards,

Michael

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK)
  2009-12-24  6:13                                   ` Michael Stone
@ 2009-12-24 12:37                                     ` Eric W. Biederman
  0 siblings, 0 replies; 54+ messages in thread
From: Eric W. Biederman @ 2009-12-24 12:37 UTC (permalink / raw)
  To: Michael Stone
  Cc: linux-kernel, netdev, linux-security-module, Andi Kleen,
	David Lang, Oliver Hartkopp, Alan Cox, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Bernie Innocenti, Mark Seaborn,
	Randy Dunlap, Américo Wang

Michael Stone <michael@laptop.org> writes:

>> Eric Biederman writes:
>>> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
>>>> Michael Stone writes:
>>>>> the LSM-based version *does not* resolve the situation to my satisfaction as a
>>>>> userland hacker due to the well-known and long-standing adoption and
>>>>> compositionality problems facing small LSMs. ;)
>>>>
>>>> For things like Fedora it's probably an "interesting idea, perhaps we
>>>> should do it using SELinux" sort of problem, but a config option for a
>>>> magic network prctl is also going to be hard to adopt without producing a
>>>> good use case - and avoiding that by dumping crap into everyones kernel
>>>> fast paths isn't a good idea either.
>>
>>If I understand the problem, the goal is to disable access to ipc
>>mechanisms that don't have the usual unix permissions.  To get
>>something that is usable for non-root processes, and to get something
>>that is widely deployed so you don't have to jump through hoops in
>>end user applications to use it.
>
> Eric,
>
> You understand correctly. Thank you for this cogent restatement.
>
>>We have widely deployed mechanisms that are what you want or nearly
>>what you want already in the form of the various namespaces built for
>>containers.
>
> It's true that your work is closer to what I want than anything else that I've
> seen so far...
>
>> I propose you introduce a permanent disable of executing suid applications. 
>
> I'm open to the idea but I don't understand the need that motivates it yet.
> Could you please explain further? (or point me to an existing explanation?)

With namespaces it is possible to masquerade as a trusted source
of information to a suid program, such as /etc/passwd or a NIS server.

A one-way removal of the ability to exec suid programs is generally
simple and handy like chroot, and removes the need for CAP_SYS_ADMIN
in most cases.

Plan 9 did not support suid executables and supported an unprivileged
equivalent of unshare(NEWNS).

I need the full unprivileged unshare of USERNS for my primary
uses today as I need to perform normally root only activities
like mounting loopback devices, and setting up networking. Your
uses of limiting of ipc do not appear to require that.

>>After which point it is another trivial patch to allow unsharing of
>>the network namespace if executing suid applications are disabled.
>
> How do you propose to address the problem with the Unix sockets?

Careful code review of the patch to allow talking between network
namespaces with unix domain sockets.  This is a feature that we
simply have not merged yet.  Semantically it is fine today.  It is
simply that no one has answered the question of what other implications
there could be.  Now that I know of 2 or 3 compelling use cases, and
with most of the rest of the work done, it seems time to relax the
restriction.

Eric
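
For context on the namespace alternative discussed above, a process asks for a private network namespace with unshare(CLONE_NEWNET). The following is a minimal sketch, assuming a Linux system with glibc; try_unshare_netns is a hypothetical helper name. On kernels of this era the call is expected to fail with EPERM for callers without CAP_SYS_ADMIN, which is the restriction Eric's suid proposal aims to relax, and it may fail with EINVAL if the kernel lacks network namespace support.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 on success or a negative errno value on failure. */
static int try_unshare_netns(void)
{
	if (unshare(CLONE_NEWNET) == -1)
		return -errno;
	return 0;
}

/* Hypothetical demo: report whether the caller was allowed to unshare. */
static void unshare_netns_demo(void)
{
	int rc = try_unshare_netns();

	assert(rc <= 0); /* the helper returns 0 or -errno, never positive */
	if (rc == 0)
		printf("now running in a new network namespace\n");
	else
		printf("unshare(CLONE_NEWNET) failed: %s\n", strerror(-rc));
}
```

Unlike the prctl() model in this patchset, a successful unshare() gives the process a fresh, empty set of network devices rather than refusing individual socket operations.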

* Re: [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK)
  2009-12-18 16:33                             ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) Michael Stone
  2009-12-18 17:20                               ` Alan Cox
  2009-12-24  1:42                               ` [PATCH 0/3] Discarding networking privilege via LSM Michael Stone
@ 2009-12-25 17:09                               ` Pavel Machek
  2 siblings, 0 replies; 54+ messages in thread
From: Pavel Machek @ 2009-12-25 17:09 UTC (permalink / raw)
  To: Michael Stone
  Cc: Alan Cox, linux-kernel, netdev, linux-security-module,
	Andi Kleen, David Lang, Oliver Hartkopp, Herbert Xu,
	Valdis Kletnieks, Bryan Donlan, Evgeniy Polyakov,
	C. Scott Ananian, James Morris, Eric W. Biederman,
	Bernie Innocenti, Mark Seaborn, Randy Dunlap, Américo Wang

On Fri 2009-12-18 11:33:48, Michael Stone wrote:
> Alan Cox wrote:
>
>> This is a security model, it belongs as a security model using LSM. 
>
> I'll see what I can cook up for you.
>
> However, please don't be surprised when the resulting cover letter states that
> the LSM-based version *does not* resolve the situation to my satisfaction as a
> userland hacker due to the well-known and long-standing adoption and
> compositionality problems facing small LSMs. ;)

Maybe it is time to fix the LSM? This excuse is much too common...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

end of thread, other threads:[~2009-12-25 17:09 UTC | newest]

Thread overview: 54+ messages
2009-12-13  3:19 Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
2009-12-13  3:26 ` [PATCH] Security: Implement RLIMIT_NETWORK Michael Stone
2009-12-13  3:30 ` [PATCH] Security: Document RLIMIT_NETWORK Michael Stone
2009-12-13  3:44 ` Network isolation with RLIMIT_NETWORK, cont'd Michael Stone
2009-12-13  5:09   ` setrlimit(RLIMIT_NETWORK) vs. prctl(???) Michael Stone
2009-12-13  5:20     ` Ulrich Drepper
2009-12-15  5:33       ` Michael Stone
2009-12-16 15:30         ` Michael Stone
2009-12-16 15:32           ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Michael Stone
2009-12-16 15:59             ` Andi Kleen
2009-12-17  1:25               ` Michael Stone
2009-12-17  8:52                 ` Andi Kleen
     [not found]                 ` <fb69ef3c0912170906t291a37c4r6c4758ddc7dd300b@mail.gmail.com>
2009-12-17 17:14                   ` Andi Kleen
2009-12-17 22:58                     ` Mark Seaborn
2009-12-18  3:00                       ` Michael Stone
2009-12-18  3:29                         ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v2) Michael Stone
2009-12-18  4:43                           ` Valdis.Kletnieks
2009-12-18 15:46                           ` Alan Cox
2009-12-18 16:33                             ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) Michael Stone
2009-12-18 17:20                               ` Alan Cox
2009-12-18 17:47                                 ` Eric W. Biederman
2009-12-24  6:13                                   ` Michael Stone
2009-12-24 12:37                                     ` Eric W. Biederman
2009-12-24  1:42                               ` [PATCH 0/3] Discarding networking privilege via LSM Michael Stone
2009-12-24  1:44                                 ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) interface. (v3) Michael Stone
2009-12-24  4:38                                   ` Samir Bellabes
2009-12-24  5:44                                     ` Michael Stone
2009-12-24  5:51                                     ` Tetsuo Handa
2009-12-24  1:45                                 ` [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v3) Michael Stone
2009-12-24  1:45                                 ` [PATCH 3/3] Security: Document prctl(PR_{GET,SET}_NETWORK). (v3) Michael Stone
2009-12-25 17:09                               ` [PATCH 1/3] Security: Add prctl(PR_{GET,SET}_NETWORK) Pavel Machek
2009-12-18  3:31                         ` [PATCH 2/3] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics. (v2) Michael Stone
2009-12-18  3:57                           ` Eric W. Biederman
2009-12-18  3:32                         ` [PATCH 3/3] Security: Document prctl(PR_{GET,SET}_NETWORK). (v2) Michael Stone
2009-12-18 17:49                         ` [PATCH] Security: Add prctl(PR_{GET,SET}_NETWORK) interface Stephen Hemminger
2009-12-19 12:02                           ` David Wagner
2009-12-19 12:29                             ` Alan Cox
2009-12-20 17:53                         ` Mark Seaborn
2009-12-17  9:25             ` Américo Wang
2009-12-17 16:28               ` Michael Stone
2009-12-17 17:23             ` Randy Dunlap
2009-12-17 17:25               ` Randy Dunlap
2009-12-16 15:32           ` [PATCH] Security: Implement prctl(PR_SET_NETWORK, PR_NETWORK_OFF) semantics Michael Stone
2009-12-17 19:18             ` Eric W. Biederman
2009-12-16 15:32           ` [PATCH] Security: Document prctl(PR_{GET,SET}_NETWORK) Michael Stone
2009-12-13  8:32   ` Network isolation with RLIMIT_NETWORK, cont'd Rémi Denis-Courmont
2009-12-13 13:44     ` Michael Stone
2009-12-13 10:05   ` Eric W. Biederman
2009-12-13 14:21     ` Michael Stone
     [not found]       ` <fb69ef3c0912170931l5cbf0e3dh81c88e6502651042@mail.gmail.com>
2009-12-17 18:24         ` Bryan Donlan
2009-12-17 19:35           ` Bernie Innocenti
2009-12-17 19:53             ` Bryan Donlan
2009-12-17 19:23         ` Bernie Innocenti
2009-12-17 17:52     ` Andi Kleen
