All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2/2] taskstats: restrict access to user
@ 2011-06-24 12:09 Vasiliy Kulikov
  2011-06-29  1:27 ` Balbir Singh
  2011-06-29 20:09 ` [Security] " Linus Torvalds
  0 siblings, 2 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-06-24 12:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Balbir Singh, Andrew Morton, Al Viro, David Rientjes,
	Stephen Wilson, KOSAKI Motohiro, security, Eric Paris,
	Solar Designer

taskstats information may be used for gathering private information.
E.g. for openssh and vsftpd daemons read_characters/write_characters may
be used to learn the precise password length.  Restrict it to processes
being able to ptrace the target process.

For TASKSTATS_CMD_ATTR_REGISTER_CPUMASK the fix is euid check instead of
a ptrace check as the handler is processed in the context of the target
process, not the listener process'.  When ptrace_task_may_access_current()
is introduced, it should be used instead of euid check.  Currently there
is a small race when a process temporarily changes its euid (e.g. to
access user's files), until the process sets euid back user's processes
may gather privileged process' statistics.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
---
 kernel/taskstats.c |   23 ++++++++++++++++++++++-
 1 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 9ffea36..d92c95a 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -27,6 +27,7 @@
 #include <linux/cgroup.h>
 #include <linux/fs.h>
 #include <linux/file.h>
+#include <linux/ptrace.h>
 #include <net/genetlink.h>
 #include <asm/atomic.h>
 
@@ -132,6 +133,8 @@ static void send_cpu_listeners(struct sk_buff *skb,
 	struct sk_buff *skb_next, *skb_cur = skb;
 	void *reply = genlmsg_data(genlhdr);
 	int rc, delcount = 0;
+	const struct cred *cred = current_cred();
+	struct task_struct *task;
 
 	rc = genlmsg_end(skb, reply);
 	if (rc < 0) {
@@ -142,6 +145,15 @@ static void send_cpu_listeners(struct sk_buff *skb,
 	rc = 0;
 	down_read(&listeners->sem);
 	list_for_each_entry(s, &listeners->list, list) {
+
+		rcu_read_lock();
+		task = find_task_by_vpid(s->pid);
+		if (!task || __task_cred(task)->euid != cred->euid) {
+			rcu_read_unlock();
+			continue;
+		}
+		rcu_read_unlock();
+
 		skb_next = NULL;
 		if (!list_is_last(&s->list, &listeners->list)) {
 			skb_next = skb_clone(skb_cur, GFP_KERNEL);
@@ -199,14 +211,19 @@ static void fill_stats(struct task_struct *tsk, struct taskstats *stats)
 static int fill_stats_for_pid(pid_t pid, struct taskstats *stats)
 {
 	struct task_struct *tsk;
+	int rc = -ESRCH;
 
 	rcu_read_lock();
 	tsk = find_task_by_vpid(pid);
+	if (tsk && !ptrace_may_access(tsk, PTRACE_MODE_READ)) {
+		tsk = NULL;
+		rc = -EACCES;
+	}
 	if (tsk)
 		get_task_struct(tsk);
 	rcu_read_unlock();
 	if (!tsk)
-		return -ESRCH;
+		return rc;
 	fill_stats(tsk, stats);
 	put_task_struct(tsk);
 	return 0;
@@ -224,6 +241,10 @@ static int fill_stats_for_tgid(pid_t tgid, struct taskstats *stats)
 	 */
 	rcu_read_lock();
 	first = find_task_by_vpid(tgid);
+	if (first && !ptrace_may_access(first, PTRACE_MODE_READ)) {
+		rc = -EACCES;
+		goto out;
+	}
 
 	if (!first || !lock_task_sighand(first, &flags))
 		goto out;
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-06-24 12:09 [PATCH 2/2] taskstats: restrict access to user Vasiliy Kulikov
@ 2011-06-29  1:27 ` Balbir Singh
  2011-06-29 11:42   ` Vasiliy Kulikov
  2011-06-29 20:17   ` Vasiliy Kulikov
  2011-06-29 20:09 ` [Security] " Linus Torvalds
  1 sibling, 2 replies; 49+ messages in thread
From: Balbir Singh @ 2011-06-29  1:27 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: linux-kernel, Balbir Singh, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, Solar Designer

On Fri, Jun 24, 2011 at 5:39 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> taskstats information may be used for gathering private information.
> E.g. for openssh and vsftpd daemons read_characters/write_characters may
> be used to learn the precise password length.  Restrict it to processes
> being able to ptrace the target process.
>
> For TASKSTATS_CMD_ATTR_REGISTER_CPUMASK the fix is euid check instead of
> a ptrace check as the handler is processed in the context of the target
> process, not the listener process'.  When ptrace_task_may_access_current()
> is introduced, it should be used instead of euid check.  Currently there
> is a small race when a process temporarily changes its euid (e.g. to
> access user's files), until the process sets euid back user's processes
> may gather privileged process' statistics.
>
> Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
> ---
>  kernel/taskstats.c |   23 ++++++++++++++++++++++-
>  1 files changed, 22 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 9ffea36..d92c95a 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -27,6 +27,7 @@
>  #include <linux/cgroup.h>
>  #include <linux/fs.h>
>  #include <linux/file.h>
> +#include <linux/ptrace.h>
>  #include <net/genetlink.h>
>  #include <asm/atomic.h>
>
> @@ -132,6 +133,8 @@ static void send_cpu_listeners(struct sk_buff *skb,
>        struct sk_buff *skb_next, *skb_cur = skb;
>        void *reply = genlmsg_data(genlhdr);
>        int rc, delcount = 0;
> +       const struct cred *cred = current_cred();
> +       struct task_struct *task;
>
>        rc = genlmsg_end(skb, reply);
>        if (rc < 0) {
> @@ -142,6 +145,15 @@ static void send_cpu_listeners(struct sk_buff *skb,
>        rc = 0;
>        down_read(&listeners->sem);

Why not grab RCU lock here

>        list_for_each_entry(s, &listeners->list, list) {
> +
> +               rcu_read_lock();

You'll end up grabbing RCU read lock too often here, do you need to?

> +               task = find_task_by_vpid(s->pid);
> +               if (!task || __task_cred(task)->euid != cred->euid) {
> +                       rcu_read_unlock();
> +                       continue;
> +               }
> +               rcu_read_unlock();
> +

Release the lock prior to up_read()

>                skb_next = NULL;
>                if (!list_is_last(&s->list, &listeners->list)) {
>                        skb_next = skb_clone(skb_cur, GFP_KERNEL);
> @@ -199,14 +211,19 @@ static void fill_stats(struct task_struct *tsk, struct taskstats *stats)
>  static int fill_stats_for_pid(pid_t pid, struct taskstats *stats)
>  {
>        struct task_struct *tsk;
> +       int rc = -ESRCH;
>
>        rcu_read_lock();
>        tsk = find_task_by_vpid(pid);
> +       if (tsk && !ptrace_may_access(tsk, PTRACE_MODE_READ)) {
> +               tsk = NULL;
> +               rc = -EACCES;
> +       }
>        if (tsk)
>                get_task_struct(tsk);
>        rcu_read_unlock();
>        if (!tsk)
> -               return -ESRCH;
> +               return rc;
>        fill_stats(tsk, stats);
>        put_task_struct(tsk);
>        return 0;
> @@ -224,6 +241,10 @@ static int fill_stats_for_tgid(pid_t tgid, struct taskstats *stats)
>         */
>        rcu_read_lock();
>        first = find_task_by_vpid(tgid);
> +       if (first && !ptrace_may_access(first, PTRACE_MODE_READ)) {
> +               rc = -EACCES;
> +               goto out;
> +       }
>
>        if (!first || !lock_task_sighand(first, &flags))
>                goto out;

Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-06-29  1:27 ` Balbir Singh
@ 2011-06-29 11:42   ` Vasiliy Kulikov
  2011-06-29 20:17   ` Vasiliy Kulikov
  1 sibling, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-06-29 11:42 UTC (permalink / raw)
  To: Balbir Singh
  Cc: linux-kernel, Balbir Singh, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, Solar Designer

On Wed, Jun 29, 2011 at 06:57 +0530, Balbir Singh wrote:
> On Fri, Jun 24, 2011 at 5:39 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > taskstats information may be used for gathering private information.
> > E.g. for openssh and vsftpd daemons read_characters/write_characters may
> > be used to learn the precise password length.  Restrict it to processes
> > being able to ptrace the target process.
> >
> > For TASKSTATS_CMD_ATTR_REGISTER_CPUMASK the fix is euid check instead of
> > a ptrace check as the handler is processed in the context of the target
> > process, not the listener process'.  When ptrace_task_may_access_current()
> > is introduced, it should be used instead of euid check.  Currently there
> > is a small race when a process temporarily changes its euid (e.g. to
> > access user's files), until the process sets euid back user's processes
> > may gather privileged process' statistics.
> >
> > Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
> > ---
> >  kernel/taskstats.c |   23 ++++++++++++++++++++++-
> >  1 files changed, 22 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index 9ffea36..d92c95a 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -27,6 +27,7 @@
> >  #include <linux/cgroup.h>
> >  #include <linux/fs.h>
> >  #include <linux/file.h>
> > +#include <linux/ptrace.h>
> >  #include <net/genetlink.h>
> >  #include <asm/atomic.h>
> >
> > @@ -132,6 +133,8 @@ static void send_cpu_listeners(struct sk_buff *skb,
> >        struct sk_buff *skb_next, *skb_cur = skb;
> >        void *reply = genlmsg_data(genlhdr);
> >        int rc, delcount = 0;
> > +       const struct cred *cred = current_cred();
> > +       struct task_struct *task;
> >
> >        rc = genlmsg_end(skb, reply);
> >        if (rc < 0) {
> > @@ -142,6 +145,15 @@ static void send_cpu_listeners(struct sk_buff *skb,
> >        rc = 0;
> >        down_read(&listeners->sem);
> 
> Why not grab RCU lock here

Yes, it makes sense.  I was thinking about not holding a lock for a too
long time, but it should be rather cheap anyway.


Thank you,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-06-24 12:09 [PATCH 2/2] taskstats: restrict access to user Vasiliy Kulikov
  2011-06-29  1:27 ` Balbir Singh
@ 2011-06-29 20:09 ` Linus Torvalds
  2011-06-30  7:57     ` [kernel-hardening] " Vasiliy Kulikov
  1 sibling, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2011-06-29 20:09 UTC (permalink / raw)
  To: Vasiliy Kulikov, Shailabh Nagar, Balbir Singh
  Cc: linux-kernel, security, Solar Designer, Eric Paris,
	Stephen Wilson, KOSAKI Motohiro, David Rientjes, Andrew Morton,
	Balbir Singh

On Fri, Jun 24, 2011 at 5:09 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> taskstats information may be used for gathering private information.
> E.g. for openssh and vsftpd daemons read_characters/write_characters may
> be used to learn the precise password length.  Restrict it to processes
> being able to ptrace the target process.

Ok, having looked at this some more, I'm quite ready to just mark the
whole TASKSTATS config option as BROKEN. It does seem to be a horrible
security hazard, and very little seems to use it.

It also seems to be really fundamentally broken. Afaik, there's no way
to filter taskstats not only by security issues (Vasiliy's patch
really is very ugly), but it seems to be some global cross-namespace
thing too, so it exposes taskstats across pid namespaces afaik. It
does that even with Vasiliy's patch, afaik, although then I think you
need to have collissions in the namespaces if I read the code
correctly.

I suspect that could be fixed by adding a pid namespace to the
'listener' structure. Also adding a 'cred' pointer (or the actual
listener thread pointer) to it would make Vasiliy's patch more
palatable, since then you wouldn't need to look up the credentials at
send_cpu_listeners() time.

Maybe I have mis-read the code. But it does all make me shudder. There
doesn't even seem to be all that many _users_ of the thing, so the
problems it has really makes me go "is that code worth it"? We
probably should never have merged it in the first place.

                         Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-06-29  1:27 ` Balbir Singh
  2011-06-29 11:42   ` Vasiliy Kulikov
@ 2011-06-29 20:17   ` Vasiliy Kulikov
  2011-07-02  7:36       ` [kernel-hardening] " Vasiliy Kulikov
  1 sibling, 1 reply; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-06-29 20:17 UTC (permalink / raw)
  To: Balbir Singh
  Cc: linux-kernel, Balbir Singh, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, Solar Designer

On Wed, Jun 29, 2011 at 06:57 +0530, Balbir Singh wrote:
> On Fri, Jun 24, 2011 at 5:39 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > +               task = find_task_by_vpid(s->pid);
> > +               if (!task || __task_cred(task)->euid != cred->euid) {

If consider this patch for inclusion, it also needs some check for
the listener root ability.  __task_cred(task)->euid == 0 or smth like
that.  But ptrace_task_may_access_current() is better.

> > +                       rcu_read_unlock();
> > +                       continue;
> > +               }
> > +               rcu_read_unlock();
> > +
> 
> Release the lock prior to up_read()
> 
> >                skb_next = NULL;
> >                if (!list_is_last(&s->list, &listeners->list)) {
> >                        skb_next = skb_clone(skb_cur, GFP_KERNEL);
> > @@ -199,14 +211,19 @@ static void fill_stats(struct task_struct *tsk, struct taskstats *stats)
> >  static int fill_stats_for_pid(pid_t pid, struct taskstats *stats)
> >  {
> >        struct task_struct *tsk;
> > +       int rc = -ESRCH;
> >
> >        rcu_read_lock();
> >        tsk = find_task_by_vpid(pid);
> > +       if (tsk && !ptrace_may_access(tsk, PTRACE_MODE_READ)) {
> > +               tsk = NULL;
> > +               rc = -EACCES;
> > +       }
> >        if (tsk)
> >                get_task_struct(tsk);
> >        rcu_read_unlock();
> >        if (!tsk)
> > -               return -ESRCH;
> > +               return rc;
> >        fill_stats(tsk, stats);
> >        put_task_struct(tsk);
> >        return 0;
> > @@ -224,6 +241,10 @@ static int fill_stats_for_tgid(pid_t tgid, struct taskstats *stats)
> >         */
> >        rcu_read_lock();
> >        first = find_task_by_vpid(tgid);
> > +       if (first && !ptrace_may_access(first, PTRACE_MODE_READ)) {
> > +               rc = -EACCES;
> > +               goto out;
> > +       }
> >
> >        if (!first || !lock_task_sighand(first, &flags))
> >                goto out;
> 
> Balbir Singh

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-06-29 20:09 ` [Security] " Linus Torvalds
@ 2011-06-30  7:57     ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-06-30  7:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shailabh Nagar, Balbir Singh, linux-kernel, security, Eric Paris,
	Stephen Wilson, KOSAKI Motohiro, David Rientjes, Andrew Morton,
	Balbir Singh, kernel-hardening

On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
> Ok, having looked at this some more, I'm quite ready to just mark the
> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
> security hazard, and very little seems to use it.
> 
> It also seems to be really fundamentally broken. Afaik, there's no way
> to filter taskstats not only by security issues (Vasiliy's patch
> really is very ugly), but it seems to be some global cross-namespace
> thing too, so it exposes taskstats across pid namespaces afaik.

The problem here is that it keeps a pid in a global list.  This list is
then browsed by all namespaces.  Looking into the code, I see 2 real
problems (didn't check, though):

1) If there are 2+ pid namespaces, one listener in pid_ns #1 and some
process dies in pid_ns #2, then the dying task wouldn't find the
listener via find_task_by_vpid() and would just delete the listener from
listeners list.  Looks like this problem was created by optimization
patch f9fd8914c1acca0.

2) Netlink sockets are per net namespace, but this accounting thing is per
pid namespace.  So, if the dying task and the listener are in a single
pid namespace, but in different net namespaces, the message will not be
sent via genlmsg_unicast().  I suspect it is a problem of all non-net
netlink sockets.


> It
> does that even with Vasiliy's patch, afaik, although then I think you
> need to have collissions in the namespaces if I read the code
> correctly.
> 
> I suspect that could be fixed by adding a pid namespace to the
> 'listener' structure. Also adding a 'cred' pointer (or the actual
> listener thread pointer) to it would make Vasiliy's patch more
> palatable, since then you wouldn't need to look up the credentials at
> send_cpu_listeners() time.
> 
> Maybe I have mis-read the code. But it does all make me shudder. There
> doesn't even seem to be all that many _users_ of the thing, so the
> problems it has really makes me go "is that code worth it"? We
> probably should never have merged it in the first place.

> but it seems to be some global cross-namespace
> thing too, so it exposes taskstats across pid namespaces afaik.


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-06-30  7:57     ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-06-30  7:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shailabh Nagar, Balbir Singh, linux-kernel, security, Eric Paris,
	Stephen Wilson, KOSAKI Motohiro, David Rientjes, Andrew Morton,
	Balbir Singh, kernel-hardening

On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
> Ok, having looked at this some more, I'm quite ready to just mark the
> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
> security hazard, and very little seems to use it.
> 
> It also seems to be really fundamentally broken. Afaik, there's no way
> to filter taskstats not only by security issues (Vasiliy's patch
> really is very ugly), but it seems to be some global cross-namespace
> thing too, so it exposes taskstats across pid namespaces afaik.

The problem here is that it keeps a pid in a global list.  This list is
then browsed by all namespaces.  Looking into the code, I see 2 real
problems (didn't check, though):

1) If there are 2+ pid namespaces, one listener in pid_ns #1 and some
process dies in pid_ns #2, then the dying task wouldn't find the
listener via find_task_by_vpid() and would just delete the listener from
listeners list.  Looks like this problem was created by optimization
patch f9fd8914c1acca0.

2) Netlink sockets are per net namespace, but this accounting thing is per
pid namespace.  So, if the dying task and the listener are in a single
pid namespace, but in different net namespaces, the message will not be
sent via genlmsg_unicast().  I suspect it is a problem of all non-net
netlink sockets.


> It
> does that even with Vasiliy's patch, afaik, although then I think you
> need to have collissions in the namespaces if I read the code
> correctly.
> 
> I suspect that could be fixed by adding a pid namespace to the
> 'listener' structure. Also adding a 'cred' pointer (or the actual
> listener thread pointer) to it would make Vasiliy's patch more
> palatable, since then you wouldn't need to look up the credentials at
> send_cpu_listeners() time.
> 
> Maybe I have mis-read the code. But it does all make me shudder. There
> doesn't even seem to be all that many _users_ of the thing, so the
> problems it has really makes me go "is that code worth it"? We
> probably should never have merged it in the first place.

> but it seems to be some global cross-namespace
> thing too, so it exposes taskstats across pid namespaces afaik.


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-06-30  7:57     ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-06-30 10:59       ` Balbir Singh
  -1 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-06-30 10:59 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
>> Ok, having looked at this some more, I'm quite ready to just mark the
>> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
>> security hazard, and very little seems to use it.
>>

I am not sure how you define BROKEN, BROKEN as per security rules?
/proc/$pid/xxx also expose similar information.

>> It also seems to be really fundamentally broken. Afaik, there's no way
>> to filter taskstats not only by security issues (Vasiliy's patch
>> really is very ugly), but it seems to be some global cross-namespace
>> thing too, so it exposes taskstats across pid namespaces afaik.
>
> The problem here is that it keeps a pid in a global list.  This list is
> then browsed by all namespaces.  Looking into the code, I see 2 real
> problems (didn't check, though):
>
> 1) If there are 2+ pid namespaces, one listener in pid_ns #1 and some
> process dies in pid_ns #2, then the dying task wouldn't find the
> listener via find_task_by_vpid() and would just delete the listener from
> listeners list.  Looks like this problem was created by optimization
> patch f9fd8914c1acca0.

I must admit, I did not pay much attention to pid namespace changes as
they were being made to taskstats. I'll look at the code later this
week.

>
> 2) Netlink sockets are per net namespace, but this accounting thing is per
> pid namespace.  So, if the dying task and the listener are in a single
> pid namespace, but in different net namespaces, the message will not be
> sent via genlmsg_unicast().  I suspect it is a problem of all non-net
> netlink sockets.
>

Good catch, under what circumstances would tasks have the same pid
namespace, but different net namespaces?

>
>> It
>> does that even with Vasiliy's patch, afaik, although then I think you
>> need to have collissions in the namespaces if I read the code
>> correctly.
>>
>> I suspect that could be fixed by adding a pid namespace to the
>> 'listener' structure. Also adding a 'cred' pointer (or the actual
>> listener thread pointer) to it would make Vasiliy's patch more
>> palatable, since then you wouldn't need to look up the credentials at
>> send_cpu_listeners() time.
>>

A simple work around could be to disable taskstats under network
namespace or even the current behaviour should just warn the user that
the network namespace of the listener is distinct and disallow the
listener

>> Maybe I have mis-read the code. But it does all make me shudder. There
>> doesn't even seem to be all that many _users_ of the thing, so the
>> problems it has really makes me go "is that code worth it"? We
>> probably should never have merged it in the first place.

iotop(), getdelays (in kernel) with many developers using it and many
more. The interface has been published for a while now

Thanks for the review,
Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-06-30 10:59       ` Balbir Singh
  0 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-06-30 10:59 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
>> Ok, having looked at this some more, I'm quite ready to just mark the
>> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
>> security hazard, and very little seems to use it.
>>

I am not sure how you define BROKEN, BROKEN as per security rules?
/proc/$pid/xxx also expose similar information.

>> It also seems to be really fundamentally broken. Afaik, there's no way
>> to filter taskstats not only by security issues (Vasiliy's patch
>> really is very ugly), but it seems to be some global cross-namespace
>> thing too, so it exposes taskstats across pid namespaces afaik.
>
> The problem here is that it keeps a pid in a global list.  This list is
> then browsed by all namespaces.  Looking into the code, I see 2 real
> problems (didn't check, though):
>
> 1) If there are 2+ pid namespaces, one listener in pid_ns #1 and some
> process dies in pid_ns #2, then the dying task wouldn't find the
> listener via find_task_by_vpid() and would just delete the listener from
> listeners list.  Looks like this problem was created by optimization
> patch f9fd8914c1acca0.

I must admit, I did not pay much attention to pid namespace changes as
they were being made to taskstats. I'll look at the code later this
week.

>
> 2) Netlink sockets are per net namespace, but this accounting thing is per
> pid namespace.  So, if the dying task and the listener are in a single
> pid namespace, but in different net namespaces, the message will not be
> sent via genlmsg_unicast().  I suspect it is a problem of all non-net
> netlink sockets.
>

Good catch, under what circumstances would tasks have the same pid
namespace, but different net namespaces?

>
>> It
>> does that even with Vasiliy's patch, afaik, although then I think you
>> need to have collissions in the namespaces if I read the code
>> correctly.
>>
>> I suspect that could be fixed by adding a pid namespace to the
>> 'listener' structure. Also adding a 'cred' pointer (or the actual
>> listener thread pointer) to it would make Vasiliy's patch more
>> palatable, since then you wouldn't need to look up the credentials at
>> send_cpu_listeners() time.
>>

A simple work around could be to disable taskstats under network
namespace or even the current behaviour should just warn the user that
the network namespace of the listener is distinct and disallow the
listener

>> Maybe I have mis-read the code. But it does all make me shudder. There
>> doesn't even seem to be all that many _users_ of the thing, so the
>> problems it has really makes me go "is that code worth it"? We
>> probably should never have merged it in the first place.

iotop(), getdelays (in kernel) with many developers using it and many
more. The interface has been published for a while now

Thanks for the review,
Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-06-30 10:59       ` [kernel-hardening] " Balbir Singh
@ 2011-06-30 12:08         ` Vasiliy Kulikov
  -1 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-06-30 12:08 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Thu, Jun 30, 2011 at 16:29 +0530, Balbir Singh wrote:
> On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
> >> Ok, having looked at this some more, I'm quite ready to just mark the
> >> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
> >> security hazard, and very little seems to use it.
> >>
> 
> I am not sure how you define BROKEN, BROKEN as per security rules?
> /proc/$pid/xxx also expose similar information.

I'm sure this information is also dangerous.


> >> It also seems to be really fundamentally broken. Afaik, there's no way
> >> to filter taskstats not only by security issues (Vasiliy's patch
> >> really is very ugly), but it seems to be some global cross-namespace
> >> thing too, so it exposes taskstats across pid namespaces afaik.
> >
> > The problem here is that it keeps a pid in a global list.  This list is
> > then browsed by all namespaces.  Looking into the code, I see 2 real
> > problems (didn't check, though):
> >
> > 1) If there are 2+ pid namespaces, one listener in pid_ns #1 and some
> > process dies in pid_ns #2, then the dying task wouldn't find the
> > listener via find_task_by_vpid() and would just delete the listener from
> > listeners list.  Looks like this problem was created by optimization
> > patch f9fd8914c1acca0.
> 
> I must admit, I did not pay much attention to pid namespace changes as
> they were being made to taskstats. I'll look at the code later this
> week.
> 
> >
> > 2) Netlink sockets are per net namespace, but this accounting thing is per
> > pid namespace.  So, if the dying task and the listener are in a single
> > pid namespace, but in different net namespaces, the message will not be
> > sent via genlmsg_unicast().  I suspect it is a problem of all non-net
> > netlink sockets.

This is not a problem of all non-net netlink sockets, but only of those
who finds a task by the pid.  AFAIU, the normal netlink policy is using
broadcast packets.


3) A process cannot own 2 taskstats sockets because the information
stored in listeners array doesn't make difference between different
sockets of the same task.  Both searches would stop on the first socket,
the first socket would receive 2 messages while the second would be
silent.  It was introduced by f9fd8914c1acca0.


> Good catch, under what circumstances would tasks have the same pid
> namespace, but different net namespaces?

Umm, if one did unshare() or clone() to unshare only net namespace?


> >> It
> >> does that even with Vasiliy's patch, afaik, although then I think you
> >> need to have collissions in the namespaces if I read the code
> >> correctly.
> >>
> >> I suspect that could be fixed by adding a pid namespace to the
> >> 'listener' structure. Also adding a 'cred' pointer (or the actual
> >> listener thread pointer) to it would make Vasiliy's patch more
> >> palatable, since then you wouldn't need to look up the credentials at
> >> send_cpu_listeners() time.
> >>
> 
> A simple work around could be to disable taskstats under network
> namespace or even the current behaviour should just warn the user that
> the network namespace of the listener is distinct and disallow the
> listener
> 
> >> Maybe I have mis-read the code. But it does all make me shudder. There
> >> doesn't even seem to be all that many _users_ of the thing, so the
> >> problems it has really makes me go "is that code worth it"? We
> >> probably should never have merged it in the first place.
> 
> iotop(), getdelays (in kernel) with many developers using it and many
> more. The interface has been published for a while now
> 
> Thanks for the review,
> Balbir Singh


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-06-30 12:08         ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-06-30 12:08 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Thu, Jun 30, 2011 at 16:29 +0530, Balbir Singh wrote:
> On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
> >> Ok, having looked at this some more, I'm quite ready to just mark the
> >> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
> >> security hazard, and very little seems to use it.
> >>
> 
> I am not sure how you define BROKEN, BROKEN as per security rules?
> /proc/$pid/xxx also expose similar information.

I'm sure this information is also dangerous.


> >> It also seems to be really fundamentally broken. Afaik, there's no way
> >> to filter taskstats not only by security issues (Vasiliy's patch
> >> really is very ugly), but it seems to be some global cross-namespace
> >> thing too, so it exposes taskstats across pid namespaces afaik.
> >
> > The problem here is that it keeps a pid in a global list.  This list is
> > then browsed by all namespaces.  Looking into the code, I see 2 real
> > problems (didn't check, though):
> >
> > 1) If there are 2+ pid namespaces, one listener in pid_ns #1 and some
> > process dies in pid_ns #2, then the dying task wouldn't find the
> > listener via find_task_by_vpid() and would just delete the listener from
> > listeners list.  Looks like this problem was created by optimization
> > patch f9fd8914c1acca0.
> 
> I must admit, I did not pay much attention to pid namespace changes as
> they were being made to taskstats. I'll look at the code later this
> week.
> 
> >
> > 2) Netlink sockets are per net namespace, but this accounting thing is per
> > pid namespace.  So, if the dying task and the listener are in a single
> > pid namespace, but in different net namespaces, the message will not be
> > sent via genlmsg_unicast().  I suspect it is a problem of all non-net
> > netlink sockets.

This is not a problem of all non-net netlink sockets, but only of those
who finds a task by the pid.  AFAIU, the normal netlink policy is using
broadcast packets.


3) A process cannot own 2 taskstats sockets because the information
stored in listeners array doesn't make difference between different
sockets of the same task.  Both searches would stop on the first socket,
the first socket would receive 2 messages while the second would be
silent.  It was introduced by f9fd8914c1acca0.


> Good catch, under what circumstances would tasks have the same pid
> namespace, but different net namespaces?

Umm, if one did unshare() or clone() to unshare only net namespace?


> >> It
> >> does that even with Vasiliy's patch, afaik, although then I think you
> >> need to have collissions in the namespaces if I read the code
> >> correctly.
> >>
> >> I suspect that could be fixed by adding a pid namespace to the
> >> 'listener' structure. Also adding a 'cred' pointer (or the actual
> >> listener thread pointer) to it would make Vasiliy's patch more
> >> palatable, since then you wouldn't need to look up the credentials at
> >> send_cpu_listeners() time.
> >>
> 
> A simple work around could be to disable taskstats under network
> namespace or even the current behaviour should just warn the user that
> the network namespace of the listener is distinct and disallow the
> listener
> 
> >> Maybe I have mis-read the code. But it does all make me shudder. There
> >> doesn't even seem to be all that many _users_ of the thing, so the
> >> problems it has really makes me go "is that code worth it"? We
> >> probably should never have merged it in the first place.
> 
> iotop(), getdelays (in kernel) with many developers using it and many
> more. The interface has been published for a while now
> 
> Thanks for the review,
> Balbir Singh


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-06-30 10:59       ` [kernel-hardening] " Balbir Singh
@ 2011-06-30 16:40         ` Linus Torvalds
  -1 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2011-06-30 16:40 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Vasiliy Kulikov, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 2259 bytes --]

On Thu, Jun 30, 2011 at 3:59 AM, Balbir Singh <bsingharora@gmail.com> wrote:
> On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>> On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
>>> Ok, having looked at this some more, I'm quite ready to just mark the
>>> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
>>> security hazard, and very little seems to use it.
>>>
>
> I am not sure how you define BROKEN, BROKEN as per security rules?
> /proc/$pid/xxx also expose similar information.

No it doesn't.

/proc/$pid/xyz has always had proper security checks. Now, sometimes
they've been a bit too lax (iow, we've had cases where we just allowed
things we shouldn't - like this "io" file), but /proc/pid/ at least
*conceptually* has always had the "do I have permission to read this
or not". Do a "cat /proc/*/* > /dev/null" and you'll be getting a lot
of "permission denied" etc.

TASKSTATS? Nothing. Nada. Completely open.

That's just broken.

TASKSTATS also isn't apparently used by any normal program, and so
never got updated to namespaces. Again, /proc/xyz/ at least *tries*:
I'm not guaranteeing that it doesn't have some gaping holes, but I can
at least find the logic that saves and uses namespace information.

Again, TASKSTATS? Not so much.

> I must admit, I did not pay much attention to pid namespace changes as
> they were being made to taskstats. I'll look at the code later this
> week.

Some of the pid namespace issues could be fixed with the attached kind
of starting point.

But it's broken. See the comment I added. Why is it broken? Again -
taskstats is fundamentally broken, and doesn't seem to actually clean
up after itself. The "cleanup" depends on noticing at run-time that
some pid is stale and no longer listening. Again, that's just
fundamental brokenness in taskstats.

And it also only look sat pid namespaces. The network namespace issue
is separate.

So that's why I think it should be marked BROKEN. What applications
actually depend on this? iotop and what else? Because if it's just
iotop, I do suspect we might be better off telling people "ok,
disabling this will break iotop, but quite frankly, you're better off
without it".

                          Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 2575 bytes --]

 kernel/taskstats.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index fc0f22005417..620e89ae96cf 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -27,6 +27,7 @@
 #include <linux/cgroup.h>
 #include <linux/fs.h>
 #include <linux/file.h>
+#include <linux/pid_namespace.h>
 #include <net/genetlink.h>
 #include <asm/atomic.h>
 
@@ -60,6 +61,7 @@ static const struct nla_policy cgroupstats_cmd_get_policy[CGROUPSTATS_CMD_ATTR_M
 struct listener {
 	struct list_head list;
 	pid_t pid;
+	struct pid_namespace *pid_ns;
 	char valid;
 };
 
@@ -127,6 +129,7 @@ static int send_reply(struct sk_buff *skb, struct genl_info *info)
 static void send_cpu_listeners(struct sk_buff *skb,
 					struct listener_list *listeners)
 {
+	struct pid_namespace *current_ns;
 	struct genlmsghdr *genlhdr = nlmsg_data(nlmsg_hdr(skb));
 	struct listener *s, *tmp;
 	struct sk_buff *skb_next, *skb_cur = skb;
@@ -140,8 +143,15 @@ static void send_cpu_listeners(struct sk_buff *skb,
 	}
 
 	rc = 0;
+	current_ns = task_active_pid_ns(current);
 	down_read(&listeners->sem);
 	list_for_each_entry(s, &listeners->list, list) {
+		/*
+		 * BROKEN: what if "s" is stale? This way it will
+		 * never be marked invalid!
+		 */
+		if (s->pid_ns != current_ns)
+			continue;
 		skb_next = NULL;
 		if (!list_is_last(&s->list, &listeners->list)) {
 			skb_next = skb_clone(skb_cur, GFP_KERNEL);
@@ -287,6 +297,7 @@ static int add_del_listener(pid_t pid, const struct cpumask *mask, int isadd)
 	struct listener_list *listeners;
 	struct listener *s, *tmp, *s2;
 	unsigned int cpu;
+	struct pid_namespace *pid_ns = task_active_pid_ns(current);
 
 	if (!cpumask_subset(mask, cpu_possible_mask))
 		return -EINVAL;
@@ -300,13 +311,14 @@ static int add_del_listener(pid_t pid, const struct cpumask *mask, int isadd)
 			if (!s)
 				goto cleanup;
 			s->pid = pid;
+			s->pid_ns = pid_ns;
 			INIT_LIST_HEAD(&s->list);
 			s->valid = 1;
 
 			listeners = &per_cpu(listener_array, cpu);
 			down_write(&listeners->sem);
 			list_for_each_entry_safe(s2, tmp, &listeners->list, list) {
-				if (s2->pid == pid)
+				if (s2->pid == pid && s2->pid_ns == pid_ns)
 					goto next_cpu;
 			}
 			list_add(&s->list, &listeners->list);
@@ -324,7 +336,7 @@ cleanup:
 		listeners = &per_cpu(listener_array, cpu);
 		down_write(&listeners->sem);
 		list_for_each_entry_safe(s, tmp, &listeners->list, list) {
-			if (s->pid == pid) {
+			if (s->pid == pid && s->pid_ns == pid_ns) {
 				list_del(&s->list);
 				kfree(s);
 				break;

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-06-30 16:40         ` Linus Torvalds
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2011-06-30 16:40 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Vasiliy Kulikov, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 2259 bytes --]

On Thu, Jun 30, 2011 at 3:59 AM, Balbir Singh <bsingharora@gmail.com> wrote:
> On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>> On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
>>> Ok, having looked at this some more, I'm quite ready to just mark the
>>> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
>>> security hazard, and very little seems to use it.
>>>
>
> I am not sure how you define BROKEN, BROKEN as per security rules?
> /proc/$pid/xxx also expose similar information.

No it doesn't.

/proc/$pid/xyz has always had proper security checks. Now, sometimes
they've been a bit too lax (iow, we've had cases where we just allowed
things we shouldn't - like this "io" file), but /proc/pid/ at least
*conceptually* has always had the "do I have permission to read this
or not". Do a "cat /proc/*/* > /dev/null" and you'll be getting a lot
of "permission denied" etc.

TASKSTATS? Nothing. Nada. Completely open.

That's just broken.

TASKSTATS also isn't apparently used by any normal program, and so
never got updated to namespaces. Again, /proc/xyz/ at least *tries*:
I'm not guaranteeing that it doesn't have some gaping holes, but I can
at least find the logic that saves and uses namespace information.

Again, TASKSTATS? Not so much.

> I must admit, I did not pay much attention to pid namespace changes as
> they were being made to taskstats. I'll look at the code later this
> week.

Some of the pid namespace issues could be fixed with the attached kind
of starting point.

But it's broken. See the comment I added. Why is it broken? Again -
taskstats is fundamentally broken, and doesn't seem to actually clean
up after itself. The "cleanup" depends on noticing at run-time that
some pid is stale and no longer listening. Again, that's just
fundamental brokenness in taskstats.

And it also only look sat pid namespaces. The network namespace issue
is separate.

So that's why I think it should be marked BROKEN. What applications
actually depend on this? iotop and what else? Because if it's just
iotop, I do suspect we might be better off telling people "ok,
disabling this will break iotop, but quite frankly, you're better off
without it".

                          Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 2575 bytes --]

 kernel/taskstats.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index fc0f22005417..620e89ae96cf 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -27,6 +27,7 @@
 #include <linux/cgroup.h>
 #include <linux/fs.h>
 #include <linux/file.h>
+#include <linux/pid_namespace.h>
 #include <net/genetlink.h>
 #include <asm/atomic.h>
 
@@ -60,6 +61,7 @@ static const struct nla_policy cgroupstats_cmd_get_policy[CGROUPSTATS_CMD_ATTR_M
 struct listener {
 	struct list_head list;
 	pid_t pid;
+	struct pid_namespace *pid_ns;
 	char valid;
 };
 
@@ -127,6 +129,7 @@ static int send_reply(struct sk_buff *skb, struct genl_info *info)
 static void send_cpu_listeners(struct sk_buff *skb,
 					struct listener_list *listeners)
 {
+	struct pid_namespace *current_ns;
 	struct genlmsghdr *genlhdr = nlmsg_data(nlmsg_hdr(skb));
 	struct listener *s, *tmp;
 	struct sk_buff *skb_next, *skb_cur = skb;
@@ -140,8 +143,15 @@ static void send_cpu_listeners(struct sk_buff *skb,
 	}
 
 	rc = 0;
+	current_ns = task_active_pid_ns(current);
 	down_read(&listeners->sem);
 	list_for_each_entry(s, &listeners->list, list) {
+		/*
+		 * BROKEN: what if "s" is stale? This way it will
+		 * never be marked invalid!
+		 */
+		if (s->pid_ns != current_ns)
+			continue;
 		skb_next = NULL;
 		if (!list_is_last(&s->list, &listeners->list)) {
 			skb_next = skb_clone(skb_cur, GFP_KERNEL);
@@ -287,6 +297,7 @@ static int add_del_listener(pid_t pid, const struct cpumask *mask, int isadd)
 	struct listener_list *listeners;
 	struct listener *s, *tmp, *s2;
 	unsigned int cpu;
+	struct pid_namespace *pid_ns = task_active_pid_ns(current);
 
 	if (!cpumask_subset(mask, cpu_possible_mask))
 		return -EINVAL;
@@ -300,13 +311,14 @@ static int add_del_listener(pid_t pid, const struct cpumask *mask, int isadd)
 			if (!s)
 				goto cleanup;
 			s->pid = pid;
+			s->pid_ns = pid_ns;
 			INIT_LIST_HEAD(&s->list);
 			s->valid = 1;
 
 			listeners = &per_cpu(listener_array, cpu);
 			down_write(&listeners->sem);
 			list_for_each_entry_safe(s2, tmp, &listeners->list, list) {
-				if (s2->pid == pid)
+				if (s2->pid == pid && s2->pid_ns == pid_ns)
 					goto next_cpu;
 			}
 			list_add(&s->list, &listeners->list);
@@ -324,7 +336,7 @@ cleanup:
 		listeners = &per_cpu(listener_array, cpu);
 		down_write(&listeners->sem);
 		list_for_each_entry_safe(s, tmp, &listeners->list, list) {
-			if (s->pid == pid) {
+			if (s->pid == pid && s->pid_ns == pid_ns) {
 				list_del(&s->list);
 				kfree(s);
 				break;

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-06-30 16:40         ` [kernel-hardening] " Linus Torvalds
@ 2011-07-01  3:02           ` Balbir Singh
  -1 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-07-01  3:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Thu, Jun 30, 2011 at 10:10 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Jun 30, 2011 at 3:59 AM, Balbir Singh <bsingharora@gmail.com> wrote:
>> On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>>> On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
>>>> Ok, having looked at this some more, I'm quite ready to just mark the
>>>> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
>>>> security hazard, and very little seems to use it.
>>>>
>>
>> I am not sure how you define BROKEN, BROKEN as per security rules?
>> /proc/$pid/xxx also expose similar information.
>
> No it doesn't.
>
> /proc/$pid/xyz has always had proper security checks. Now, sometimes
> they've been a bit too lax (iow, we've had cases where we just allowed
> things we shouldn't - like this "io" file), but /proc/pid/ at least
> *conceptually* has always had the "do I have permission to read this
> or not". Do a "cat /proc/*/* > /dev/null" and you'll be getting a lot
> of "permission denied" etc.
>

I've tried that, but fundamentally the statistics we export in
taskstats is what a utility like top would need. Compare them against
cat /proc/$pid/sched and /proc/$pid/status

> TASKSTATS? Nothing. Nada. Completely open.
>
> That's just broken.
>

Except for the I/O part, which turned about to be a cause of concern,
look at struct taskstats and please see the comment above. We have
some additional delay statistics, but I don't see them as a core
broken issue

> TASKSTATS also isn't apparently used by any normal program, and so
> never got updated to namespaces. Again, /proc/xyz/ at least *tries*:
> I'm not guaranteeing that it doesn't have some gaping holes, but I can
> at least find the logic that saves and uses namespace information.
>

Or namespaces are not being used as much, I know a lot of users use
iotop(), I've frequently gotten and handled requests to enable the
feature across distros

> Again, TASKSTATS? Not so much.
>
>> I must admit, I did not pay much attention to pid namespace changes as
>> they were being made to taskstats. I'll look at the code later this
>> week.
>
> Some of the pid namespace issues could be fixed with the attached kind
> of starting point.
>
> But it's broken. See the comment I added. Why is it broken? Again -
> taskstats is fundamentally broken, and doesn't seem to actually clean
> up after itself. The "cleanup" depends on noticing at run-time that
> some pid is stale and no longer listening. Again, that's just
> fundamental brokenness in taskstats.
>

That is lazy cleanup to avoid trapping exit events of listeners and
then having to check against the list. But if the cleanup needs to be
more active, it should be easy to fix. What I am saying is that it is
not as BROKEN as you make it sound. The attached patch is definitely a
step in the right direction and should be applied.

> And it also only look sat pid namespaces. The network namespace issue
> is separate.
>
> So that's why I think it should be marked BROKEN. What applications
> actually depend on this? iotop and what else? Because if it's just
> iotop, I do suspect we might be better off telling people "ok,
> disabling this will break iotop, but quite frankly, you're better off
> without it".

I beg to differ, due to the reasons above. I'd rather find time and
fix the pending issues (network namespace), you've fixed the pid
namespace issue. I'd also look for exiting listeners

Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-01  3:02           ` Balbir Singh
  0 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-07-01  3:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Thu, Jun 30, 2011 at 10:10 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Jun 30, 2011 at 3:59 AM, Balbir Singh <bsingharora@gmail.com> wrote:
>> On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>>> On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
>>>> Ok, having looked at this some more, I'm quite ready to just mark the
>>>> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
>>>> security hazard, and very little seems to use it.
>>>>
>>
>> I am not sure how you define BROKEN, BROKEN as per security rules?
>> /proc/$pid/xxx also expose similar information.
>
> No it doesn't.
>
> /proc/$pid/xyz has always had proper security checks. Now, sometimes
> they've been a bit too lax (iow, we've had cases where we just allowed
> things we shouldn't - like this "io" file), but /proc/pid/ at least
> *conceptually* has always had the "do I have permission to read this
> or not". Do a "cat /proc/*/* > /dev/null" and you'll be getting a lot
> of "permission denied" etc.
>

I've tried that, but fundamentally the statistics we export in
taskstats is what a utility like top would need. Compare them against
cat /proc/$pid/sched and /proc/$pid/status

> TASKSTATS? Nothing. Nada. Completely open.
>
> That's just broken.
>

Except for the I/O part, which turned about to be a cause of concern,
look at struct taskstats and please see the comment above. We have
some additional delay statistics, but I don't see them as a core
broken issue

> TASKSTATS also isn't apparently used by any normal program, and so
> never got updated to namespaces. Again, /proc/xyz/ at least *tries*:
> I'm not guaranteeing that it doesn't have some gaping holes, but I can
> at least find the logic that saves and uses namespace information.
>

Or namespaces are not being used as much, I know a lot of users use
iotop(), I've frequently gotten and handled requests to enable the
feature across distros

> Again, TASKSTATS? Not so much.
>
>> I must admit, I did not pay much attention to pid namespace changes as
>> they were being made to taskstats. I'll look at the code later this
>> week.
>
> Some of the pid namespace issues could be fixed with the attached kind
> of starting point.
>
> But it's broken. See the comment I added. Why is it broken? Again -
> taskstats is fundamentally broken, and doesn't seem to actually clean
> up after itself. The "cleanup" depends on noticing at run-time that
> some pid is stale and no longer listening. Again, that's just
> fundamental brokenness in taskstats.
>

That is lazy cleanup to avoid trapping exit events of listeners and
then having to check against the list. But if the cleanup needs to be
more active, it should be easy to fix. What I am saying is that it is
not as BROKEN as you make it sound. The attached patch is definitely a
step in the right direction and should be applied.

> And it also only look sat pid namespaces. The network namespace issue
> is separate.
>
> So that's why I think it should be marked BROKEN. What applications
> actually depend on this? iotop and what else? Because if it's just
> iotop, I do suspect we might be better off telling people "ok,
> disabling this will break iotop, but quite frankly, you're better off
> without it".

I beg to differ, due to the reasons above. I'd rather find time and
fix the pending issues (network namespace), you've fixed the pid
namespace issue. I'd also look for exiting listeners

Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-06-29 20:17   ` Vasiliy Kulikov
@ 2011-07-02  7:36       ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-02  7:36 UTC (permalink / raw)
  To: Balbir Singh, Linus Torvalds
  Cc: linux-kernel, Andrew Morton, Al Viro, David Rientjes,
	Stephen Wilson, KOSAKI Motohiro, security, Eric Paris,
	kernel-hardening

On Thu, Jun 30, 2011 at 00:17 +0400, Vasiliy Kulikov wrote:
> On Wed, Jun 29, 2011 at 06:57 +0530, Balbir Singh wrote:
> > On Fri, Jun 24, 2011 at 5:39 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > > +               task = find_task_by_vpid(s->pid);
> > > +               if (!task || __task_cred(task)->euid != cred->euid) {
> 
> If consider this patch for inclusion, it also needs some check for
> the listener root ability.  __task_cred(task)->euid == 0 or smth like
> that.  But ptrace_task_may_access_current() is better.

So...  What to do with this patch?  Should I resend it with euid==0 and
rcu optimication, wait for the new ptrace interface or what?


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-02  7:36       ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-02  7:36 UTC (permalink / raw)
  To: Balbir Singh, Linus Torvalds
  Cc: linux-kernel, Andrew Morton, Al Viro, David Rientjes,
	Stephen Wilson, KOSAKI Motohiro, security, Eric Paris,
	kernel-hardening

On Thu, Jun 30, 2011 at 00:17 +0400, Vasiliy Kulikov wrote:
> On Wed, Jun 29, 2011 at 06:57 +0530, Balbir Singh wrote:
> > On Fri, Jun 24, 2011 at 5:39 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > > +               task = find_task_by_vpid(s->pid);
> > > +               if (!task || __task_cred(task)->euid != cred->euid) {
> 
> If consider this patch for inclusion, it also needs some check for
> the listener root ability.  __task_cred(task)->euid == 0 or smth like
> that.  But ptrace_task_may_access_current() is better.

So...  What to do with this patch?  Should I resend it with euid==0 and
rcu optimication, wait for the new ptrace interface or what?


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-07-02  7:36       ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-07-04  2:57         ` Balbir Singh
  -1 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-07-04  2:57 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Sat, Jul 2, 2011 at 1:06 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Thu, Jun 30, 2011 at 00:17 +0400, Vasiliy Kulikov wrote:
>> On Wed, Jun 29, 2011 at 06:57 +0530, Balbir Singh wrote:
>> > On Fri, Jun 24, 2011 at 5:39 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>> > > +               task = find_task_by_vpid(s->pid);
>> > > +               if (!task || __task_cred(task)->euid != cred->euid) {
>>
>> If consider this patch for inclusion, it also needs some check for
>> the listener root ability.  __task_cred(task)->euid == 0 or smth like
>> that.  But ptrace_task_may_access_current() is better.
>
> So...  What to do with this patch?  Should I resend it with euid==0 and
> rcu optimication, wait for the new ptrace interface or what?
>
>
Hi, Vasiliy,

Would it be possible to audit the entire taskstats structure to see
what fields should and should not be exposed. If a particular field
should not be exposed, we can fill it in with a special value
indicating additional permission is required to see it and fill in
those fields only when credentials match like you've shown

Does that make sense?
Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-04  2:57         ` Balbir Singh
  0 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-07-04  2:57 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Sat, Jul 2, 2011 at 1:06 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Thu, Jun 30, 2011 at 00:17 +0400, Vasiliy Kulikov wrote:
>> On Wed, Jun 29, 2011 at 06:57 +0530, Balbir Singh wrote:
>> > On Fri, Jun 24, 2011 at 5:39 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>> > > +               task = find_task_by_vpid(s->pid);
>> > > +               if (!task || __task_cred(task)->euid != cred->euid) {
>>
>> If consider this patch for inclusion, it also needs some check for
>> the listener root ability.  __task_cred(task)->euid == 0 or smth like
>> that.  But ptrace_task_may_access_current() is better.
>
> So...  What to do with this patch?  Should I resend it with euid==0 and
> rcu optimication, wait for the new ptrace interface or what?
>
>
Hi, Vasiliy,

Would it be possible to audit the entire taskstats structure to see
what fields should and should not be exposed. If a particular field
should not be exposed, we can fill it in with a special value
indicating additional permission is required to see it and fill in
those fields only when credentials match like you've shown

Does that make sense?
Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-07-04  2:57         ` [kernel-hardening] " Balbir Singh
@ 2011-07-04 17:45           ` Vasiliy Kulikov
  -1 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-04 17:45 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

Hi Balbir,

On Mon, Jul 04, 2011 at 08:27 +0530, Balbir Singh wrote:
> Would it be possible to audit the entire taskstats structure to see
> what fields should and should not be exposed. If a particular field
> should not be exposed, we can fill it in with a special value
> indicating additional permission is required to see it and fill in
> those fields only when credentials match like you've shown
> 
> Does that make sense?

I'm afraid there is no universal opinion about it.  My point is
(almost) all this information (unless it can be gathered more simple
way) can be used by an attacker in one or another situation (highly
conditional, he might have to win a race(s), etc.), which might not be
approved by other developers without a proof.

The review would be reduced to trying to exploit some programs using
this information (so, it would be still incomplete).  I don't feel
myself enough experienced in such types of exploitation to (a) imagine
the abstract attack scenario and (b) find a program this attack would be
targeted to.

The already known danger is these io fields.  I *suspect* scheduler,
timer, page faults, exit status, and memory related information can be
used in situations where changing these fields might reveal the very
fact of some private computation.  These are only my thoughts and I
couldn't find any more or less realistic scenario of exploitation.

ac_comm and credential fields values show the same information as procfs
files, but I still want to introduce configuration mechanism with a
separate patch to restrict these fields (both in taskstats and procfs)
to complicate attacker's job of probing system specific information.


So, I cannot fully answer to your question, sorry.

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-04 17:45           ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-04 17:45 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

Hi Balbir,

On Mon, Jul 04, 2011 at 08:27 +0530, Balbir Singh wrote:
> Would it be possible to audit the entire taskstats structure to see
> what fields should and should not be exposed. If a particular field
> should not be exposed, we can fill it in with a special value
> indicating additional permission is required to see it and fill in
> those fields only when credentials match like you've shown
> 
> Does that make sense?

I'm afraid there is no universal opinion about it.  My point is
(almost) all this information (unless it can be gathered more simple
way) can be used by an attacker in one or another situation (highly
conditional, he might have to win a race(s), etc.), which might not be
approved by other developers without a proof.

The review would be reduced to trying to exploit some programs using
this information (so, it would be still incomplete).  I don't feel
myself enough experienced in such types of exploitation to (a) imagine
the abstract attack scenario and (b) find a program this attack would be
targeted to.

The already known danger is these io fields.  I *suspect* scheduler,
timer, page faults, exit status, and memory related information can be
used in situations where changing these fields might reveal the very
fact of some private computation.  These are only my thoughts and I
couldn't find any more or less realistic scenario of exploitation.

ac_comm and credential fields values show the same information as procfs
files, but I still want to introduce configuration mechanism with a
separate patch to restrict these fields (both in taskstats and procfs)
to complicate attacker's job of probing system specific information.


So, I cannot fully answer to your question, sorry.

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-07-04 17:45           ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-07-07  8:55             ` Vasiliy Kulikov
  -1 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-07  8:55 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Mon, Jul 04, 2011 at 21:45 +0400, Vasiliy Kulikov wrote:
> The already known danger is these io fields.

Two more things:

1) unblocking netlink socket on task exit is a rather useful help to win
different races.  E.g. if the vulnerable program has the code -

    wait(NULL);
    do_smth_racy();

- then the attacker's task listening for the taskstats event will be
effectively woken up just before the racy code.  It might greatly
increase the chanses to win the race => to exploit the bug.
(The same defect exists in inotify.)


2) taskstats gives the task information at the precisely specific moment
- task death.  So, the attacker shouldn't guess whether some event
occured or not.  The formula of gotten information is _exactly_ task
activity during the life.  On the contrary, getting the same information
from procfs files might result in some inaccuracy because of measuring
time inaccuracy (scheduler's variability, different disks' load, etc.).

Of cource, (2) makes sense only if some sensible information is still
available through taskstats.


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-07  8:55             ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-07  8:55 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Mon, Jul 04, 2011 at 21:45 +0400, Vasiliy Kulikov wrote:
> The already known danger is these io fields.

Two more things:

1) unblocking netlink socket on task exit is a rather useful help to win
different races.  E.g. if the vulnerable program has the code -

    wait(NULL);
    do_smth_racy();

- then the attacker's task listening for the taskstats event will be
effectively woken up just before the racy code.  It might greatly
increase the chanses to win the race => to exploit the bug.
(The same defect exists in inotify.)


2) taskstats gives the task information at the precisely specific moment
- task death.  So, the attacker shouldn't guess whether some event
occured or not.  The formula of gotten information is _exactly_ task
activity during the life.  On the contrary, getting the same information
from procfs files might result in some inaccuracy because of measuring
time inaccuracy (scheduler's variability, different disks' load, etc.).

Of cource, (2) makes sense only if some sensible information is still
available through taskstats.


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-07-07  8:55             ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-07-07 11:53               ` Balbir Singh
  -1 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-07-07 11:53 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Thu, Jul 7, 2011 at 2:25 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Mon, Jul 04, 2011 at 21:45 +0400, Vasiliy Kulikov wrote:
>> The already known danger is these io fields.
>
> Two more things:
>
> 1) unblocking netlink socket on task exit is a rather useful help to win
> different races.  E.g. if the vulnerable program has the code -
>
>    wait(NULL);
>    do_smth_racy();
>
> - then the attacker's task listening for the taskstats event will be
> effectively woken up just before the racy code.  It might greatly
> increase the chanses to win the race => to exploit the bug.
> (The same defect exists in inotify.)
>

I don't see why taskstats is singled out, please look at proc
notifiers as well. I don't buy this use case, what are we trying to
save here and why is taskstats responsible, because it notifies?

>
> 2) taskstats gives the task information at the precisely specific moment
> - task death.  So, the attacker shouldn't guess whether some event
> occured or not.  The formula of gotten information is _exactly_ task
> activity during the life.  On the contrary, getting the same information
> from procfs files might result in some inaccuracy because of measuring
> time inaccuracy (scheduler's variability, different disks' load, etc.).
>
> Of cource, (2) makes sense only if some sensible information is still
> available through taskstats.

Again this makes no sense to me, at the end we send accumulated data
and that data can be read from /proc/$pid (mostly). The race is that
while I go off to read the data the process might disappear taking all
of its data with it, which is what taskstats tries to solve among
other things. Your use case has a lot of hand waving, which I frankly
cannot put to a logical place in my mind.

Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-07 11:53               ` Balbir Singh
  0 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-07-07 11:53 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Thu, Jul 7, 2011 at 2:25 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Mon, Jul 04, 2011 at 21:45 +0400, Vasiliy Kulikov wrote:
>> The already known danger is these io fields.
>
> Two more things:
>
> 1) unblocking netlink socket on task exit is a rather useful help to win
> different races.  E.g. if the vulnerable program has the code -
>
>    wait(NULL);
>    do_smth_racy();
>
> - then the attacker's task listening for the taskstats event will be
> effectively woken up just before the racy code.  It might greatly
> increase the chanses to win the race => to exploit the bug.
> (The same defect exists in inotify.)
>

I don't see why taskstats is singled out, please look at proc
notifiers as well. I don't buy this use case, what are we trying to
save here and why is taskstats responsible, because it notifies?

>
> 2) taskstats gives the task information at the precisely specific moment
> - task death.  So, the attacker shouldn't guess whether some event
> occured or not.  The formula of gotten information is _exactly_ task
> activity during the life.  On the contrary, getting the same information
> from procfs files might result in some inaccuracy because of measuring
> time inaccuracy (scheduler's variability, different disks' load, etc.).
>
> Of cource, (2) makes sense only if some sensible information is still
> available through taskstats.

Again this makes no sense to me, at the end we send accumulated data
and that data can be read from /proc/$pid (mostly). The race is that
while I go off to read the data the process might disappear taking all
of its data with it, which is what taskstats tries to solve among
other things. Your use case has a lot of hand waving, which I frankly
cannot put to a logical place in my mind.

Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-07-07 11:53               ` [kernel-hardening] " Balbir Singh
@ 2011-07-07 16:23                 ` Vasiliy Kulikov
  -1 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-07 16:23 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Thu, Jul 07, 2011 at 17:23 +0530, Balbir Singh wrote:
> On Thu, Jul 7, 2011 at 2:25 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > 1) unblocking netlink socket on task exit is a rather useful help to win
> > different races.  E.g. if the vulnerable program has the code -
> >
> >    wait(NULL);
> >    do_smth_racy();
> >
> > - then the attacker's task listening for the taskstats event will be
> > effectively woken up just before the racy code.  It might greatly
> > increase the chanses to win the race => to exploit the bug.
> > (The same defect exists in inotify.)
> >
> 
> I don't see why taskstats is singled out, please look at proc
> notifiers as well.

Do you mean proc connector?  AFAICS, it is available to root only.  And
no, taskstats is not singled out - I mentioned inotify as another
example.  The kernel might be vulnerable to many side channel attacks,
using different interfaces.


> I don't buy this use case, what are we trying to
> save here and why is taskstats responsible, because it notifies?

Because it notifies _asynchronously_ in sense of the subject and
synchronously in sense of the object's activity.  It gives a hint when
some probable "chechpoint" occured.

Please compare in the example I've posted above the cases of "poll"
(like test -e /proc/$pid) and "wait" (taskstats).  In the poll case it's
very easy to loose the moment of the race because of rescheduling.  In
the wait case the attacker task wakes up very closely to the race place.


> > 2) taskstats gives the task information at the precisely specific moment
> > - task death.  So, the attacker shouldn't guess whether some event
> > occured or not.  The formula of gotten information is _exactly_ task
> > activity during the life.  On the contrary, getting the same information
> > from procfs files might result in some inaccuracy because of measuring
> > time inaccuracy (scheduler's variability, different disks' load, etc.).
> >
> > Of cource, (2) makes sense only if some sensible information is still
> > available through taskstats.
> 
> Again this makes no sense to me, at the end we send accumulated data
> and that data can be read from /proc/$pid (mostly).

Umm...  If both taskstats and procfs expose some sensible information,
both should be fixed, no?


> The race is that
> while I go off to read the data the process might disappear taking all
> of its data with it, which is what taskstats tries to solve among
> other things.

Or the last succeeded measurement didn't happen after some sensible
event.

Introducing this "race" neither fixes some bug or fully prevents some
exploitation technique.  It might _reduce the chance_ of exploitation.

In my ssh exploit an attacker using procfs would have to poll
/proc/PID/io while 2 other processes would run - privileged sshd and
unprivileged sshd.  The scheduler would try to run both sshds
on different CPUs of 2 CPU system in parallel because sshds actively
exchange the data via pipes.  So, the poller might not run on any CPU
while the unpivileged sshd is dying.  By using taskstats I get the
precise information from the first attempt.


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-07 16:23                 ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-07 16:23 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Thu, Jul 07, 2011 at 17:23 +0530, Balbir Singh wrote:
> On Thu, Jul 7, 2011 at 2:25 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > 1) unblocking netlink socket on task exit is a rather useful help to win
> > different races.  E.g. if the vulnerable program has the code -
> >
> >    wait(NULL);
> >    do_smth_racy();
> >
> > - then the attacker's task listening for the taskstats event will be
> > effectively woken up just before the racy code.  It might greatly
> > increase the chanses to win the race => to exploit the bug.
> > (The same defect exists in inotify.)
> >
> 
> I don't see why taskstats is singled out, please look at proc
> notifiers as well.

Do you mean proc connector?  AFAICS, it is available to root only.  And
no, taskstats is not singled out - I mentioned inotify as another
example.  The kernel might be vulnerable to many side channel attacks,
using different interfaces.


> I don't buy this use case, what are we trying to
> save here and why is taskstats responsible, because it notifies?

Because it notifies _asynchronously_ in sense of the subject and
synchronously in sense of the object's activity.  It gives a hint when
some probable "chechpoint" occured.

Please compare in the example I've posted above the cases of "poll"
(like test -e /proc/$pid) and "wait" (taskstats).  In the poll case it's
very easy to loose the moment of the race because of rescheduling.  In
the wait case the attacker task wakes up very closely to the race place.


> > 2) taskstats gives the task information at the precisely specific moment
> > - task death.  So, the attacker shouldn't guess whether some event
> > occured or not.  The formula of gotten information is _exactly_ task
> > activity during the life.  On the contrary, getting the same information
> > from procfs files might result in some inaccuracy because of measuring
> > time inaccuracy (scheduler's variability, different disks' load, etc.).
> >
> > Of cource, (2) makes sense only if some sensible information is still
> > available through taskstats.
> 
> Again this makes no sense to me, at the end we send accumulated data
> and that data can be read from /proc/$pid (mostly).

Umm...  If both taskstats and procfs expose some sensible information,
both should be fixed, no?


> The race is that
> while I go off to read the data the process might disappear taking all
> of its data with it, which is what taskstats tries to solve among
> other things.

Or the last succeeded measurement didn't happen after some sensible
event.

Introducing this "race" neither fixes some bug or fully prevents some
exploitation technique.  It might _reduce the chance_ of exploitation.

In my ssh exploit an attacker using procfs would have to poll
/proc/PID/io while 2 other processes would run - privileged sshd and
unprivileged sshd.  The scheduler would try to run both sshds
on different CPUs of 2 CPU system in parallel because sshds actively
exchange the data via pipes.  So, the poller might not run on any CPU
while the unpivileged sshd is dying.  By using taskstats I get the
precise information from the first attempt.


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-07-07 16:23                 ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-07-09 15:36                   ` Balbir Singh
  -1 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-07-09 15:36 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Thu, Jul 7, 2011 at 9:53 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Thu, Jul 07, 2011 at 17:23 +0530, Balbir Singh wrote:
>> On Thu, Jul 7, 2011 at 2:25 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>> > 1) unblocking netlink socket on task exit is a rather useful help to win
>> > different races.  E.g. if the vulnerable program has the code -
>> >
>> >    wait(NULL);
>> >    do_smth_racy();
>> >
>> > - then the attacker's task listening for the taskstats event will be
>> > effectively woken up just before the racy code.  It might greatly
>> > increase the chanses to win the race => to exploit the bug.
>> > (The same defect exists in inotify.)
>> >
>>
>> I don't see why taskstats is singled out, please look at proc
>> notifiers as well.
>
> Do you mean proc connector?  AFAICS, it is available to root only.  And

Yes, good point, it reuses the netlink nl_groups mechanism

> no, taskstats is not singled out - I mentioned inotify as another
> example.  The kernel might be vulnerable to many side channel attacks,
> using different interfaces.
>
>
>> I don't buy this use case, what are we trying to
>> save here and why is taskstats responsible, because it notifies?
>
> Because it notifies _asynchronously_ in sense of the subject and
> synchronously in sense of the object's activity.  It gives a hint when
> some probable "chechpoint" occured.
>
> Please compare in the example I've posted above the cases of "poll"
> (like test -e /proc/$pid) and "wait" (taskstats).  In the poll case it's
> very easy to loose the moment of the race because of rescheduling.  In
> the wait case the attacker task wakes up very closely to the race place.
>

I tried a simple experiment and dnotify and it is possible to get
events on exit. But that is not the point, you seem to suggest that an
exit is a significant event for getting information about a task that
can lead to security issues?

>
>> > 2) taskstats gives the task information at the precisely specific moment
>> > - task death.  So, the attacker shouldn't guess whether some event
>> > occured or not.  The formula of gotten information is _exactly_ task
>> > activity during the life.  On the contrary, getting the same information
>> > from procfs files might result in some inaccuracy because of measuring
>> > time inaccuracy (scheduler's variability, different disks' load, etc.).
>> >
>> > Of cource, (2) makes sense only if some sensible information is still
>> > available through taskstats.
>>
>> Again this makes no sense to me, at the end we send accumulated data
>> and that data can be read from /proc/$pid (mostly).
>
> Umm...  If both taskstats and procfs expose some sensible information,
> both should be fixed, no?
>

Yes, sure. I am all for auditing the fields of taskstats and
/proc/$pid/* and making the distinction as to what can be exposed. Do
you at this point find anything that only taskstats exports that is
harmful?

>
>> The race is that
>> while I go off to read the data the process might disappear taking all
>> of its data with it, which is what taskstats tries to solve among
>> other things.
>
> Or the last succeeded measurement didn't happen after some sensible
> event.
>
> Introducing this "race" neither fixes some bug or fully prevents some
> exploitation technique.  It might _reduce the chance_ of exploitation.
>
> In my ssh exploit an attacker using procfs would have to poll
> /proc/PID/io while 2 other processes would run - privileged sshd and
> unprivileged sshd.  The scheduler would try to run both sshds
> on different CPUs of 2 CPU system in parallel because sshds actively
> exchange the data via pipes.  So, the poller might not run on any CPU
> while the unpivileged sshd is dying.  By using taskstats I get the
> precise information from the first attempt.

How do you use this information? Basically your concern is

1. Information taskstats exposes (I agree, we need to audit and filter)
2. Exit events (I have a tough time digesting this one even with your
examples, could you please share some details, code to show the
exploit)

Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-09 15:36                   ` Balbir Singh
  0 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-07-09 15:36 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Thu, Jul 7, 2011 at 9:53 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Thu, Jul 07, 2011 at 17:23 +0530, Balbir Singh wrote:
>> On Thu, Jul 7, 2011 at 2:25 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>> > 1) unblocking netlink socket on task exit is a rather useful help to win
>> > different races.  E.g. if the vulnerable program has the code -
>> >
>> >    wait(NULL);
>> >    do_smth_racy();
>> >
>> > - then the attacker's task listening for the taskstats event will be
>> > effectively woken up just before the racy code.  It might greatly
>> > increase the chanses to win the race => to exploit the bug.
>> > (The same defect exists in inotify.)
>> >
>>
>> I don't see why taskstats is singled out, please look at proc
>> notifiers as well.
>
> Do you mean proc connector?  AFAICS, it is available to root only.  And

Yes, good point, it reuses the netlink nl_groups mechanism

> no, taskstats is not singled out - I mentioned inotify as another
> example.  The kernel might be vulnerable to many side channel attacks,
> using different interfaces.
>
>
>> I don't buy this use case, what are we trying to
>> save here and why is taskstats responsible, because it notifies?
>
> Because it notifies _asynchronously_ in sense of the subject and
> synchronously in sense of the object's activity.  It gives a hint when
> some probable "chechpoint" occured.
>
> Please compare in the example I've posted above the cases of "poll"
> (like test -e /proc/$pid) and "wait" (taskstats).  In the poll case it's
> very easy to loose the moment of the race because of rescheduling.  In
> the wait case the attacker task wakes up very closely to the race place.
>

I tried a simple experiment and dnotify and it is possible to get
events on exit. But that is not the point, you seem to suggest that an
exit is a significant event for getting information about a task that
can lead to security issues?

>
>> > 2) taskstats gives the task information at the precisely specific moment
>> > - task death.  So, the attacker shouldn't guess whether some event
>> > occured or not.  The formula of gotten information is _exactly_ task
>> > activity during the life.  On the contrary, getting the same information
>> > from procfs files might result in some inaccuracy because of measuring
>> > time inaccuracy (scheduler's variability, different disks' load, etc.).
>> >
>> > Of cource, (2) makes sense only if some sensible information is still
>> > available through taskstats.
>>
>> Again this makes no sense to me, at the end we send accumulated data
>> and that data can be read from /proc/$pid (mostly).
>
> Umm...  If both taskstats and procfs expose some sensible information,
> both should be fixed, no?
>

Yes, sure. I am all for auditing the fields of taskstats and
/proc/$pid/* and making the distinction as to what can be exposed. Do
you at this point find anything that only taskstats exports that is
harmful?

>
>> The race is that
>> while I go off to read the data the process might disappear taking all
>> of its data with it, which is what taskstats tries to solve among
>> other things.
>
> Or the last succeeded measurement didn't happen after some sensible
> event.
>
> Introducing this "race" neither fixes some bug or fully prevents some
> exploitation technique.  It might _reduce the chance_ of exploitation.
>
> In my ssh exploit an attacker using procfs would have to poll
> /proc/PID/io while 2 other processes would run - privileged sshd and
> unprivileged sshd.  The scheduler would try to run both sshds
> on different CPUs of 2 CPU system in parallel because sshds actively
> exchange the data via pipes.  So, the poller might not run on any CPU
> while the unpivileged sshd is dying.  By using taskstats I get the
> precise information from the first attempt.

How do you use this information? Basically your concern is

1. Information taskstats exposes (I agree, we need to audit and filter)
2. Exit events (I have a tough time digesting this one even with your
examples, could you please share some details, code to show the
exploit)

Balbir Singh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] taskstats: restrict access to user
  2011-07-09 15:36                   ` [kernel-hardening] " Balbir Singh
@ 2011-07-11 14:07                     ` Vasiliy Kulikov
  -1 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-11 14:07 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Sat, Jul 09, 2011 at 21:06 +0530, Balbir Singh wrote:
> On Thu, Jul 7, 2011 at 9:53 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > On Thu, Jul 07, 2011 at 17:23 +0530, Balbir Singh wrote:
> >> I don't buy this use case, what are we trying to
> >> save here and why is taskstats responsible, because it notifies?
> >
> > Because it notifies _asynchronously_ in sense of the subject and
> > synchronously in sense of the object's activity.  It gives a hint when
> > some probable "chechpoint" occured.
> >
> > Please compare in the example I've posted above the cases of "poll"
> > (like test -e /proc/$pid) and "wait" (taskstats).  In the poll case it's
> > very easy to loose the moment of the race because of rescheduling.  In
> > the wait case the attacker task wakes up very closely to the race place.
> >
> 
> I tried a simple experiment and dnotify and it is possible to get
> events on exit. But that is not the point, you seem to suggest that an
> exit is a significant event for getting information about a task that
> can lead to security issues?

If there is already some flaw in program, the knowledge of an exit event
(it's not the only such event, just a sample) might make things worse.


> Do
> you at this point find anything that only taskstats exports that is
> harmful?

No.


> >> The race is that
> >> while I go off to read the data the process might disappear taking all
> >> of its data with it, which is what taskstats tries to solve among
> >> other things.
> >
> > Or the last succeeded measurement didn't happen after some sensible
> > event.
> >
> > Introducing this "race" neither fixes some bug or fully prevents some
> > exploitation technique.  It might _reduce the chance_ of exploitation.
> >
> > In my ssh exploit an attacker using procfs would have to poll
> > /proc/PID/io while 2 other processes would run - privileged sshd and
> > unprivileged sshd.  The scheduler would try to run both sshds
> > on different CPUs of 2 CPU system in parallel because sshds actively
> > exchange the data via pipes.  So, the poller might not run on any CPU
> > while the unpivileged sshd is dying.  By using taskstats I get the
> > precise information from the first attempt.
> 
> How do you use this information? Basically your concern is
> 
> 1. Information taskstats exposes (I agree, we need to audit and filter)
> 2. Exit events (I have a tough time digesting this one even with your
> examples, could you please share some details, code to show the
> exploit)

The code is plain - register and wait for ssd exit.  Pass Length = chars
- CONST.  That's all.

If I use procfs, I have to poll /proc/PID/io.  I have to (1) catch the
right moment for the measurement and (2) identify whether I've actually
succeeded in the measurement time (that I've measured that I want to
measure).  With taskstats (1) and (2) are solved by definition.  But
it seems to me I'm starting to make circles :\


My sceptic position about the whole taskstats/procfs ability to gather
aliens' processes information:

"The core problem here is that by giving *some part* of information about
internal task activity the kernel violating the task privacy, strictly
speaking.  A program doing IO expects this activity to be kept private.
This revealed part may or may not reveal sensible information, depends
on the specific program."

http://www.openwall.com/lists/oss-security/2011/06/29/4

Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [PATCH 2/2] taskstats: restrict access to user
@ 2011-07-11 14:07                     ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-07-11 14:07 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Linus Torvalds, linux-kernel, Andrew Morton, Al Viro,
	David Rientjes, Stephen Wilson, KOSAKI Motohiro, security,
	Eric Paris, kernel-hardening

On Sat, Jul 09, 2011 at 21:06 +0530, Balbir Singh wrote:
> On Thu, Jul 7, 2011 at 9:53 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> > On Thu, Jul 07, 2011 at 17:23 +0530, Balbir Singh wrote:
> >> I don't buy this use case, what are we trying to
> >> save here and why is taskstats responsible, because it notifies?
> >
> > Because it notifies _asynchronously_ in sense of the subject and
> > synchronously in sense of the object's activity.  It gives a hint when
> > some probable "chechpoint" occured.
> >
> > Please compare in the example I've posted above the cases of "poll"
> > (like test -e /proc/$pid) and "wait" (taskstats).  In the poll case it's
> > very easy to loose the moment of the race because of rescheduling.  In
> > the wait case the attacker task wakes up very closely to the race place.
> >
> 
> I tried a simple experiment and dnotify and it is possible to get
> events on exit. But that is not the point, you seem to suggest that an
> exit is a significant event for getting information about a task that
> can lead to security issues?

If there is already some flaw in program, the knowledge of an exit event
(it's not the only such event, just a sample) might make things worse.


> Do
> you at this point find anything that only taskstats exports that is
> harmful?

No.


> >> The race is that
> >> while I go off to read the data the process might disappear taking all
> >> of its data with it, which is what taskstats tries to solve among
> >> other things.
> >
> > Or the last succeeded measurement didn't happen after some sensible
> > event.
> >
> > Introducing this "race" neither fixes some bug or fully prevents some
> > exploitation technique.  It might _reduce the chance_ of exploitation.
> >
> > In my ssh exploit an attacker using procfs would have to poll
> > /proc/PID/io while 2 other processes would run - privileged sshd and
> > unprivileged sshd.  The scheduler would try to run both sshds
> > on different CPUs of 2 CPU system in parallel because sshds actively
> > exchange the data via pipes.  So, the poller might not run on any CPU
> > while the unpivileged sshd is dying.  By using taskstats I get the
> > precise information from the first attempt.
> 
> How do you use this information? Basically your concern is
> 
> 1. Information taskstats exposes (I agree, we need to audit and filter)
> 2. Exit events (I have a tough time digesting this one even with your
> examples, could you please share some details, code to show the
> exploit)

The code is plain - register and wait for ssd exit.  Pass Length = chars
- CONST.  That's all.

If I use procfs, I have to poll /proc/PID/io.  I have to (1) catch the
right moment for the measurement and (2) identify whether I've actually
succeeded in the measurement time (that I've measured that I want to
measure).  With taskstats (1) and (2) are solved by definition.  But
it seems to me I'm starting to make circles :\


My sceptic position about the whole taskstats/procfs ability to gather
aliens' processes information:

"The core problem here is that by giving *some part* of information about
internal task activity the kernel violating the task privacy, strictly
speaking.  A program doing IO expects this activity to be kept private.
This revealed part may or may not reveal sensible information, depends
on the specific program."

http://www.openwall.com/lists/oss-security/2011/06/29/4

Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-07-01  3:02           ` [kernel-hardening] " Balbir Singh
@ 2011-09-19 16:40             ` Linus Torvalds
  -1 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2011-09-19 16:40 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Vasiliy Kulikov, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 1365 bytes --]

On Thu, Jun 30, 2011 at 8:02 PM, Balbir Singh <bsingharora@gmail.com> wrote:
>>
>> So that's why I think it should be marked BROKEN. What applications
>> actually depend on this? iotop and what else? Because if it's just
>> iotop, I do suspect we might be better off telling people "ok,
>> disabling this will break iotop, but quite frankly, you're better off
>> without it".
>
> I beg to differ, due to the reasons above. I'd rather find time and
> fix the pending issues (network namespace), you've fixed the pid
> namespace issue. I'd also look for exiting listeners

So nothing ever happened on this thread, afaik.

You can still read sensitive information at a byte granularity with taskstats.

Balbir never sent any of the fixes he was supposed to, and none of the
namespace issues have gotten fixed.

It's now almost three months later, and things are still equally broken.

I think we need to just disable TASKSTAT's. Nobody maintains it, it's
been a known issue for months, people pointed out problems and even
sent patches, and nothing happened.

Maybe we can minimize it with the appended patch, but dammit, we need
to do *something*. If I don't get any reasonable replies, I'm really
going to have to mark this as known-BROKEN, since nothing ever
happens, and the "maintainer" clearly doesn't care about security
issues.

                         Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 1481 bytes --]

 kernel/taskstats.c |    2 ++
 kernel/tsacct.c    |    7 ++++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index e19ce1454ee1..5874d4866b3c 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -457,6 +457,8 @@ static int cmd_attr_register_cpumask(struct genl_info *info)
 	cpumask_var_t mask;
 	int rc;
 
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
 	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
 		return -ENOMEM;
 	rc = parse(info->attrs[TASKSTATS_CMD_ATTR_REGISTER_CPUMASK], mask);
diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index 24dc60d9fa1f..110ca5a03bd6 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -78,6 +78,7 @@ void bacct_add_tsk(struct taskstats *stats, struct task_struct *tsk)
 
 #define KB 1024
 #define MB (1024*KB)
+#define KB_MASK (~(KB-1))
 /*
  * fill in extended accounting fields
  */
@@ -100,9 +101,9 @@ void xacct_add_tsk(struct taskstats *stats, struct task_struct *p)
 	stats->read_syscalls	= p->ioac.syscr;
 	stats->write_syscalls	= p->ioac.syscw;
 #ifdef CONFIG_TASK_IO_ACCOUNTING
-	stats->read_bytes	= p->ioac.read_bytes;
-	stats->write_bytes	= p->ioac.write_bytes;
-	stats->cancelled_write_bytes = p->ioac.cancelled_write_bytes;
+	stats->read_bytes	= p->ioac.read_bytes & KB_MASK;
+	stats->write_bytes	= p->ioac.write_bytes & KB_MASK;
+	stats->cancelled_write_bytes = p->ioac.cancelled_write_bytes & KB_MASK;
 #else
 	stats->read_bytes	= 0;
 	stats->write_bytes	= 0;

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-19 16:40             ` Linus Torvalds
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2011-09-19 16:40 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Vasiliy Kulikov, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 1365 bytes --]

On Thu, Jun 30, 2011 at 8:02 PM, Balbir Singh <bsingharora@gmail.com> wrote:
>>
>> So that's why I think it should be marked BROKEN. What applications
>> actually depend on this? iotop and what else? Because if it's just
>> iotop, I do suspect we might be better off telling people "ok,
>> disabling this will break iotop, but quite frankly, you're better off
>> without it".
>
> I beg to differ, due to the reasons above. I'd rather find time and
> fix the pending issues (network namespace), you've fixed the pid
> namespace issue. I'd also look for exiting listeners

So nothing ever happened on this thread, afaik.

You can still read sensitive information at a byte granularity with taskstats.

Balbir never sent any of the fixes he was supposed to, and none of the
namespace issues have gotten fixed.

It's now almost three months later, and things are still equally broken.

I think we need to just disable TASKSTAT's. Nobody maintains it, it's
been a known issue for months, people pointed out problems and even
sent patches, and nothing happened.

Maybe we can minimize it with the appended patch, but dammit, we need
to do *something*. If I don't get any reasonable replies, I'm really
going to have to mark this as known-BROKEN, since nothing ever
happens, and the "maintainer" clearly doesn't care about security
issues.

                         Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 1481 bytes --]

 kernel/taskstats.c |    2 ++
 kernel/tsacct.c    |    7 ++++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index e19ce1454ee1..5874d4866b3c 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -457,6 +457,8 @@ static int cmd_attr_register_cpumask(struct genl_info *info)
 	cpumask_var_t mask;
 	int rc;
 
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
 	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
 		return -ENOMEM;
 	rc = parse(info->attrs[TASKSTATS_CMD_ATTR_REGISTER_CPUMASK], mask);
diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index 24dc60d9fa1f..110ca5a03bd6 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -78,6 +78,7 @@ void bacct_add_tsk(struct taskstats *stats, struct task_struct *tsk)
 
 #define KB 1024
 #define MB (1024*KB)
+#define KB_MASK (~(KB-1))
 /*
  * fill in extended accounting fields
  */
@@ -100,9 +101,9 @@ void xacct_add_tsk(struct taskstats *stats, struct task_struct *p)
 	stats->read_syscalls	= p->ioac.syscr;
 	stats->write_syscalls	= p->ioac.syscw;
 #ifdef CONFIG_TASK_IO_ACCOUNTING
-	stats->read_bytes	= p->ioac.read_bytes;
-	stats->write_bytes	= p->ioac.write_bytes;
-	stats->cancelled_write_bytes = p->ioac.cancelled_write_bytes;
+	stats->read_bytes	= p->ioac.read_bytes & KB_MASK;
+	stats->write_bytes	= p->ioac.write_bytes & KB_MASK;
+	stats->cancelled_write_bytes = p->ioac.cancelled_write_bytes & KB_MASK;
 #else
 	stats->read_bytes	= 0;
 	stats->write_bytes	= 0;

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-09-19 16:40             ` [kernel-hardening] " Linus Torvalds
@ 2011-09-19 17:20               ` Balbir Singh
  -1 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-09-19 17:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 10:10 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, Jun 30, 2011 at 8:02 PM, Balbir Singh <bsingharora@gmail.com> wrote:
> >>
> >> So that's why I think it should be marked BROKEN. What applications
> >> actually depend on this? iotop and what else? Because if it's just
> >> iotop, I do suspect we might be better off telling people "ok,
> >> disabling this will break iotop, but quite frankly, you're better off
> >> without it".
> >
> > I beg to differ, due to the reasons above. I'd rather find time and
> > fix the pending issues (network namespace), you've fixed the pid
> > namespace issue. I'd also look for exiting listeners
>
> So nothing ever happened on this thread, afaik.
>
> You can still read sensitive information at a byte granularity with taskstats.
>
> Balbir never sent any of the fixes he was supposed to, and none of the
> namespace issues have gotten fixed.
>
> It's now almost three months later, and things are still equally broken.
>
> I think we need to just disable TASKSTAT's. Nobody maintains it, it's
> been a known issue for months, people pointed out problems and even
> sent patches, and nothing happened.
>
> Maybe we can minimize it with the appended patch, but dammit, we need
> to do *something*. If I don't get any reasonable replies, I'm really
> going to have to mark this as known-BROKEN, since nothing ever
> happens, and the "maintainer" clearly doesn't care about security
> issues.
>

Sorry, I've been bogged down with work issues and have not had time to
look at it. If someone else wants to take a look while I am busy, I'd
be happy. The patch you've sent seems reasonable, but I'd suggest a
changelog

"Change taskstats user interface, henceforth we need (for security
purposes) CAP_SYS_ADMIN to receive taskstats
data on a particular CPU, a subset or all CPUs on the system. The
patch also rounds the data returned to the KiloByte
boundary for IO parameters, read_bytes, write_bytes and cancelled_write_bytes"

Thanks for looking into this.
Balbir

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-19 17:20               ` Balbir Singh
  0 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-09-19 17:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 10:10 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, Jun 30, 2011 at 8:02 PM, Balbir Singh <bsingharora@gmail.com> wrote:
> >>
> >> So that's why I think it should be marked BROKEN. What applications
> >> actually depend on this? iotop and what else? Because if it's just
> >> iotop, I do suspect we might be better off telling people "ok,
> >> disabling this will break iotop, but quite frankly, you're better off
> >> without it".
> >
> > I beg to differ, due to the reasons above. I'd rather find time and
> > fix the pending issues (network namespace), you've fixed the pid
> > namespace issue. I'd also look for exiting listeners
>
> So nothing ever happened on this thread, afaik.
>
> You can still read sensitive information at a byte granularity with taskstats.
>
> Balbir never sent any of the fixes he was supposed to, and none of the
> namespace issues have gotten fixed.
>
> It's now almost three months later, and things are still equally broken.
>
> I think we need to just disable TASKSTAT's. Nobody maintains it, it's
> been a known issue for months, people pointed out problems and even
> sent patches, and nothing happened.
>
> Maybe we can minimize it with the appended patch, but dammit, we need
> to do *something*. If I don't get any reasonable replies, I'm really
> going to have to mark this as known-BROKEN, since nothing ever
> happens, and the "maintainer" clearly doesn't care about security
> issues.
>

Sorry, I've been bogged down with work issues and have not had time to
look at it. If someone else wants to take a look while I am busy, I'd
be happy. The patch you've sent seems reasonable, but I'd suggest a
changelog

"Change taskstats user interface, henceforth we need (for security
purposes) CAP_SYS_ADMIN to receive taskstats
data on a particular CPU, a subset or all CPUs on the system. The
patch also rounds the data returned to the KiloByte
boundary for IO parameters, read_bytes, write_bytes and cancelled_write_bytes"

Thanks for looking into this.
Balbir

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-09-19 16:40             ` [kernel-hardening] " Linus Torvalds
@ 2011-09-19 17:39               ` Vasiliy Kulikov
  -1 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-09-19 17:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Balbir Singh, Shailabh Nagar, linux-kernel, security, Eric Paris,
	Stephen Wilson, KOSAKI Motohiro, David Rientjes, Andrew Morton,
	Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 09:40 -0700, Linus Torvalds wrote:
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index e19ce1454ee1..5874d4866b3c 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -457,6 +457,8 @@ static int cmd_attr_register_cpumask(struct genl_info *info)
>  	cpumask_var_t mask;
>  	int rc;
>  
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;

Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.

Or ptrace_may_access() in taskstats_user_cmd().

>  	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
>  		return -ENOMEM;
>  	rc = parse(info->attrs[TASKSTATS_CMD_ATTR_REGISTER_CPUMASK], mask);
> diff --git a/kernel/tsacct.c b/kernel/tsacct.c
> index 24dc60d9fa1f..110ca5a03bd6 100644
> --- a/kernel/tsacct.c
> +++ b/kernel/tsacct.c
> @@ -78,6 +78,7 @@ void bacct_add_tsk(struct taskstats *stats, struct task_struct *tsk)
>  
>  #define KB 1024
>  #define MB (1024*KB)
> +#define KB_MASK (~(KB-1))
>  /*
>   * fill in extended accounting fields
>   */
> @@ -100,9 +101,9 @@ void xacct_add_tsk(struct taskstats *stats, struct task_struct *p)
>  	stats->read_syscalls	= p->ioac.syscr;
>  	stats->write_syscalls	= p->ioac.syscw;
>  #ifdef CONFIG_TASK_IO_ACCOUNTING
> -	stats->read_bytes	= p->ioac.read_bytes;
> -	stats->write_bytes	= p->ioac.write_bytes;
> -	stats->cancelled_write_bytes = p->ioac.cancelled_write_bytes;
> +	stats->read_bytes	= p->ioac.read_bytes & KB_MASK;
> +	stats->write_bytes	= p->ioac.write_bytes & KB_MASK;
> +	stats->cancelled_write_bytes = p->ioac.cancelled_write_bytes & KB_MASK;
>  #else
>  	stats->read_bytes	= 0;
>  	stats->write_bytes	= 0;


-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-19 17:39               ` Vasiliy Kulikov
  0 siblings, 0 replies; 49+ messages in thread
From: Vasiliy Kulikov @ 2011-09-19 17:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Balbir Singh, Shailabh Nagar, linux-kernel, security, Eric Paris,
	Stephen Wilson, KOSAKI Motohiro, David Rientjes, Andrew Morton,
	Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 09:40 -0700, Linus Torvalds wrote:
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index e19ce1454ee1..5874d4866b3c 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -457,6 +457,8 @@ static int cmd_attr_register_cpumask(struct genl_info *info)
>  	cpumask_var_t mask;
>  	int rc;
>  
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;

Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.

Or ptrace_may_access() in taskstats_user_cmd().

>  	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
>  		return -ENOMEM;
>  	rc = parse(info->attrs[TASKSTATS_CMD_ATTR_REGISTER_CPUMASK], mask);
> diff --git a/kernel/tsacct.c b/kernel/tsacct.c
> index 24dc60d9fa1f..110ca5a03bd6 100644
> --- a/kernel/tsacct.c
> +++ b/kernel/tsacct.c
> @@ -78,6 +78,7 @@ void bacct_add_tsk(struct taskstats *stats, struct task_struct *tsk)
>  
>  #define KB 1024
>  #define MB (1024*KB)
> +#define KB_MASK (~(KB-1))
>  /*
>   * fill in extended accounting fields
>   */
> @@ -100,9 +101,9 @@ void xacct_add_tsk(struct taskstats *stats, struct task_struct *p)
>  	stats->read_syscalls	= p->ioac.syscr;
>  	stats->write_syscalls	= p->ioac.syscw;
>  #ifdef CONFIG_TASK_IO_ACCOUNTING
> -	stats->read_bytes	= p->ioac.read_bytes;
> -	stats->write_bytes	= p->ioac.write_bytes;
> -	stats->cancelled_write_bytes = p->ioac.cancelled_write_bytes;
> +	stats->read_bytes	= p->ioac.read_bytes & KB_MASK;
> +	stats->write_bytes	= p->ioac.write_bytes & KB_MASK;
> +	stats->cancelled_write_bytes = p->ioac.cancelled_write_bytes & KB_MASK;
>  #else
>  	stats->read_bytes	= 0;
>  	stats->write_bytes	= 0;


-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-09-19 17:39               ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-09-19 17:45                 ` Linus Torvalds
  -1 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2011-09-19 17:45 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Balbir Singh, Shailabh Nagar, linux-kernel, security, Eric Paris,
	Stephen Wilson, KOSAKI Motohiro, David Rientjes, Andrew Morton,
	Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 10:39 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>
> Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
> the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.

Yeah, I wondered where I'd really want to hook it in, that was the
other option.

However, one thing that I'm currently independently asking some
networking people is whether that patch guarantees anything at all: is
the netlink command even guaranteed to be run in the same context as
the person sending it?

After all, it comes in as a packet of data.  How synchronous is the
genetlink thing guaranteed to be in the first place?

IOW, are *any* of those "check current capabilities/euid" approaches
really guaranteed to be valid? Are they valid today, will they
necessarily be valid in a year?

                Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-19 17:45                 ` Linus Torvalds
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2011-09-19 17:45 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Balbir Singh, Shailabh Nagar, linux-kernel, security, Eric Paris,
	Stephen Wilson, KOSAKI Motohiro, David Rientjes, Andrew Morton,
	Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 10:39 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>
> Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
> the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.

Yeah, I wondered where I'd really want to hook it in, that was the
other option.

However, one thing that I'm currently independently asking some
networking people is whether that patch guarantees anything at all: is
the netlink command even guaranteed to be run in the same context as
the person sending it?

After all, it comes in as a packet of data.  How synchronous is the
genetlink thing guaranteed to be in the first place?

IOW, are *any* of those "check current capabilities/euid" approaches
really guaranteed to be valid? Are they valid today, will they
necessarily be valid in a year?

                Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-09-19 17:39               ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-09-19 17:47                 ` Balbir Singh
  -1 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-09-19 17:47 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 11:09 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Mon, Sep 19, 2011 at 09:40 -0700, Linus Torvalds wrote:
>> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
>> index e19ce1454ee1..5874d4866b3c 100644
>> --- a/kernel/taskstats.c
>> +++ b/kernel/taskstats.c
>> @@ -457,6 +457,8 @@ static int cmd_attr_register_cpumask(struct genl_info *info)
>>       cpumask_var_t mask;
>>       int rc;
>>
>> +     if (!capable(CAP_SYS_ADMIN))
>> +             return -EPERM;
>
> Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
> the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.
>

Yep, it should if we want to move it all to root access.

Balbir

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-19 17:47                 ` Balbir Singh
  0 siblings, 0 replies; 49+ messages in thread
From: Balbir Singh @ 2011-09-19 17:47 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Shailabh Nagar, linux-kernel, security,
	Eric Paris, Stephen Wilson, KOSAKI Motohiro, David Rientjes,
	Andrew Morton, Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 11:09 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> On Mon, Sep 19, 2011 at 09:40 -0700, Linus Torvalds wrote:
>> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
>> index e19ce1454ee1..5874d4866b3c 100644
>> --- a/kernel/taskstats.c
>> +++ b/kernel/taskstats.c
>> @@ -457,6 +457,8 @@ static int cmd_attr_register_cpumask(struct genl_info *info)
>>       cpumask_var_t mask;
>>       int rc;
>>
>> +     if (!capable(CAP_SYS_ADMIN))
>> +             return -EPERM;
>
> Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
> the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.
>

Yep, it should if we want to move it all to root access.

Balbir

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-09-19 16:40             ` [kernel-hardening] " Linus Torvalds
@ 2011-09-19 18:29               ` Andi Kleen
  -1 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2011-09-19 18:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Balbir Singh, Vasiliy Kulikov, Shailabh Nagar, linux-kernel,
	security, Eric Paris, Stephen Wilson, KOSAKI Motohiro,
	David Rientjes, Andrew Morton, Balbir Singh, kernel-hardening

Linus Torvalds <torvalds@linux-foundation.org> writes:
>  	rc = parse(info->attrs[TASKSTATS_CMD_ATTR_REGISTER_CPUMASK], mask);
> diff --git a/kernel/tsacct.c b/kernel/tsacct.c
> index 24dc60d9fa1f..110ca5a03bd6 100644
> --- a/kernel/tsacct.c
> +++ b/kernel/tsacct.c
> @@ -78,6 +78,7 @@ void bacct_add_tsk(struct taskstats *stats, struct task_struct *tsk)
>  
>  #define KB 1024

Needs to be 1024ULL, because the counters are 64bit.

>  #define MB (1024*KB)
> +#define KB_MASK (~(KB-1))

Otherwise you lose the upper 32bits here.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-19 18:29               ` Andi Kleen
  0 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2011-09-19 18:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Balbir Singh, Vasiliy Kulikov, Shailabh Nagar, linux-kernel,
	security, Eric Paris, Stephen Wilson, KOSAKI Motohiro,
	David Rientjes, Andrew Morton, Balbir Singh, kernel-hardening

Linus Torvalds <torvalds@linux-foundation.org> writes:
>  	rc = parse(info->attrs[TASKSTATS_CMD_ATTR_REGISTER_CPUMASK], mask);
> diff --git a/kernel/tsacct.c b/kernel/tsacct.c
> index 24dc60d9fa1f..110ca5a03bd6 100644
> --- a/kernel/tsacct.c
> +++ b/kernel/tsacct.c
> @@ -78,6 +78,7 @@ void bacct_add_tsk(struct taskstats *stats, struct task_struct *tsk)
>  
>  #define KB 1024

Needs to be 1024ULL, because the counters are 64bit.

>  #define MB (1024*KB)
> +#define KB_MASK (~(KB-1))

Otherwise you lose the upper 32bits here.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-09-19 18:29               ` [kernel-hardening] " Andi Kleen
@ 2011-09-19 18:32                 ` Linus Torvalds
  -1 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2011-09-19 18:32 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Balbir Singh, Vasiliy Kulikov, Shailabh Nagar, linux-kernel,
	security, Eric Paris, Stephen Wilson, KOSAKI Motohiro,
	David Rientjes, Andrew Morton, Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 11:29 AM, Andi Kleen <andi@firstfloor.org> wrote:
>>
>>  #define KB 1024
>
> Needs to be 1024ULL, because the counters are 64bit.
>
>>  #define MB (1024*KB)
>> +#define KB_MASK (~(KB-1))
>
> Otherwise you lose the upper 32bits here.

No, it's an "int" constant, so it will sign-extend properly when it
gets extended to 64 bits.

It's "unsigned int" etc that can be dangerous in these kinds of situations.

(Of course, in other contexts, sign-extensions of masks is *bad*. So
it depends on what you do)

                     Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-19 18:32                 ` Linus Torvalds
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2011-09-19 18:32 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Balbir Singh, Vasiliy Kulikov, Shailabh Nagar, linux-kernel,
	security, Eric Paris, Stephen Wilson, KOSAKI Motohiro,
	David Rientjes, Andrew Morton, Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 11:29 AM, Andi Kleen <andi@firstfloor.org> wrote:
>>
>>  #define KB 1024
>
> Needs to be 1024ULL, because the counters are 64bit.
>
>>  #define MB (1024*KB)
>> +#define KB_MASK (~(KB-1))
>
> Otherwise you lose the upper 32bits here.

No, it's an "int" constant, so it will sign-extend properly when it
gets extended to 64 bits.

It's "unsigned int" etc that can be dangerous in these kinds of situations.

(Of course, in other contexts, sign-extensions of masks is *bad*. So
it depends on what you do)

                     Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-09-19 17:45                 ` [kernel-hardening] " Linus Torvalds
@ 2011-09-20  3:35                   ` Eric W. Biederman
  -1 siblings, 0 replies; 49+ messages in thread
From: Eric W. Biederman @ 2011-09-20  3:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Balbir Singh, Shailabh Nagar, linux-kernel,
	security, Eric Paris, Stephen Wilson, KOSAKI Motohiro,
	David Rientjes, Andrew Morton, Balbir Singh, kernel-hardening

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, Sep 19, 2011 at 10:39 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>>
>> Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
>> the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.
>
> Yeah, I wondered where I'd really want to hook it in, that was the
> other option.
>
> However, one thing that I'm currently independently asking some
> networking people is whether that patch guarantees anything at all: is
> the netlink command even guaranteed to be run in the same context as
> the person sending it?

I don't know where that conversation is happening but since I have been
involved rather heavily in netlink syncrhonously processing messages I
will answer.

> After all, it comes in as a packet of data.  How synchronous is the
> genetlink thing guaranteed to be in the first place?

Yes.  The netlink skb is guaranteed to be processed synchronously
and in the senders process.  So accessing current is guaranteed
to be valid.

I just confirmed my memory by reading the cod in 3.1-rc1

> IOW, are *any* of those "check current capabilities/euid" approaches
> really guaranteed to be valid? Are they valid today, will they
> necessarily be valid in a year?

They are valid until such time as someone as someone rearchitects
netlink message processing.

Several years ago it was decided that processing netlink messages
syncrhonously so we could access current made for much simpler easier to
understand kernel code, and most if not all of the netlink permissions
checks now depend upon the fact that we process netlink messages
synchronously.

Eric

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-20  3:35                   ` Eric W. Biederman
  0 siblings, 0 replies; 49+ messages in thread
From: Eric W. Biederman @ 2011-09-20  3:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Balbir Singh, Shailabh Nagar, linux-kernel,
	security, Eric Paris, Stephen Wilson, KOSAKI Motohiro,
	David Rientjes, Andrew Morton, Balbir Singh, kernel-hardening

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, Sep 19, 2011 at 10:39 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>>
>> Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
>> the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.
>
> Yeah, I wondered where I'd really want to hook it in, that was the
> other option.
>
> However, one thing that I'm currently independently asking some
> networking people is whether that patch guarantees anything at all: is
> the netlink command even guaranteed to be run in the same context as
> the person sending it?

I don't know where that conversation is happening but since I have been
involved rather heavily in netlink syncrhonously processing messages I
will answer.

> After all, it comes in as a packet of data.  How synchronous is the
> genetlink thing guaranteed to be in the first place?

Yes.  The netlink skb is guaranteed to be processed synchronously
and in the senders process.  So accessing current is guaranteed
to be valid.

I just confirmed my memory by reading the cod in 3.1-rc1

> IOW, are *any* of those "check current capabilities/euid" approaches
> really guaranteed to be valid? Are they valid today, will they
> necessarily be valid in a year?

They are valid until such time as someone as someone rearchitects
netlink message processing.

Several years ago it was decided that processing netlink messages
syncrhonously so we could access current made for much simpler easier to
understand kernel code, and most if not all of the netlink permissions
checks now depend upon the fact that we process netlink messages
synchronously.

Eric

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Security] [PATCH 2/2] taskstats: restrict access to user
  2011-09-19 17:45                 ` [kernel-hardening] " Linus Torvalds
@ 2011-09-20  5:47                   ` Alexey Dobriyan
  -1 siblings, 0 replies; 49+ messages in thread
From: Alexey Dobriyan @ 2011-09-20  5:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Balbir Singh, Shailabh Nagar, linux-kernel,
	security, Eric Paris, Stephen Wilson, KOSAKI Motohiro,
	David Rientjes, Andrew Morton, Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 10:45:20AM -0700, Linus Torvalds wrote:
> On Mon, Sep 19, 2011 at 10:39 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> >
> > Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
> > the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.
> 
> Yeah, I wondered where I'd really want to hook it in, that was the
> other option.
> 
> However, one thing that I'm currently independently asking some
> networking people is whether that patch guarantees anything at all: is
> the netlink command even guaranteed to be run in the same context as
> the person sending it?
> 
> After all, it comes in as a packet of data.  How synchronous is the
> genetlink thing guaranteed to be in the first place?
> 
> IOW, are *any* of those "check current capabilities/euid" approaches
> really guaranteed to be valid? Are they valid today, will they
> necessarily be valid in a year?

Netlink was made syncronous by commit cd40b7d3983c708aabe3d3008ec64ffce56d33b0
"[NET]: make netlink user -> kernel interface synchronious".

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [kernel-hardening] Re: [Security] [PATCH 2/2] taskstats: restrict access to user
@ 2011-09-20  5:47                   ` Alexey Dobriyan
  0 siblings, 0 replies; 49+ messages in thread
From: Alexey Dobriyan @ 2011-09-20  5:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Balbir Singh, Shailabh Nagar, linux-kernel,
	security, Eric Paris, Stephen Wilson, KOSAKI Motohiro,
	David Rientjes, Andrew Morton, Balbir Singh, kernel-hardening

On Mon, Sep 19, 2011 at 10:45:20AM -0700, Linus Torvalds wrote:
> On Mon, Sep 19, 2011 at 10:39 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> >
> > Shouldn't it simply protect taskstats_user_cmd()?  You may still poll
> > the counters with TASKSTATS_CMD_ATTR_PID/TASKSTATS_CMD_ATTR_TGID.
> 
> Yeah, I wondered where I'd really want to hook it in, that was the
> other option.
> 
> However, one thing that I'm currently independently asking some
> networking people is whether that patch guarantees anything at all: is
> the netlink command even guaranteed to be run in the same context as
> the person sending it?
> 
> After all, it comes in as a packet of data.  How synchronous is the
> genetlink thing guaranteed to be in the first place?
> 
> IOW, are *any* of those "check current capabilities/euid" approaches
> really guaranteed to be valid? Are they valid today, will they
> necessarily be valid in a year?

Netlink was made syncronous by commit cd40b7d3983c708aabe3d3008ec64ffce56d33b0
"[NET]: make netlink user -> kernel interface synchronious".

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2011-09-20  5:47 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-24 12:09 [PATCH 2/2] taskstats: restrict access to user Vasiliy Kulikov
2011-06-29  1:27 ` Balbir Singh
2011-06-29 11:42   ` Vasiliy Kulikov
2011-06-29 20:17   ` Vasiliy Kulikov
2011-07-02  7:36     ` Vasiliy Kulikov
2011-07-02  7:36       ` [kernel-hardening] " Vasiliy Kulikov
2011-07-04  2:57       ` Balbir Singh
2011-07-04  2:57         ` [kernel-hardening] " Balbir Singh
2011-07-04 17:45         ` Vasiliy Kulikov
2011-07-04 17:45           ` [kernel-hardening] " Vasiliy Kulikov
2011-07-07  8:55           ` Vasiliy Kulikov
2011-07-07  8:55             ` [kernel-hardening] " Vasiliy Kulikov
2011-07-07 11:53             ` Balbir Singh
2011-07-07 11:53               ` [kernel-hardening] " Balbir Singh
2011-07-07 16:23               ` Vasiliy Kulikov
2011-07-07 16:23                 ` [kernel-hardening] " Vasiliy Kulikov
2011-07-09 15:36                 ` Balbir Singh
2011-07-09 15:36                   ` [kernel-hardening] " Balbir Singh
2011-07-11 14:07                   ` Vasiliy Kulikov
2011-07-11 14:07                     ` [kernel-hardening] " Vasiliy Kulikov
2011-06-29 20:09 ` [Security] " Linus Torvalds
2011-06-30  7:57   ` Vasiliy Kulikov
2011-06-30  7:57     ` [kernel-hardening] " Vasiliy Kulikov
2011-06-30 10:59     ` Balbir Singh
2011-06-30 10:59       ` [kernel-hardening] " Balbir Singh
2011-06-30 12:08       ` Vasiliy Kulikov
2011-06-30 12:08         ` [kernel-hardening] " Vasiliy Kulikov
2011-06-30 16:40       ` Linus Torvalds
2011-06-30 16:40         ` [kernel-hardening] " Linus Torvalds
2011-07-01  3:02         ` Balbir Singh
2011-07-01  3:02           ` [kernel-hardening] " Balbir Singh
2011-09-19 16:40           ` Linus Torvalds
2011-09-19 16:40             ` [kernel-hardening] " Linus Torvalds
2011-09-19 17:20             ` Balbir Singh
2011-09-19 17:20               ` [kernel-hardening] " Balbir Singh
2011-09-19 17:39             ` Vasiliy Kulikov
2011-09-19 17:39               ` [kernel-hardening] " Vasiliy Kulikov
2011-09-19 17:45               ` Linus Torvalds
2011-09-19 17:45                 ` [kernel-hardening] " Linus Torvalds
2011-09-20  3:35                 ` Eric W. Biederman
2011-09-20  3:35                   ` [kernel-hardening] " Eric W. Biederman
2011-09-20  5:47                 ` Alexey Dobriyan
2011-09-20  5:47                   ` [kernel-hardening] " Alexey Dobriyan
2011-09-19 17:47               ` Balbir Singh
2011-09-19 17:47                 ` [kernel-hardening] " Balbir Singh
2011-09-19 18:29             ` Andi Kleen
2011-09-19 18:29               ` [kernel-hardening] " Andi Kleen
2011-09-19 18:32               ` Linus Torvalds
2011-09-19 18:32                 ` [kernel-hardening] " Linus Torvalds

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.