* Re: [PATCH RFC 8/8] audit: allow user records to be created inside a container
       [not found] ` <1363619405-6419-9-git-send-email-arozansk@redhat.com>
@ 2013-03-18 21:28   ` Eric W. Biederman
  0 siblings, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2013-03-18 21:28 UTC (permalink / raw)
  To: Aristeu Rozanski; +Cc: linux-audit

Aristeu Rozanski <arozansk@redhat.com> writes:

> Since user events will be followed by namespace information, userspace
> can filter off undesired container records.

I don't think we want to allow any user to write audit records.  That is
what nsown_capable will allow, as all you would need to do is unshare
the user namespace to be able to write audit records.

Eric

> @@ -597,13 +612,13 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>  	case AUDIT_TTY_SET:
>  	case AUDIT_TRIM:
>  	case AUDIT_MAKE_EQUIV:
> -		if (!capable(CAP_AUDIT_CONTROL))
> +		if (!nsown_capable(CAP_AUDIT_CONTROL))
>  			err = -EPERM;
>  		break;
>  	case AUDIT_USER:
>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
> -		if (!capable(CAP_AUDIT_WRITE))
> +		if (!nsown_capable(CAP_AUDIT_WRITE))
>  			err = -EPERM;
>  		break;
>  	default:  /* bad msg */
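
To make the objection above concrete, here is a small userspace toy model of
the two checks. It is only a sketch: the struct layouts and the model_*
helpers are hypothetical stand-ins loosely modelled on the kernel's
capability walk, not kernel code. It assumes that a task which unshares its
user namespace owns the new namespace and holds a full capability set inside
it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: the names echo the kernel's, the types are stand-ins. */
struct user_ns {
	const struct user_ns *parent;	/* NULL for the initial namespace */
	unsigned int owner;		/* uid that created this namespace */
};

struct cred {
	unsigned int euid;
	const struct user_ns *user_ns;	/* namespace the task lives in */
	unsigned long long effective;	/* capability bits, valid in user_ns */
};

#define CAP_AUDIT_WRITE 29

static const struct user_ns init_user_ns = { NULL, 0 };

/* Does `c` hold `cap` with respect to `target`?  Walk from the target
 * namespace towards the root. */
static bool model_ns_capable(const struct cred *c,
			     const struct user_ns *target, int cap)
{
	const struct user_ns *ns;

	for (ns = target; ns; ns = ns->parent) {
		if (ns == c->user_ns)
			return (c->effective >> cap) & 1;
		/* The creator of a namespace holds every capability in it. */
		if (ns->parent == c->user_ns && ns->owner == c->euid)
			return true;
	}
	return false;
}

/* capable(): always checked against the initial namespace. */
static bool model_capable(const struct cred *c, int cap)
{
	return model_ns_capable(c, &init_user_ns, cap);
}

/* nsown_capable(): checked against the caller's own namespace. */
static bool model_nsown_capable(const struct cred *c, int cap)
{
	return model_ns_capable(c, c->user_ns, cap);
}

/* unshare(CLONE_NEWUSER): the caller owns the new namespace and gets a
 * full capability set that is only meaningful inside it. */
static struct cred model_unshare_userns(const struct cred *old,
					struct user_ns *fresh)
{
	struct cred c = *old;

	fresh->parent = old->user_ns;
	fresh->owner = old->euid;
	c.user_ns = fresh;
	c.effective = ~0ULL;
	return c;
}
```

In this model, a uid 1000 task with no capabilities fails model_capable()
both before and after model_unshare_userns(), but passes
model_nsown_capable() afterwards, which is the behaviour the reply objects
to.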

* Re: [PATCH RFC 7/8] audit: report namespace information along with USER events
       [not found] ` <1363619405-6419-8-git-send-email-arozansk@redhat.com>
@ 2013-03-18 21:44   ` Eric W. Biederman
  2013-03-19 12:08     ` Aristeu Rozanski
       [not found]     ` <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 2 replies; 20+ messages in thread
From: Eric W. Biederman @ 2013-03-18 21:44 UTC (permalink / raw)
  To: Aristeu Rozanski; +Cc: linux-audit

Aristeu Rozanski <arozansk@redhat.com> writes:

> For userspace generated events, include a record with the namespace
> procfs inode numbers the process belongs to. This allows to track down
> and filter audit messages by userspace.

I am not comfortable with using the inode numbers this way.  It does not
pass the "can I migrate a container and still have this work" test.  Any
kind of kernel assigned name for namespaces fails that test.

I also don't like that you don't include the procfs device number.  An
inode number means nothing without knowing which filesystem you are
referring to.

It may never happen but I reserve the right to have the inode numbers
for namespaces to show up differently in different instances of procfs.
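
As an illustration of the device number point, a userspace sketch that
identifies a file by the (device, inode) pair that stat(2) returns. The
struct file_id type and both helper names are hypothetical, introduced only
for this example.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* A file identity is the pair (device, inode); the inode number alone is
 * only unique within one filesystem instance. */
struct file_id {
	dev_t dev;
	ino_t ino;
};

/* Fill *id for `path`, following symlinks.  Returns 0 on success and -1
 * on error, with errno set by stat(). */
static int file_id_of(const char *path, struct file_id *id)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	id->dev = st.st_dev;
	id->ino = st.st_ino;
	return 0;
}

/* Two identities match only when both the device and the inode match. */
static bool file_id_equal(const struct file_id *a, const struct file_id *b)
{
	return a->dev == b->dev && a->ino == b->ino;
}
```

On a kernel that exposes namespace files under /proc/<pid>/ns/, calling
file_id_of() on the same entry for two processes and comparing the results
with file_id_equal() would take the procfs device into account as well as
the inode number.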

Beyond that I think this usage is possibly buggy by using two audit
records for one event.

> Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
> ---
>  include/uapi/linux/audit.h |    1 +
>  kernel/audit.c             |   51 +++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 51 insertions(+), 1 deletions(-)
>
> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> index 9f096f1..3ec3ccb 100644
> --- a/include/uapi/linux/audit.h
> +++ b/include/uapi/linux/audit.h
> @@ -106,6 +106,7 @@
>  #define AUDIT_NETFILTER_PKT	1324	/* Packets traversing netfilter chains */
>  #define AUDIT_NETFILTER_CFG	1325	/* Netfilter chain modifications */
>  #define AUDIT_SECCOMP		1326	/* Secure Computing event */
> +#define AUDIT_USER_NAMESPACE	1327	/* Information about process' namespaces */
>  
>  #define AUDIT_AVC		1400	/* SE Linux avc denial or grant */
>  #define AUDIT_SELINUX_ERR	1401	/* Internal SE Linux Errors */
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 58db117..b17f9c0 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -62,6 +62,11 @@
>  #include <linux/freezer.h>
>  #include <linux/tty.h>
>  #include <linux/pid_namespace.h>
> +#include <linux/ipc_namespace.h>
> +#include <linux/mnt_namespace.h>
> +#include <linux/utsname.h>
> +#include <linux/user_namespace.h>
> +#include <net/net_namespace.h>
>  
>  #include "audit.h"
>  
> @@ -641,6 +646,49 @@ static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type,
>  	return rc;
>  }
>  
> +#ifdef CONFIG_NAMESPACES
> +static int audit_log_namespaces(struct task_struct *tsk,
> +				struct sk_buff *skb)
> +{
> +	struct audit_context *ctx = tsk->audit_context;
> +	struct audit_buffer *ab;
> +
> +	if (!audit_enabled)
> +		return 0;
> +
> +	ab = audit_log_start(ctx, GFP_KERNEL, AUDIT_USER_NAMESPACE);
> +	if (unlikely(!ab))
> +		return -ENOMEM;
> +
> +	audit_log_format(ab, "mnt=%u", mntns_get_inum(tsk));
> +#ifdef CONFIG_NET_NS
> +	audit_log_format(ab, " net=%u", netns_get_inum(tsk));
> +#endif
> +#ifdef CONFIG_UTS_NS
> +	audit_log_format(ab, " uts=%u", utsns_get_inum(tsk));
> +#endif
> +#ifdef CONFIG_IPC_NS
> +	audit_log_format(ab, " ipc=%u", ipcns_get_inum(tsk));
> +#endif
> +#ifdef CONFIG_PID_NS
> +	audit_log_format(ab, " pid=%u", pidns_get_inum(tsk));
> +#endif
> +#ifdef CONFIG_USER_NS
> +	audit_log_format(ab, " user=%u", userns_get_inum(tsk));
> +#endif  
> +	audit_set_pid(ab, NETLINK_CB(skb).portid);
> +	audit_log_end(ab);
> +
> +	return 0;
> +}
> +#else
> +static inline int audit_log_namespaces(struct task_struct *tsk,
> +				       struct sk_buff *skb)
> +{
> +	return 0;
> +}
> +#endif
> +
>  static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>  {
>  	u32			seq, sid;
> @@ -741,7 +789,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>  			}
>  			audit_log_common_recv_msg(&ab, msg_type,
>  						  loginuid, sessionid, sid,
> -						  NULL);
> +						  current->audit_context);
>  
>  			if (msg_type != AUDIT_USER_TTY)
>  				audit_log_format(ab, " msg='%.1024s'",
> @@ -758,6 +806,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>  			}
>  			audit_set_pid(ab, NETLINK_CB(skb).portid);
>  			audit_log_end(ab);
> +			audit_log_namespaces(current, skb);
>  		}
>  		break;
>  	case AUDIT_ADD:

* Re: [PATCH RFC] audit: provide namespace information in user originated records
       [not found] ` <1363619405-6419-1-git-send-email-arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-18 22:16   ` Eric W. Biederman
       [not found]     ` <877gl48iaz.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2013-03-18 22:16 UTC (permalink / raw)
  To: Aristeu Rozanski
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris


Adding the containers list so folks with container expertise can see
what is being proposed.

Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> This patchset introduces a new audit record to follow all USER records which
> provides namespace information of the process. The idea is to allow processes
> in containers to create records in the host system while providing means to be
> filtered out.

It looks like this mechanism makes it easy for an unprivileged program
to spam and overwhelm the audit log.

> For each new namespace, a unique procfs inode number is allocated and this
> number has been used by userspace to determine which processes belong to the
> same namespace. These numbers are used in the new audit record.
>
> Applications such as libvirt-sandbox and lxc can then report the same numbers
> when a container is created and destroyed allowing to map records to a certain
> container. Maybe the next step would be having a record for whenever a new
> namespace is created?
>
> First 6 patches are needed in order to get each namespace's inode number.

Grumble the existing methods can be used you don't have to introduce a
whole new set of methods.  Grumble.  Besides the bug of assuming that
the inodes now and forever will be the same across all instances of
proc.

> Patch 7 properly defines the new record that is related to the USER
> record

Not augmenting the current user records seems a little odd to me.

You also continue my current policy of not allowing any audit
records in the container itself, so I don't quite know what the point
of all of this is.

> Patch 8 allows USER records to be generated from different namespaces

Which essentially allows any user to create any USER record they want
whenever they want.

> Here's an example of output:
> type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'

Ok.  This seems totally bizarre.  You are running a container with a
user namespace with some uid mapped to uid 0?

That defeats about half the point of having user namespaces, as half the
files in the world are owned by uid 0, and can be written by uid 0
outside of your user namespace.

Hmm.  I need to look at this in a little more detail but I believe our
use of task_pid_vnr here in the audit record is a long standing bug.

> type=UNKNOWN[1327] msg=audit(1363528861.403:311): mnt=4026531840 net=4026531956 uts=4026531838 ipc=4026531839 pid=4026531836 user=4026531837
>
> Notes:
> - this is a RFC, all sorts of feedback are much appreciated
> - while the last patch allows a new userns to send audit records, I haven't
>   looked yet at making sure it has proper capabilities so regular users'
>   containers can create records

I don't think it does.

> - the record number allocated is just a draft. If this patchset evolves into
>   something that can be merged, please advise which number is the best
>   choice
> - I'm not subscribed to the list, so please make sure I'm on the Cc list
>
>  fs/namespace.c                 |   14 +++++++
>  include/linux/ipc_namespace.h  |    1
>  include/linux/mnt_namespace.h  |    2 +
>  include/linux/pid_namespace.h  |    1
>  include/linux/user_namespace.h |    1
>  include/linux/utsname.h        |    1
>  include/net/net_namespace.h    |    1
>  include/uapi/linux/audit.h     |    1
>  ipc/namespace.c                |   14 +++++++
>  kernel/audit.c                 |   76 +++++++++++++++++++++++++++++++++++++----
>  kernel/pid_namespace.c         |   11 +++++
>  kernel/user_namespace.c        |    5 ++
>  kernel/utsname.c               |   14 +++++++
>  net/core/net_namespace.c       |   14 +++++++
>  14 files changed, 150 insertions(+), 6 deletions(-)

* Re: [PATCH RFC 7/8] audit: report namespace information along with USER events
  2013-03-18 21:44   ` [PATCH RFC 7/8] audit: report namespace information along with USER events Eric W. Biederman
@ 2013-03-19 12:08     ` Aristeu Rozanski
       [not found]     ` <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  1 sibling, 0 replies; 20+ messages in thread
From: Aristeu Rozanski @ 2013-03-19 12:08 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-audit

On Mon, Mar 18, 2013 at 02:44:33PM -0700, Eric W. Biederman wrote:
> Aristeu Rozanski <arozansk@redhat.com> writes:
> 
> > For userspace generated events, include a record with the namespace
> > procfs inode numbers the process belongs to. This allows to track down
> > and filter audit messages by userspace.
> 
> I am not comfortable with using the inode numbers this way.  It does not
> pass the "can I migrate a container and still have this work" test.  Any
> kind of kernel assigned name for namespaces fails that test.
> 
> I also don't like that you don't include the procfs device number.  An
> inode number means nothing without knowing which filesystem you are
> referring to.
>
> It may never happen but I reserve the right to have the inode numbers
> for namespaces to show up differently in different instances of procfs.

well, in this case the whole idea is invalid. there's no way to reliably
identify which namespaces a process belongs to for logging purposes.

> Beyond that I think this usage is possibly buggy by using two audit
> records for one event.

this is valid, the records are related and they show up with the same
timestamp.
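
A sketch of how userspace could tie the two records together, assuming the
"msg=audit(sec.msec:serial)" stamp format seen in the example output of the
cover letter. The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct audit_stamp {
	unsigned long sec;
	unsigned int msec;
	unsigned long serial;
};

/* Parse the "msg=audit(sec.msec:serial)" stamp out of a record line. */
static bool audit_stamp_parse(const char *line, struct audit_stamp *s)
{
	const char *p = strstr(line, "msg=audit(");

	if (!p)
		return false;
	return sscanf(p, "msg=audit(%lu.%u:%lu)",
		      &s->sec, &s->msec, &s->serial) == 3;
}

/* Two records belong to the same event when their stamps are identical. */
static bool audit_same_event(const char *a, const char *b)
{
	struct audit_stamp sa, sb;

	if (!audit_stamp_parse(a, &sa) || !audit_stamp_parse(b, &sb))
		return false;
	return sa.sec == sb.sec && sa.msec == sb.msec &&
	       sa.serial == sb.serial;
}
```

With the CRED_DISP and UNKNOWN[1327] lines from the cover letter, both of
which carry 1363528861.403:311, audit_same_event() reports a match.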

-- 
Aristeu

* Re: [PATCH RFC] audit: provide namespace information in user originated records
       [not found]     ` <877gl48iaz.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2013-03-19 12:24       ` Aristeu Rozanski
       [not found]         ` <20130319122408.GC20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Aristeu Rozanski @ 2013-03-19 12:24 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

On Mon, Mar 18, 2013 at 03:16:52PM -0700, Eric W. Biederman wrote:
> Adding the containers list so folks with container expertise can see
> what is being proposed.
> 
> Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> 
> > This patchset introduces a new audit record to follow all USER records which
> > provides namespace information of the process. The idea is to allow processes
> > in containers to create records in the host system while providing means to be
> > filtered out.
> 
> It looks like this mechanism makes it easy for an unprivileged program
> to spam and overwhelm the audit log.
> 
> > For each new namespace, a unique procfs inode number is allocated and this
> > number has been used by userspace to determine which processes belong to the
> > same namespace. These numbers are used in the new audit record.
> >
> > Applications such as libvirt-sandbox and lxc can then report the same numbers
> > when a container is created and destroyed allowing to map records to a certain
> > container. Maybe the next step would be having a record for whenever a new
> > namespace is created?
> >
> > First 6 patches are needed in order to get each namespace's inode number.
> 
> Grumble the existing methods can be used you don't have to introduce a
> whole new set of methods.  Grumble.  Besides the bug of assuming that
> the inodes now and forever will be the same across all instances of
> proc.

the existing methods are for procfs use and I didn't want to abuse them.
like I said in the other email, since it's not a reliable way to
indefinitely describe a namespace due to multiple procfs instances or
migration, the whole idea is flawed.

> > Patch 7 properly defines the new record that is related to the USER
> > record
> 
> Not augmenting the current user records seems a little odd to me.
> 
> You also continue my current policy of not allowing any audit
> records in the container itself, so I don't quite know what the point
> of all of this is.

your current policy wasn't known to me and
	/* Only support the initial namespaces for now. */
sounds like something that didn't happen for other reasons

> > Patch 8 allows USER records to be generated from different namespaces
> 
> Which essentially allows any user to create any USER record they want
> whenever they want.
> 
> > Here's an example of output:
> > type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
> 
> Ok.  This seems totally bizarre.  You are running a container with a
> user namespace with some uid mapped to uid 0?

on the notes section:
	- while the last patch allows a new userns to send audit records, I haven't
	  looked yet at making sure it has proper capabilities so regular users'
	  containers can create records

so I haven't tried it with userns. It's an RFC. That's a regular record
to show the related records, using initial namespaces. like I stated in
the email, I wasn't sure how I'd handle capabilities but the idea would be
to allow containers to log to the system's auditd. since inode numbers
aren't reliable for more than a moment, I guess there's no other
way than having an audit namespace and running an audit daemon inside the
container (and communicating over the network like an individual host).

-- 
Aristeu

* Re: [PATCH RFC] audit: provide namespace information in user originated records
       [not found]         ` <20130319122408.GC20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-20  0:00           ` Eric W. Biederman
       [not found]             ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2013-03-20  0:00 UTC (permalink / raw)
  To: Aristeu Rozanski
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> On Mon, Mar 18, 2013 at 03:16:52PM -0700, Eric W. Biederman wrote:
>> Adding the containers list so folks with container expertise can see
>> what is being proposed.
>> 
>> Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
>> 
>> > This patchset introduces a new audit record to follow all USER records which
>> > provides namespace information of the process. The idea is to allow processes
>> > in containers to create records in the host system while providing means to be
>> > filtered out.
>> 
>> It looks like this mechanism makes it easy for an unprivileged program
>> to spam and overwhelm the audit log.
>> 
>> > For each new namespace, a unique procfs inode number is allocated and this
>> > number has been used by userspace to determine which processes belong to the
>> > same namespace. These numbers are used in the new audit record.
>> >
>> > Applications such as libvirt-sandbox and lxc can then report the same numbers
>> > when a container is created and destroyed allowing to map records to a certain
>> > container. Maybe the next step would be having a record for whenever a new
>> > namespace is created?
>> >
>> > First 6 patches are needed in order to get each namespace's inode number.
>> 
>> Grumble the existing methods can be used you don't have to introduce a
>> whole new set of methods.  Grumble.  Besides the bug of assuming that
>> the inodes now and forever will be the same across all instances of
>> proc.
>
> the existing methods are for procfs use and I didn't want to abuse them.
> like I said in the other email, since it's not a reliable way to
> indefinitely describe a namespace due to multiple procfs instances or
> migration, the whole idea is flawed.

It is always possible to pick the instance of /proc connected to the
initial pid namespace.  And there is a device number you can use to say
that.

Usually designs that need global identifiers for namespaces suffer from
the need for a namespace of namespaces (which we sort of have in /proc),
and I push back by default to get people to think if what they are
trying to do really makes sense.

>> > Patch 7 properly defines the new record that is related to the USER
>> > record
>> 
>> Not augmenting the current user records seems a little odd to me.
>> 
>> You also continue my current policy of not allowing any audit
>> records in the container itself, so I don't quite know what the point
>> of all of this is.
>
> your current policy wasn't known to me and
> 	/* Only support the initial namespaces for now. */
> sounds like something that didn't happen for other reasons

The reasons were simply that to my knowledge no one has thought through
how it makes sense for audit records and namespaces to interact.

My expectation would be that an extension of audit records would be
logged on a per container basis.  But I don't have any motivating
examples.

>> > Patch 8 allows USER records to be generated from different namespaces
>> 
>> Which essentially allows any user to create any USER record they want
>> whenever they want.
>> 
>> > Here's an example of output:
>> > type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
>> 
>> Ok.  This seems totally bizarre.  You are running a container with a
>> user namespace with some uid mapped to uid 0?
>
> on the notes section:
> 	- while the last patch allows a new userns to send audit records, I haven't
> 	  looked yet at making sure it has proper capabilities so regular users'
> 	  containers can create records
>
> so I haven't tried it with userns. It's an RFC.

I thought you would have taken the time to run it at least once, or to
perhaps have manually edited your example to see how things would fit
together.

> That's a regular record
> to show the related records, using initial namespaces. like I stated in
> the email, I wasn't sure how I'd handle capabilities but the idea would be
> to allow containers to log to the system's auditd. since inode numbers
> aren't reliable for more than a moment, I guess there's no other
> way than having an audit namespace and running an audit daemon inside the
> container (and communicating over the network like an individual host).

What was really missing from your RFC is a motivating example.  I sort
of see that in your paragraph above but it isn't clear to me.

What is lost by not allowing USER audit records from processes in
containers?  What is gained by implementing user process to have them?
And of course what are your thoughts on preventing unprivileged users
overwhelming the audit subsystem.

My minimal experience with the audit subsystem roughly feels like hardly
anyone really cares.  Although I may be wrong.

Eric

* Re: [PATCH RFC] audit: provide namespace information in user originated records
       [not found]             ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2013-03-20 15:12               ` Serge Hallyn
  2013-03-20 15:45               ` Aristeu Rozanski
  1 sibling, 0 replies; 20+ messages in thread
From: Serge Hallyn @ 2013-03-20 15:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> The reasons were simply that to my knowledge no one has thought through
> how it makes sense for audit records and namespaces to interact.

It seems clear to me (perhaps wrongly :) that:

  1. auditd is a host service only.
  2. in cases where the namespace is hierarchical and resources have
     identifiers in the init namespace (i.e. pid and user ns), audit
     should simply, always, report the id in the init ns
  3. in cases where namespaces are not hierarchical (ipc, netns)
     the (ns_id, resource_id) need to be dumped.  The ns_id should
     be the inode # for the /proc/$$/ns/$namespace, since that is
     what is used for setns.

Syslog I want eventually to be namespaced.  Audit, not.

Audit is (ISTM) about LSPP and such - things which we can't talk
about in containers anyway.

-serge

* Re: [PATCH RFC] audit: provide namespace information in user originated records
       [not found]             ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  2013-03-20 15:12               ` Serge Hallyn
@ 2013-03-20 15:45               ` Aristeu Rozanski
       [not found]                 ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: Aristeu Rozanski @ 2013-03-20 15:45 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

On Tue, Mar 19, 2013 at 05:00:50PM -0700, Eric W. Biederman wrote:
> It is always possible to pick the instance of /proc connected to the
> initial pid namespace.  And there is a device number you can use to say
> that.

I wasn't aware of that, I'll take a look, thanks!

> The reasons were simply that to my knowledge no one has thought through
> how it makes sense for audit records and namespaces to interact.
> 
> My expectation would be that an extension of audit records would be
> logged on a per container basis.  But I don't have any motivating
> examples.

from what I've heard, there are two possibilities here: if a container is
understood to be "light virtualization", it should behave just like
another machine by having its own auditd daemon, sending records over
the network to the host. If that's not the case, a single auditd must be
present. But since you might want to run an sshd server inside a
container, it might be desirable to have USER_AUTH records, for example.
 
> I thought you would have taken the time to run it at least once, or to
> perhaps have manually edited your example to see how things would fit
> together.

I did run it with different namespaces but not with userns. The example
was to show what the extra record would look like and I randomly picked
one. The idea is that auditd will know which namespaces are the original
ones and can use that to filter containers' records, which could be
filtered out by default.
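
A rough userspace sketch of that filtering idea, assuming the "key=number"
field layout of the example namespace record in the cover letter. The
function names are hypothetical, and the concerns raised earlier in the
thread about inode numbers as identifiers still apply.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Look up "key=<unsigned decimal>" in a space-separated record and store
 * the value in *out.  Returns false if the field is missing or invalid. */
static bool record_get_u32(const char *record, const char *key,
			   unsigned int *out)
{
	size_t klen = strlen(key);
	const char *p = record;

	while ((p = strstr(p, key)) != NULL) {
		bool at_start = (p == record) || (p[-1] == ' ');

		if (at_start && p[klen] == '=') {
			char *end;
			unsigned long v = strtoul(p + klen + 1, &end, 10);

			if (end == p + klen + 1 || v > 0xffffffffUL)
				return false;
			*out = (unsigned int)v;
			return true;
		}
		p += klen;
	}
	return false;
}

/* True when the record names a namespace of kind `key` whose inode number
 * differs from the host's, i.e. the sender is outside that host namespace. */
static bool record_from_other_ns(const char *record, const char *key,
				 unsigned int host_inum)
{
	unsigned int inum;

	if (!record_get_u32(record, key, &inum))
		return false;
	return inum != host_inum;
}
```

A daemon could record the inode numbers of its own namespaces at startup and
drop, by default, any USER event whose companion record makes
record_from_other_ns() return true.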

> What was really missing from your RFC is a motivating example.  I sort
> of see that in your paragraph above but it isn't clear to me.
> 
> What is lost by not allowing USER audit records from processes in
> containers?  What is gained by implementing user process to have them?
> And of course what are your thoughts on preventing unprivileged users
> overwhelming the audit subsystem.

This is a bit fuzzy to me, perhaps because I don't fully understand the
userns implementation yet, so bear with me:
I thought of changing it so that userns would not grant CAP_AUDIT_WRITE and
CAP_AUDIT_CONTROL unless the process already has them (i.e. it'd require
being root in the init_ns). The 'init' process would start trusted
daemons with those capabilities then drop the capabilities for
everything else.
Does that make sense?

-- 
Aristeu

* Re: [PATCH RFC] audit: provide namespace information in user originated records
       [not found]                 ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-20 18:36                   ` Serge Hallyn
  2013-03-20 18:42                     ` Eric Paris
  0 siblings, 1 reply; 20+ messages in thread
From: Serge Hallyn @ 2013-03-20 18:36 UTC (permalink / raw)
  To: Aristeu Rozanski
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA,
	Eric W. Biederman, Eric Paris

Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> This is a bit fuzzy to me, perhaps because I don't fully understand the
> userns implementation yet, so bear with me:
> I thought of changing it so that userns would not grant CAP_AUDIT_WRITE and
> CAP_AUDIT_CONTROL unless the process already has them (i.e. it'd require

Seems like CAP_AUDIT_WRITE should be targeted against the
skb->netns->userns.  Then CAP_AUDIT_WRITE can be treated like any other
capability.  Last I knew (long time ago) you had to be in init_user_ns
to talk audit, but that's ok - this would just do the right thing in
any case.

-serge

* Re: [PATCH RFC] audit: provide namespace information in user originated records
  2013-03-20 18:36                   ` Serge Hallyn
@ 2013-03-20 18:42                     ` Eric Paris
  2013-03-20 18:49                       ` Serge Hallyn
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Paris @ 2013-03-20 18:42 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > This is a bit fuzzy to me, perhaps because I don't fully understand the
> > userns implementation yet, so bear with me:
> > I thought of changing it so that userns would not grant CAP_AUDIT_WRITE and
> > CAP_AUDIT_CONTROL unless the process already has them (i.e. it'd require
> 
> Seems like CAP_AUDIT_WRITE should be targeted against the
> skb->netns->userns.  Then CAP_AUDIT_WRITE can be treated like any other
> capability.  Last I knew (long time ago) you had to be in init_user_ns
> to talk audit, but that's ok - this would just do the right thing in
> any case.

kauditd should be considered as existing in the init user namespace.  So
I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
init user namespace and if so, allow it to send messages.  Who cares what
*ns the process exists in.  If it has it in the init namespace, go
ahead.  Thus the process that created the container would need
CAP_AUDIT_WRITE in the init namespace for this to all work, right?

/me also gets so confused about what caps mean in the userns world.
(/me has larger issues with the ns concept as a whole, but that boat
sailed years and years ago)

* Re: [PATCH RFC] audit: provide namespace information in user originated records
  2013-03-20 18:42                     ` Eric Paris
@ 2013-03-20 18:49                       ` Serge Hallyn
  2013-03-20 19:01                         ` Eric Paris
  0 siblings, 1 reply; 20+ messages in thread
From: Serge Hallyn @ 2013-03-20 18:49 UTC (permalink / raw)
  To: Eric Paris
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > This is a bit fuzzy to me, perhaps because I don't fully understand the
> > > userns implementation yet, so bear with me:
> > > I thought of changing it so that userns would not grant CAP_AUDIT_WRITE and
> > > CAP_AUDIT_CONTROL unless the process already has them (i.e. it'd require
> > 
> > Seems like CAP_AUDIT_WRITE should be targeted against the
> > skb->netns->userns.  Then CAP_AUDIT_WRITE can be treated like any other
> > capability.  Last I knew (long time ago) you had to be in init_user_ns
> > to talk audit, but that's ok - this would just do the right thing in
> > any case.
> 
> kauditd should be considered as existing in the init user namespace.  So
> I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
> init user namespace and if so, allow it to send messages.  Who cares what
> *ns the process exists in.  If it has it in the init namespace, go
> ahead.  Thus the process that created the container would need
> CAP_AUDIT_WRITE in the init namespace for this to all work, right?

Yes.  What I was suggesting is intended to work if that situation ever
changes.  But I have zero complaints about doing it as you say, as I
doubt it ever will/ought to change.

That basically means CAP_AUDIT_WRITE would be worthless in a non-init
userns.  That's fine - at least the rules would be consistent.

> /me also gets so confused about what caps mean in the userns world.

If the resource in question (like a network interface) belongs to a
namespace (netns) created by the userns in which the caller has the caps
in question, then privilege is granted.

Otherwise, not.

What you're saying above about CAP_AUDIT_WRITE is exactly right (for
how audit works right now).

> (/me has larger issues with the ns concept as a whole, but that boat
> sailed years and years ago)

-serge

* Re: [PATCH RFC] audit: provide namespace information in user originated records
  2013-03-20 18:49                       ` Serge Hallyn
@ 2013-03-20 19:01                         ` Eric Paris
  2013-03-20 19:17                           ` Aristeu Rozanski
                                             ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Eric Paris @ 2013-03-20 19:01 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote:
> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding
> > > > userns implementation yet, so bear with me:
> > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
> > > 
> > > Seems like CAP_AUDIT_WRITE should be targeted against the
> > > skb->netns->userns.  Then CAP_AUDIT_WRITE can be treated like any other
> > > capability.  Last I knew (long time ago) you had to be in init_user_ns
> > > to talk audit, but that's ok - this would just do the right thing in
> > > any case.
> > 
> > kauditd should be considered as existing in the init user namespace.  So
> > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
> > init user namespace and if so, allow it to send messages.  Who care what
> > *ns the process exists in.  If it has it in the init namespace, go
> > ahead.  Thus the process that created the container would need
> > CAP_AUDIT_WRITE in the init namespace for this to all work, right?
> 
> Yes.  What I was suggesting is intended to work if that situation ever
> changes.  But I have zero complaints about doing it as you say, as I
> doubt it ever will/ought to change.
> 
> That basically means CAP_AUDIT_WRITE would be worthless in a non-init
> userns.  That's fine - at least the rules would be consistent.

[veering away from this particular patch]

We are also talking about adding a CAP_AUDIT_READ and sending messages
via multicast on the audit socket.  The problem is I don't know how the
audit socket could work in the network namespace world.  Right now
kauditd has:

audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);

So there won't ever be anything on the kernel side of the audit socket
in a non-init network namespace.  Let's say that is fixed somehow (I
assume it's possible?  something? magic pixies?) I think we'd somehow
need to do the CAP_AUDIT_READ check against the user namespace
associated with the network namespace in question?  But what messages
should go to this userspace auditd?

Going to have to have audit namespaces too.  But only CAP_AUDIT_READ
would make sense in the new audit namespace...

/me wishes containers were a 'thing' instead of a bucket of semi-related
nuts and bolts.

-Eric


* Re: [PATCH RFC] audit: provide namespace information in user originated records
  2013-03-20 19:01                         ` Eric Paris
@ 2013-03-20 19:17                           ` Aristeu Rozanski
  2013-03-20 19:19                           ` Serge Hallyn
  2013-03-20 23:23                           ` Eric W. Biederman
  2 siblings, 0 replies; 20+ messages in thread
From: Aristeu Rozanski @ 2013-03-20 19:17 UTC (permalink / raw)
  To: Eric Paris
  Cc: Linux Containers, Serge Hallyn, Eric W. Biederman,
	linux-audit-H+wXaHxf7aLQT0dZR+AlfA

On Wed, Mar 20, 2013 at 03:01:32PM -0400, Eric Paris wrote:
> [veering away from this particular patch]
> 
> We are also talking about adding a CAP_AUDIT_READ and sending messages
> via multicast on the audit socket.  The problem is I don't know how the
> audit socket could work in the network namespace world.  Right now
> kauditd has:
> 
> audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
> 
> So there won't ever be anything on the kernel side of the audit socket
> in a non-init network namespace.  Lets say that is fixed somehow (I
> assume it's possible?  something? magic pixies?) I think we'd somehow
> need to do the CAP_AUDIT_READ check against the user namespace
> associated with the network namespace in question?  But what messages
> should go to this userspace auditd?
> 
> Going to have to have audit namespaces to.  But only CAP_AUDIT_READ
> would make sense in the new audit namespace...

I guess that could be achieved by forcing the creation of a new network
namespace whenever a new audit namespace is created. Any new network
namespace created inside this new container would lose CAP_AUDIT_*.

-- 
Aristeu


* Re: [PATCH RFC] audit: provide namespace information in user originated records
  2013-03-20 19:01                         ` Eric Paris
  2013-03-20 19:17                           ` Aristeu Rozanski
@ 2013-03-20 19:19                           ` Serge Hallyn
  2013-03-20 23:23                           ` Eric W. Biederman
  2 siblings, 0 replies; 20+ messages in thread
From: Serge Hallyn @ 2013-03-20 19:19 UTC (permalink / raw)
  To: Eric Paris
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote:
> > Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> > > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding
> > > > > userns implementation yet, so bear with me:
> > > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> > > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
> > > > 
> > > > Seems like CAP_AUDIT_WRITE should be targeted against the
> > > > skb->netns->userns.  Then CAP_AUDIT_WRITE can be treated like any other
> > > > capability.  Last I knew (long time ago) you had to be in init_user_ns
> > > > to talk audit, but that's ok - this would just do the right thing in
> > > > any case.
> > > 
> > > kauditd should be considered as existing in the init user namespace.  So
> > > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
> > > init user namespace and if so, allow it to send messages.  Who care what
> > > *ns the process exists in.  If it has it in the init namespace, go
> > > ahead.  Thus the process that created the container would need
> > > CAP_AUDIT_WRITE in the init namespace for this to all work, right?
> > 
> > Yes.  What I was suggesting is intended to work if that situation ever
> > changes.  But I have zero complaints about doing it as you say, as I
> > doubt it ever will/ought to change.
> > 
> > That basically means CAP_AUDIT_WRITE would be worthless in a non-init
> > userns.  That's fine - at least the rules would be consistent.
> 
> [veering away from this particular patch]
> 
> We are also talking about adding a CAP_AUDIT_READ and sending messages
> via multicast on the audit socket.  The problem is I don't know how the
> audit socket could work in the network namespace world.  Right now
> kauditd has:
> 
> audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
> 
> So there won't ever be anything on the kernel side of the audit socket
> in a non-init network namespace.

Right.

> Lets say that is fixed somehow (I
> assume it's possible?  something? magic pixies?) I think we'd somehow
> need to do the CAP_AUDIT_READ check against the user namespace
> associated with the network namespace in question?  But what messages
> should go to this userspace auditd?

Ones which pertain to resources in that userns.  If we ever were to
sprinkle that pixie dust, then we'd know how to do this as well :)

> Going to have to have audit namespaces to.  But only CAP_AUDIT_READ
> would make sense in the new audit namespace...

It's not clear to me that an audit namespace is needed.  The userns
'owns' other namespaces, so it seems like it should suffice for
directing audit msgs.

> /me wishes containers were a 'thing' instead of a bucket of semi-related
> nuts and bolts.

That sure would simplify things.  However, there definitely are heavy
users of individual namespaces, e.g. setups using thousands of network
namespaces but no other namespaces.

-serge


* Re: [PATCH RFC] audit: provide namespace information in user originated records
  2013-03-20 19:01                         ` Eric Paris
  2013-03-20 19:17                           ` Aristeu Rozanski
  2013-03-20 19:19                           ` Serge Hallyn
@ 2013-03-20 23:23                           ` Eric W. Biederman
       [not found]                             ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  2 siblings, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2013-03-20 23:23 UTC (permalink / raw)
  To: Eric Paris
  Cc: Linux Containers, Serge Hallyn, linux-audit-H+wXaHxf7aLQT0dZR+AlfA

Eric Paris <eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote:
>> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
>> > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
>> > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
>> > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding
>> > > > userns implementation yet, so bear with me:
>> > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
>> > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
>> > > 
>> > > Seems like CAP_AUDIT_WRITE should be targeted against the
>> > > skb->netns->userns.  Then CAP_AUDIT_WRITE can be treated like any other
>> > > capability.  Last I knew (long time ago) you had to be in init_user_ns
>> > > to talk audit, but that's ok - this would just do the right thing in
>> > > any case.
>> > 
>> > kauditd should be considered as existing in the init user namespace.  So
>> > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
>> > init user namespace and if so, allow it to send messages.  Who care what
>> > *ns the process exists in.  If it has it in the init namespace, go
>> > ahead.  Thus the process that created the container would need
>> > CAP_AUDIT_WRITE in the init namespace for this to all work, right?
>> 
>> Yes.  What I was suggesting is intended to work if that situation ever
>> changes.  But I have zero complaints about doing it as you say, as I
>> doubt it ever will/ought to change.
>> 
>> That basically means CAP_AUDIT_WRITE would be worthless in a non-init
>> userns.  That's fine - at least the rules would be consistent.
>
> [veering away from this particular patch]
>
> We are also talking about adding a CAP_AUDIT_READ and sending messages
> via multicast on the audit socket.  The problem is I don't know how the
> audit socket could work in the network namespace world.

Hmm.  I don't quite know how CAP_AUDIT_READ could work.  When delivering
a message to a socket you really don't know who is on the other end.

> Right now kauditd has:
>
> audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
>
> So there won't ever be anything on the kernel side of the audit socket
> in a non-init network namespace.  Lets say that is fixed somehow (I
> assume it's possible?  something? magic pixies?)

One socket for each network namespace...  It is a pain but doable.

> I think we'd somehow
> need to do the CAP_AUDIT_READ check against the user namespace
> associated with the network namespace in question?  But what messages
> should go to this userspace auditd?

Messages generated by processes in that user namespace?  

> Going to have to have audit namespaces to.  But only CAP_AUDIT_READ
> would make sense in the new audit namespace...

Given the connection of audit and security I think if we add support for
a non-global auditd the user namespace seems to fit.  The user namespace
is certainly where all of the security connected bits go.

Architecturally it gets a little tricky as it seems to make sense to
generate audit messages that make sense to the process receiving them,
which would mean actually generating a different audit message for
different receiving contexts.

I find the auditsc code odd.  We log file descriptor numbers when a file
is mmapped?  What good is something so process-relative to anyone?

On a slightly different tangent: do we want to update the AUDIT_CAPSET
message to report the user namespace whose caps we are changing, or
perhaps to suppress the message outside of the initial user namespace?

Eric


* Re: [PATCH RFC] audit: provide namespace information in user originated records
       [not found]                             ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2013-03-21  1:46                               ` Eric Paris
  2013-03-21  2:21                                 ` Serge Hallyn
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Paris @ 2013-03-21  1:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux Containers, Serge Hallyn, linux-audit-H+wXaHxf7aLQT0dZR+AlfA

On Wed, 2013-03-20 at 16:23 -0700, Eric W. Biederman wrote:
> Eric Paris <eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> 
> > On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote:
> >> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> >> > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> >> > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> >> > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding
> >> > > > userns implementation yet, so bear with me:
> >> > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> >> > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
> >> > > 
> >> > > Seems like CAP_AUDIT_WRITE should be targeted against the
> >> > > skb->netns->userns.  Then CAP_AUDIT_WRITE can be treated like any other
> >> > > capability.  Last I knew (long time ago) you had to be in init_user_ns
> >> > > to talk audit, but that's ok - this would just do the right thing in
> >> > > any case.
> >> > 
> >> > kauditd should be considered as existing in the init user namespace.  So
> >> > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
> >> > init user namespace and if so, allow it to send messages.  Who care what
> >> > *ns the process exists in.  If it has it in the init namespace, go
> >> > ahead.  Thus the process that created the container would need
> >> > CAP_AUDIT_WRITE in the init namespace for this to all work, right?
> >> 
> >> Yes.  What I was suggesting is intended to work if that situation ever
> >> changes.  But I have zero complaints about doing it as you say, as I
> >> doubt it ever will/ought to change.
> >> 
> >> That basically means CAP_AUDIT_WRITE would be worthless in a non-init
> >> userns.  That's fine - at least the rules would be consistent.
> >
> > [veering away from this particular patch]
> >
> > We are also talking about adding a CAP_AUDIT_READ and sending messages
> > via multicast on the audit socket.  The problem is I don't know how the
> > audit socket could work in the network namespace world.
> 
> Hmm.  I don't quite know how CAP_AUDIT_READ could work.  When delivering
> a message to a socket you really don't know who is on the other end.
> 
> > Right now kauditd has:
> >
> > audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
> >
> > So there won't ever be anything on the kernel side of the audit socket
> > in a non-init network namespace.  Lets say that is fixed somehow (I
> > assume it's possible?  something? magic pixies?)
> 
> One socket for each network namespace...  It is a pain but doable.
> 
> > I think we'd somehow
> > need to do the CAP_AUDIT_READ check against the user namespace
> > associated with the network namespace in question?  But what messages
> > should go to this userspace auditd?
> 
> Messages generated by processes in that user namespace?

So the kernel socket(s) would be per network namespace, but we divide
messages per user namespace?  Which socket do I send them on,
considering the possible crazy many<->many mappings between user and
network namespaces.  It all makes me cry a little.

> > Going to have to have audit namespaces to.  But only CAP_AUDIT_READ
> > would make sense in the new audit namespace...
> 
> Given the connection of audit and security I think if we add support for
> a non-global auditd the user namespace seems to fit.  The user namespace
> is certainly where all of the security connected bits go.
> 
> Architecturally it gets a little tricky as it seems to make sense to
> generate audit messages that make sense to the process receiving them,
> which would mean actually generating a different audit message for
> different receiving contexts.

Assuming, as today, that we only have one auditd and it is system-wide,
we just attach consistent identifying information (the proc inode
number, which people already use) to the audit records.  This patch only
does that for user messages, but it needs to be done for all messages.

Moving to multiple auditd's starts to get really hard, and we might not
ever pursue it  :)

> I find the auditsc code odd.  We log file descriptor numbers when a file
> is mmaped?  What is something so process relative good to anyone?

When an earlier record showed that fd being opened?  I dunno....

> On a slightly different tangent.  Do we want to update the AUDIT_CAPSET
> message to report the user namespace whose caps we are changing or
> perhaps to surpress the message outside of the initial user namespace.

The extension of Aris's patch to syscall audit instead of just userspace
audit would take care of this.


* Re: [PATCH RFC] audit: provide namespace information in user originated records
  2013-03-21  1:46                               ` Eric Paris
@ 2013-03-21  2:21                                 ` Serge Hallyn
  2013-03-21  4:48                                   ` Eric W. Biederman
  0 siblings, 1 reply; 20+ messages in thread
From: Serge Hallyn @ 2013-03-21  2:21 UTC (permalink / raw)
  To: Eric Paris
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> So the kernel socket(s) would be per network namespace, but we divide
> messages per user namespace?  Which socket do I send them on,
> considering the possible crazy many<->many mappings between user and
> network namespaces.  It all makes me cry a little.

Not many-to-many: each netns is owned by exactly one userns, the userns
from which the netns was created.

-serge


* Re: [PATCH RFC] audit: provide namespace information in user originated records
  2013-03-21  2:21                                 ` Serge Hallyn
@ 2013-03-21  4:48                                   ` Eric W. Biederman
  0 siblings, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2013-03-21  4:48 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> writes:

> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
>> So the kernel socket(s) would be per network namespace, but we divide
>> messages per user namespace?  Which socket do I send them on,
>> considering the possible crazy many<->many mappings between user and
>> network namespaces.  It all makes me cry a little.
>
> not many-many - each netns is owned by exactly one userns.  The userns
> from which the netns was created.

Doh.  I missed this question, and I think I misunderstood when Eric
Paris was talking about multicasting audit messages.

If what we are really talking about is sending some audit messages to
an auditd in a container, what appears obvious to me is that we define
a per-user-namespace capability, something like CAP_AUDIT_CONTROL, that
does most or all of what CAP_AUDIT_CONTROL does in the init user
namespace, especially capturing audit_pid and audit_nlk_portid to
decide who to send the message to.

Something like:

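struct audit_control {
	int		initialized;
	pid_t		pid;
	u32		nlk_portid;
	struct sock	*sock;	/* assumed: the netlink socket used below */
};

struct user_namespace {
	...
	struct audit_control audit;
};

Then the transmission would be something like:

	struct user_namespace *ns = ...;

	for (;;) {
		/*
		 * Deliver to the auditd registered in this namespace, if
		 * any.  netlink_unicast() consumes the skb, so each
		 * delivery would need its own copy or reference.
		 */
		if (ns->audit.pid)
			err = netlink_unicast(ns->audit.sock, skb,
					      ns->audit.nlk_portid, 0);
		if (!ns->parent)
			break;
		ns = ns->parent;
	}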

If someone finds auditd interesting enough to do that work.

In general I think it only makes sense if we can reuse the existing
userspace auditd.
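
The walk up the user namespace hierarchy can also be tried out in plain userspace C.  What follows is a toy model only: every toy_* name is made up for illustration, and the send callback merely stands in for the netlink transmit.  It assumes that each namespace with a registered auditd gets its own copy of the message, innermost namespace first.

```c
#include <stddef.h>

/* Per-namespace audit state, mirroring the struct sketched above. */
struct toy_audit_control {
	int initialized;	/* non-zero once an auditd has registered */
	unsigned int portid;	/* port the registered auditd listens on */
};

struct toy_userns {
	struct toy_userns *parent;	/* NULL for the initial namespace */
	struct toy_audit_control audit;
};

/* Hypothetical transmit hook; returns 0 on success. */
typedef int (*toy_send_fn)(unsigned int portid, const char *msg, void *ctx);

/*
 * Offer msg to the auditd of ns and of every ancestor of ns.
 * Returns the number of successful deliveries.
 */
static int toy_audit_deliver(struct toy_userns *ns, const char *msg,
			     toy_send_fn send, void *ctx)
{
	int delivered = 0;

	for (; ns != NULL; ns = ns->parent) {
		if (ns->audit.initialized &&
		    send(ns->audit.portid, msg, ctx) == 0)
			delivered++;
	}
	return delivered;
}

/* Simple sender that records which ports were reached, in order. */
struct toy_log {
	unsigned int portids[8];
	size_t count;
};

static int toy_record_send(unsigned int portid, const char *msg, void *ctx)
{
	struct toy_log *log = ctx;

	(void)msg;	/* the payload is irrelevant to this model */
	if (log->count >= sizeof(log->portids) / sizeof(log->portids[0]))
		return -1;
	log->portids[log->count++] = portid;
	return 0;
}
```

With a three-level hierarchy where only the innermost and the initial namespace have an auditd, toy_audit_deliver() reaches the inner port first and the initial one second, and skips the middle namespace.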

Eric


* Re: [PATCH RFC 7/8] audit: report namespace information along with USER events
       [not found]     ` <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2014-01-24  6:19       ` Richard Guy Briggs
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Guy Briggs @ 2014-01-24  6:19 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA

On 13/03/18, Eric W. Biederman wrote:
> Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

(Digging up an old thread...)

> > For userspace generated events, include a record with the namespace
> > procfs inode numbers the process belongs to. This allows to track down
> > and filter audit messages by userspace.
> 
> I am not comfortable with using the inode numbers this way.  It does not
> pass the test of can I migrate a container and still have this work
> test.  Any kind of kernel assigned name for namespaces fails that test.

Any kind?  How about if we have a systemwide atomically incremented
serial number assigned every time a namespace is created?  This is close
to what the inode number was except the inode could be in a different
proc device, as pointed out.
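
A minimal userspace sketch of that serial number idea, using C11 atomics.  The toy_* names are hypothetical, and in-kernel code would presumably use the kernel's own atomic64_t helpers instead.  Note that this only illustrates allocation; it does not by itself address the migration concern quoted above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* System-wide counter; 0 is left unused so it can mean "no serial". */
static _Atomic uint64_t toy_ns_serial_next = 1;

/*
 * Hand out the next serial number.  Each caller gets a distinct value
 * even when several threads create namespaces concurrently.
 */
static uint64_t toy_ns_serial_alloc(void)
{
	return atomic_fetch_add_explicit(&toy_ns_serial_next, 1,
					 memory_order_relaxed);
}
```

Relaxed ordering is enough here because only the uniqueness of the returned value matters, not its ordering against other memory accesses.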

> I also don't like that you don't include the procfs device number.  An
> inode number means nothing without knowing which filesystem you are
> referring to.

I'm looking at having everything relative to init_*_ns to start with, so
this isn't a problem initially, but may become so if it isn't the case.

Can anyone point out off-hand how to find that proc device number?
(I'll start looking...)
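
For reference, the userspace view of the device/inode pair under discussion can be sketched with stat(2), which reports both st_dev and st_ino for a /proc/<pid>/ns/ entry.  This is not the in-kernel lookup asked about above, only an illustration of what "inode plus device" means to a consumer of the record; struct ns_identity and ns_identity_from_path() are names invented for this example.

```c
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

/* A namespace identifier as userspace sees it: device plus inode. */
struct ns_identity {
	uint64_t dev;	/* device of the filesystem holding the entry */
	uint64_t ino;	/* inode number within that filesystem */
};

/*
 * Fill *out from the file at path, e.g. "/proc/self/ns/net".
 * Returns 0 on success and -1 if stat() fails.
 */
static int ns_identity_from_path(const char *path, struct ns_identity *out)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	out->dev = (uint64_t)st.st_dev;
	out->ino = (uint64_t)st.st_ino;
	return 0;
}
```

Two processes would then be treated as sharing a namespace only when both fields match, which is the point made above about an inode number meaning nothing without its filesystem.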

> It may never happen but I reserve the right to have the inode numbers
> for namespaces to show up differently in different instances of procfs.

So would that serial number idea work better?

> Beyond that I think this usage is possibly buggy by using two audit
> records for one event.

I'm looking at integrating this information into a standard message.

> > Signed-off-by: Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> >  include/uapi/linux/audit.h |    1 +
> >  kernel/audit.c             |   51 +++++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 51 insertions(+), 1 deletions(-)
> >
> > diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> > index 9f096f1..3ec3ccb 100644
> > --- a/include/uapi/linux/audit.h
> > +++ b/include/uapi/linux/audit.h
> > @@ -106,6 +106,7 @@
> >  #define AUDIT_NETFILTER_PKT	1324	/* Packets traversing netfilter chains */
> >  #define AUDIT_NETFILTER_CFG	1325	/* Netfilter chain modifications */
> >  #define AUDIT_SECCOMP		1326	/* Secure Computing event */
> > +#define AUDIT_USER_NAMESPACE	1327	/* Information about process' namespaces */
> >  
> >  #define AUDIT_AVC		1400	/* SE Linux avc denial or grant */
> >  #define AUDIT_SELINUX_ERR	1401	/* Internal SE Linux Errors */
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index 58db117..b17f9c0 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -62,6 +62,11 @@
> >  #include <linux/freezer.h>
> >  #include <linux/tty.h>
> >  #include <linux/pid_namespace.h>
> > +#include <linux/ipc_namespace.h>
> > +#include <linux/mnt_namespace.h>
> > +#include <linux/utsname.h>
> > +#include <linux/user_namespace.h>
> > +#include <net/net_namespace.h>
> >  
> >  #include "audit.h"
> >  
> > @@ -641,6 +646,49 @@ static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type,
> >  	return rc;
> >  }
> >  
> > +#ifdef CONFIG_NAMESPACES
> > +static int audit_log_namespaces(struct task_struct *tsk,
> > +				struct sk_buff *skb)
> > +{
> > +	struct audit_context *ctx = tsk->audit_context;
> > +	struct audit_buffer *ab;
> > +
> > +	if (!audit_enabled)
> > +		return 0;
> > +
> > +	ab = audit_log_start(ctx, GFP_KERNEL, AUDIT_USER_NAMESPACE);
> > +	if (unlikely(!ab))
> > +		return -ENOMEM;
> > +
> > +	audit_log_format(ab, "mnt=%u", mntns_get_inum(tsk));
> > +#ifdef CONFIG_NET_NS
> > +	audit_log_format(ab, " net=%u", netns_get_inum(tsk));
> > +#endif
> > +#ifdef CONFIG_UTS_NS
> > +	audit_log_format(ab, " uts=%u", utsns_get_inum(tsk));
> > +#endif
> > +#ifdef CONFIG_IPC_NS
> > +	audit_log_format(ab, " ipc=%u", ipcns_get_inum(tsk));
> > +#endif
> > +#ifdef CONFIG_PID_NS
> > +	audit_log_format(ab, " pid=%u", pidns_get_inum(tsk));
> > +#endif
> > +#ifdef CONFIG_USER_NS
> > +	audit_log_format(ab, " user=%u", userns_get_inum(tsk));
> > +#endif  
> > +	audit_set_pid(ab, NETLINK_CB(skb).portid);
> > +	audit_log_end(ab);
> > +
> > +	return 0;
> > +}
> > +#else
> > +static inline int audit_log_namespaces(struct task_struct *tsk,
> > +				       struct sk_buff *skb)
> > +{
> > +	return 0;
> > +}
> > +#endif
> > +
> >  static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> >  {
> >  	u32			seq, sid;
> > @@ -741,7 +789,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> >  			}
> >  			audit_log_common_recv_msg(&ab, msg_type,
> >  						  loginuid, sessionid, sid,
> > -						  NULL);
> > +						  current->audit_context);
> >  
> >  			if (msg_type != AUDIT_USER_TTY)
> >  				audit_log_format(ab, " msg='%.1024s'",
> > @@ -758,6 +806,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> >  			}
> >  			audit_set_pid(ab, NETLINK_CB(skb).portid);
> >  			audit_log_end(ab);
> > +			audit_log_namespaces(current, skb);
> >  		}
> >  		break;
> >  	case AUDIT_ADD:
> 

- RGB

--
Richard Guy Briggs <rbriggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545


* [PATCH RFC] audit: provide namespace information in user originated records
@ 2013-03-18 15:45 Aristeu Rozanski
  0 siblings, 0 replies; 20+ messages in thread
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC (permalink / raw)
  To: linux-audit

(re-sending this, linux-audit is members only it seems)

This patchset introduces a new audit record that follows every USER record and
provides the namespace information of the process. The idea is to allow
processes in containers to create records in the host system while providing a
means for those records to be filtered out.

For each new namespace, a unique procfs inode number is allocated and this
number has been used by userspace to determine which processes belong to the
same namespace. These numbers are used in the new audit record.

Applications such as libvirt-sandbox and lxc can then report the same numbers
when a container is created and destroyed, allowing records to be mapped to a
given container. Maybe the next step would be having a record for whenever a
new namespace is created?

The first 6 patches are needed in order to get each namespace's inode number.
Patch 7 defines the new record that accompanies the USER record.
Patch 8 allows USER records to be generated from within namespaces.

Here's an example of output:
type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=UNKNOWN[1327] msg=audit(1363528861.403:311): mnt=4026531840 net=4026531956 uts=4026531838 ipc=4026531839 pid=4026531836 user=4026531837
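
As a rough illustration of how userspace could filter on the new record, the sketch below pulls one numeric field out of a record line such as the one above.  The function name audit_record_field() is made up for this example and is not part of any audit library; it assumes fields are space-separated key=value tokens with decimal values.  Keep in mind that in the namespace record "pid" is the pid namespace number, not a process id.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "key=<decimal>" as a whole space-delimited token in record.
 * Returns 0 and stores the number in *value on success, -1 otherwise.
 */
static int audit_record_field(const char *record, const char *key,
			      unsigned long *value)
{
	size_t klen = strlen(key);
	const char *p = record;

	if (klen == 0)
		return -1;

	while ((p = strstr(p, key)) != NULL) {
		/* Only accept a match that starts a token. */
		int at_token_start = (p == record) || (p[-1] == ' ');

		if (at_token_start && p[klen] == '=') {
			const char *digits = p + klen + 1;
			char *end;
			unsigned long v;

			errno = 0;
			v = strtoul(digits, &end, 10);
			if (errno != 0 || end == digits)
				return -1;	/* value is not a number */
			*value = v;
			return 0;
		}
		p += klen;	/* keep searching after this occurrence */
	}
	return -1;
}
```

A filter could then compare the "net" or "user" value of each namespace record against the numbers a container manager reported at container creation and drop the records that do not match.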

Notes:
- this is an RFC, all sorts of feedback are much appreciated
- while the last patch allows a new userns to send audit records, I haven't
  looked yet at making sure it has proper capabilities so regular users'
  containers can create records
- the record number allocated is just a draft. If this patchset evolves into
  something that can be merged, please advise which number is the best
  choice

 fs/namespace.c                 |   14 +++++++
 include/linux/ipc_namespace.h  |    1
 include/linux/mnt_namespace.h  |    2 +
 include/linux/pid_namespace.h  |    1
 include/linux/user_namespace.h |    1
 include/linux/utsname.h        |    1
 include/net/net_namespace.h    |    1
 include/uapi/linux/audit.h     |    1
 ipc/namespace.c                |   14 +++++++
 kernel/audit.c                 |   76 +++++++++++++++++++++++++++++++++++++----
 kernel/pid_namespace.c         |   11 +++++
 kernel/user_namespace.c        |    5 ++
 kernel/utsname.c               |   14 +++++++
 net/core/net_namespace.c       |   14 +++++++
 14 files changed, 150 insertions(+), 6 deletions(-)


end of thread (newest message: 2014-01-24  6:19 UTC)

Thread overview: 20+ messages
     [not found] <1363619405-6419-1-git-send-email-arozansk@redhat.com>
     [not found] ` <1363619405-6419-9-git-send-email-arozansk@redhat.com>
2013-03-18 21:28   ` [PATCH RFC 8/8] audit: allow user records to be created inside a container Eric W. Biederman
     [not found] ` <1363619405-6419-8-git-send-email-arozansk@redhat.com>
2013-03-18 21:44   ` [PATCH RFC 7/8] audit: report namespace information along with USER events Eric W. Biederman
2013-03-19 12:08     ` Aristeu Rozanski
     [not found]     ` <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2014-01-24  6:19       ` Richard Guy Briggs
     [not found] ` <1363619405-6419-1-git-send-email-arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-18 22:16   ` [PATCH RFC] audit: provide namespace information in user originated records Eric W. Biederman
     [not found]     ` <877gl48iaz.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-19 12:24       ` Aristeu Rozanski
     [not found]         ` <20130319122408.GC20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-20  0:00           ` Eric W. Biederman
     [not found]             ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-20 15:12               ` Serge Hallyn
2013-03-20 15:45               ` Aristeu Rozanski
     [not found]                 ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-20 18:36                   ` Serge Hallyn
2013-03-20 18:42                     ` Eric Paris
2013-03-20 18:49                       ` Serge Hallyn
2013-03-20 19:01                         ` Eric Paris
2013-03-20 19:17                           ` Aristeu Rozanski
2013-03-20 19:19                           ` Serge Hallyn
2013-03-20 23:23                           ` Eric W. Biederman
     [not found]                             ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-21  1:46                               ` Eric Paris
2013-03-21  2:21                                 ` Serge Hallyn
2013-03-21  4:48                                   ` Eric W. Biederman
2013-03-18 15:45 Aristeu Rozanski
