linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
@ 2013-10-24  7:31 Gao feng
  2013-10-24  7:31 ` [PATCH 01/20] Audit: make audit netlink socket net namespace unaware Gao feng
                   ` (22 more replies)
  0 siblings, 23 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Here is the v1 patchset: http://lwn.net/Articles/549546/

The main target of this patchset is allowing user in audit
namespace to generate the USER_MSG type of audit message,
some userspace tools need to generate audit message, or
these tools will broken.

And the login process in container may want to setup
/proc/<pid>/loginuid, right now this value is unalterable
once it being set. this will also broke the login problem
in container. After this patchset, we can reset this loginuid
to zero if task is running in a new audit namespace.

Same with v1 patchset, in this patchset, only the privileged
user in init_audit_ns and init_user_ns has rights to
add/del audit rules. and these rules are gloabl. all
audit namespace will comply with the rules.

Compared with v1, v2 patch has some big changes.
1, the audit namespace is not assigned to user namespace.
   since there is no available bit of flags for clone, we
   create audit namespace through netlink, patch[18/20]
   introduces a new audit netlink type AUDIT_CREATE_NS.
   the privileged user in userns has rights to create a
   audit namespace, it means the unprivileged user can
   create auditns through create userns first. In order
   to prevent them from doing harm to host, the default
   audit_backlog_limit of un-init-audit-ns is zero(means
   audit is unavailable in audit namespace). and it can't
   be changed in auditns through netlink.

2, introduce /proc/<pid>/audit_log_limit
   this interface is used to setup log_limit of audit
   namespace.  we need this interface to make audit
   available in un-init-audit-ns. Only the privileged user
   has right to set this value, it means only the root user
   of host can change it.

3, make audit namespace don't depend on net namespace.
   patch[1/20] add a compare function audit_compare for
   audit netlink, it always return true, it means the
   netlink subsystem will find out the netlink socket
   only through portid and netlink type. So we needn't
   to create kernel side audit netlink socket for per
   net namespace, all userspace audit netlink socket
   can find out the audit_sock, and audit_sock can
   communicate with them through the proper portid.
   it's just like the behavior we don't have net
   namespace before.


This patchset still need some work, such as allow changing
audit_enabled in audit namespace, auditd wants this feature.

I send this patchset now in order to get more comments, so
I can keep on improving namespace support for audit.

Gao feng (20):
  Audit: make audit netlink socket net namespace unaware
  audit: introduce configure option CONFIG_AUDIT_NS
  audit: make audit_skb_queue per audit namespace
  audit: make audit_skb_hold_queue per audit namespace
  audit: make audit_pid per audit namespace
  audit: make kauditd_task per audit namespace
  aduit: make audit_nlk_portid per audit namespace
  audit: make kaudit_wait queue per audit namespace
  audit: make audit_backlog_wait per audit namespace
  audit: allow un-init audit ns to change pid and portid only
  audit: use proper audit namespace in audit_receive_msg
  audit: use proper audit_namespace in kauditd_thread
  audit: introduce new audit logging interface for audit namespace
  audit: pass proper audit namespace to audit_log_common_recv_msg
  audit: Log audit pid config change in audit namespace
  audit: allow GET,SET,USER MSG operations in audit namespace
  nsproxy: don't make create_new_namespaces static
  audit: add new message type AUDIT_CREATE_NS
  audit: make audit_backlog_limit per audit namespace
  audit: introduce /proc/<pid>/audit_backlog_limit

 fs/proc/base.c                  |  53 ++++++
 include/linux/audit.h           |  26 ++-
 include/linux/audit_namespace.h |  92 ++++++++++
 include/linux/nsproxy.h         |  15 +-
 include/uapi/linux/audit.h      |   1 +
 init/Kconfig                    |  10 ++
 kernel/Makefile                 |   2 +-
 kernel/audit.c                  | 364 +++++++++++++++++++++++++---------------
 kernel/audit.h                  |   5 +-
 kernel/audit_namespace.c        | 123 ++++++++++++++
 kernel/auditsc.c                |   6 +-
 kernel/nsproxy.c                |  18 +-
 12 files changed, 561 insertions(+), 154 deletions(-)
 create mode 100644 include/linux/audit_namespace.h
 create mode 100644 kernel/audit_namespace.c

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 01/20] Audit: make audit netlink socket net namespace unaware
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 02/20] audit: introduce configure option CONFIG_AUDIT_NS Gao feng
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Add a compare function which always return true for
audit netlink socket, this will cause audit netlink
sockets netns unaware, and no matter which netns the
user space audit netlink sockets belong to, they all
can find out and communicate with audit_sock.

This gets rid of the necessary to create per-netns
audit kernel side socket(audit_sock), it's pain to
depend on and get reference of netns for auditns.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 kernel/audit.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..468950b 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -886,12 +886,18 @@ static void audit_receive(struct sk_buff  *skb)
 	mutex_unlock(&audit_cmd_mutex);
 }
 
+static bool audit_compare(struct net *net, struct sock *sk)
+{
+	return true;
+}
+
 /* Initialize audit support at boot time. */
 static int __init audit_init(void)
 {
 	int i;
 	struct netlink_kernel_cfg cfg = {
 		.input	= audit_receive,
+		.compare = audit_compare,
 	};
 
 	if (audit_initialized == AUDIT_DISABLED)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 02/20] audit: introduce configure option CONFIG_AUDIT_NS
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
  2013-10-24  7:31 ` [PATCH 01/20] Audit: make audit netlink socket net namespace unaware Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 03/20] audit: make audit_skb_queue per audit namespace Gao feng
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

This patch adds a new field audit_ns for struct
nsproxy, so task can access the audit_ns through
task->nsproxy->audit_ns.

Right now, we don't support create new audit_ns,
all tasks's audit_ns will point to the init_audit_ns.
next patches will add the feature creating new
audit namespace.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h | 51 +++++++++++++++++++++++++++++++++++++++++
 include/linux/nsproxy.h         | 11 +++++----
 init/Kconfig                    | 10 ++++++++
 kernel/Makefile                 |  2 +-
 kernel/audit_namespace.c        |  8 +++++++
 kernel/nsproxy.c                | 16 ++++++++++++-
 6 files changed, 91 insertions(+), 7 deletions(-)
 create mode 100644 include/linux/audit_namespace.h
 create mode 100644 kernel/audit_namespace.c

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
new file mode 100644
index 0000000..ac22649
--- /dev/null
+++ b/include/linux/audit_namespace.h
@@ -0,0 +1,51 @@
+#ifndef __LINUX_AUDIT_NAMESPACE_H
+#define __LINUX_AUDIT_NAMESPACE_H
+
+#include <linux/audit.h>
+#include <linux/atomic.h>
+#include <linux/slab.h>
+#include <linux/user_namespace.h>
+
+struct audit_namespace {
+	atomic_t count;
+	struct user_namespace *user_ns;
+};
+
+extern struct audit_namespace init_audit_ns;
+
+#if defined(CONFIG_AUDIT_NS)
+static inline
+struct audit_namespace *get_audit_ns(struct audit_namespace *ns)
+{
+	atomic_inc(&ns->count);
+	return ns;
+}
+
+static inline
+void put_audit_ns(struct audit_namespace *ns)
+{
+	if (atomic_dec_and_test(&ns->count)) {
+		put_user_ns(ns->user_ns);
+		kfree(ns);
+	}
+}
+#else
+static inline
+struct audit_namespace *get_audit_ns(struct audit_namespace *ns)
+{
+	return ns;
+}
+
+static inline
+void put_audit_ns(struct audit_namespace *ns)
+{
+
+}
+#endif
+
+static inline struct
+audit_namespace *copy_audit_ns(struct audit_namespace *audit)
+{
+	return get_audit_ns(audit);
+}
+#endif
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index b4ec59d..dc7af11 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -28,11 +28,12 @@ struct fs_struct;
  */
 struct nsproxy {
 	atomic_t count;
-	struct uts_namespace *uts_ns;
-	struct ipc_namespace *ipc_ns;
-	struct mnt_namespace *mnt_ns;
-	struct pid_namespace *pid_ns_for_children;
-	struct net 	     *net_ns;
+	struct uts_namespace	*uts_ns;
+	struct ipc_namespace	*ipc_ns;
+	struct mnt_namespace	*mnt_ns;
+	struct pid_namespace	*pid_ns_for_children;
+	struct net		*net_ns;
+	struct audit_namespace	*audit_ns;
 };
 extern struct nsproxy init_nsproxy;
 
diff --git a/init/Kconfig b/init/Kconfig
index 3ecd8a1..05e3d2c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1154,6 +1154,16 @@ config NET_NS
 	  Allow user space to create what appear to be multiple instances
 	  of the network stack.
 
+config AUDIT_NS
+	bool "Audit namespace"
+	depends on AUDIT
+	default n
+	help
+	  Support audit namespace.  This allows processes write audit message
+	  to the audit namespace they belong to.
+
+	  If unsure, say N.
+
 endif # NAMESPACES
 
 config UIDGID_STRICT_TYPE_CHECKS
diff --git a/kernel/Makefile b/kernel/Makefile
index 1ce4755..6e64333 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -71,7 +71,7 @@ obj-$(CONFIG_IKCONFIG) += configs.o
 obj-$(CONFIG_RESOURCE_COUNTERS) += res_counter.o
 obj-$(CONFIG_SMP) += stop_machine.o
 obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
-obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
+obj-$(CONFIG_AUDIT) += audit.o auditfilter.o audit_namespace.o
 obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
 obj-$(CONFIG_AUDIT_WATCH) += audit_watch.o
 obj-$(CONFIG_AUDIT_TREE) += audit_tree.o
diff --git a/kernel/audit_namespace.c b/kernel/audit_namespace.c
new file mode 100644
index 0000000..6d9cb8f
--- /dev/null
+++ b/kernel/audit_namespace.c
@@ -0,0 +1,8 @@
+#include <linux/audit_namespace.h>
+#include <linux/export.h>
+
+struct audit_namespace init_audit_ns = {
+	.count = ATOMIC_INIT(1),
+	.user_ns = &init_user_ns,
+};
+EXPORT_SYMBOL_GPL(init_audit_ns);
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 8e78110..e8374aa 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -22,6 +22,7 @@
 #include <linux/pid_namespace.h>
 #include <net/net_namespace.h>
 #include <linux/ipc_namespace.h>
+#include <linux/audit_namespace.h>
 #include <linux/proc_ns.h>
 #include <linux/file.h>
 #include <linux/syscalls.h>
@@ -39,6 +40,9 @@ struct nsproxy init_nsproxy = {
 #ifdef CONFIG_NET
 	.net_ns			= &init_net,
 #endif
+#ifdef CONFIG_AUDIT
+	.audit_ns		= &init_audit_ns,
+#endif
 };
 
 static inline struct nsproxy *create_nsproxy(void)
@@ -98,8 +102,16 @@ static struct nsproxy *create_new_namespaces(unsigned long flags,
 		goto out_net;
 	}
 
-	return new_nsp;
+	new_nsp->audit_ns = copy_audit_ns(tsk->nsproxy->audit_ns);
+	if (IS_ERR(new_nsp->audit_ns)) {
+		err = PTR_ERR(new_nsp->audit_ns);
+		goto out_audit;
+	}
 
+	return new_nsp;
+out_audit:
+	if (new_nsp->net_ns)
+		put_net(new_nsp->net_ns);
 out_net:
 	if (new_nsp->pid_ns_for_children)
 		put_pid_ns(new_nsp->pid_ns_for_children);
@@ -165,6 +177,8 @@ void free_nsproxy(struct nsproxy *ns)
 		put_ipc_ns(ns->ipc_ns);
 	if (ns->pid_ns_for_children)
 		put_pid_ns(ns->pid_ns_for_children);
+	if (ns->audit_ns)
+		put_audit_ns(ns->audit_ns);
 	put_net(ns->net_ns);
 	kmem_cache_free(nsproxy_cachep, ns);
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 03/20] audit: make audit_skb_queue per audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
  2013-10-24  7:31 ` [PATCH 01/20] Audit: make audit netlink socket net namespace unaware Gao feng
  2013-10-24  7:31 ` [PATCH 02/20] audit: introduce configure option CONFIG_AUDIT_NS Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 04/20] audit: make audit_skb_hold_queue " Gao feng
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

This patch makes audit_skb_queue per audit namespace,
Since we haven't finished the preparations, only
allow user to attach/detach skb to the queue of
init_audit_ns.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h |  3 +++
 kernel/audit.c                  | 18 +++++++++---------
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index ac22649..65acab1 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -4,11 +4,13 @@
 #include <linux/audit.h>
 #include <linux/atomic.h>
 #include <linux/slab.h>
+#include <linux/skbuff.h>
 #include <linux/user_namespace.h>
 
 struct audit_namespace {
 	atomic_t count;
 	struct user_namespace *user_ns;
+	struct sk_buff_head queue;
 };
 
 extern struct audit_namespace init_audit_ns;
@@ -26,6 +28,7 @@ void put_audit_ns(struct audit_namespace *ns)
 {
 	if (atomic_dec_and_test(&ns->count)) {
 		put_user_ns(ns->user_ns);
+		skb_queue_purge(&ns->queue);
 		kfree(ns);
 	}
 }
diff --git a/kernel/audit.c b/kernel/audit.c
index 468950b..f3c2150 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -64,6 +64,7 @@
 #include <linux/freezer.h>
 #include <linux/tty.h>
 #include <linux/pid_namespace.h>
+#include <linux/audit_namespace.h>
 
 #include "audit.h"
 
@@ -133,7 +134,6 @@ static DEFINE_SPINLOCK(audit_freelist_lock);
 static int	   audit_freelist_count;
 static LIST_HEAD(audit_freelist);
 
-static struct sk_buff_head audit_skb_queue;
 /* queue of skbs to send to auditd when/if it comes back */
 static struct sk_buff_head audit_skb_hold_queue;
 static struct task_struct *kauditd_task;
@@ -446,7 +446,7 @@ static int kauditd_thread(void *dummy)
 
 		flush_hold_queue();
 
-		skb = skb_dequeue(&audit_skb_queue);
+		skb = skb_dequeue(&init_audit_ns.queue);
 		wake_up(&audit_backlog_wait);
 		if (skb) {
 			if (audit_pid)
@@ -458,7 +458,7 @@ static int kauditd_thread(void *dummy)
 		set_current_state(TASK_INTERRUPTIBLE);
 		add_wait_queue(&kauditd_wait, &wait);
 
-		if (!skb_queue_len(&audit_skb_queue)) {
+		if (!skb_queue_len(&init_audit_ns.queue)) {
 			try_to_freeze();
 			schedule();
 		}
@@ -665,7 +665,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		status_set.rate_limit	 = audit_rate_limit;
 		status_set.backlog_limit = audit_backlog_limit;
 		status_set.lost		 = atomic_read(&audit_lost);
-		status_set.backlog	 = skb_queue_len(&audit_skb_queue);
+		status_set.backlog	 = skb_queue_len(&init_audit_ns.queue);
 		audit_send_reply(NETLINK_CB(skb).portid, seq, AUDIT_GET, 0, 0,
 				 &status_set, sizeof(status_set));
 		break;
@@ -911,7 +911,7 @@ static int __init audit_init(void)
 	else
 		audit_sock->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
 
-	skb_queue_head_init(&audit_skb_queue);
+	skb_queue_head_init(&init_audit_ns.queue);
 	skb_queue_head_init(&audit_skb_hold_queue);
 	audit_initialized = AUDIT_INITIALIZED;
 	audit_enabled = audit_default;
@@ -1066,7 +1066,7 @@ static void wait_for_auditd(unsigned long sleep_time)
 	add_wait_queue(&audit_backlog_wait, &wait);
 
 	if (audit_backlog_limit &&
-	    skb_queue_len(&audit_skb_queue) > audit_backlog_limit)
+	    skb_queue_len(&init_audit_ns.queue) > audit_backlog_limit)
 		schedule_timeout(sleep_time);
 
 	__set_current_state(TASK_RUNNING);
@@ -1117,7 +1117,7 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 				entries over the normal backlog limit */
 
 	while (audit_backlog_limit
-	       && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserve) {
+	       && skb_queue_len(&init_audit_ns.queue) > audit_backlog_limit + reserve) {
 		if (gfp_mask & __GFP_WAIT && audit_backlog_wait_time) {
 			unsigned long sleep_time;
 
@@ -1132,7 +1132,7 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 			printk(KERN_WARNING
 			       "audit: audit_backlog=%d > "
 			       "audit_backlog_limit=%d\n",
-			       skb_queue_len(&audit_skb_queue),
+			       skb_queue_len(&init_audit_ns.queue),
 			       audit_backlog_limit);
 		audit_log_lost("backlog limit exceeded");
 		audit_backlog_wait_time = audit_backlog_wait_overflow;
@@ -1677,7 +1677,7 @@ void audit_log_end(struct audit_buffer *ab)
 		nlh->nlmsg_len = ab->skb->len - NLMSG_HDRLEN;
 
 		if (audit_pid) {
-			skb_queue_tail(&audit_skb_queue, ab->skb);
+			skb_queue_tail(&init_audit_ns.queue, ab->skb);
 			wake_up_interruptible(&kauditd_wait);
 		} else {
 			audit_printk_skb(ab->skb);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 04/20] audit: make audit_skb_hold_queue per audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (2 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 03/20] audit: make audit_skb_queue per audit namespace Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 05/20] audit: make audit_pid " Gao feng
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

This patch makes audit_skb_hold_queue per audit namespace.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h |  3 +++
 kernel/audit.c                  | 12 +++++-------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index 65acab1..9b7b4c5 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -11,6 +11,8 @@ struct audit_namespace {
 	atomic_t count;
 	struct user_namespace *user_ns;
 	struct sk_buff_head queue;
+	/* queue of skbs to send to auditd when/if it comes back */
+	struct sk_buff_head hold_queue;
 };
 
 extern struct audit_namespace init_audit_ns;
@@ -29,6 +31,7 @@ void put_audit_ns(struct audit_namespace *ns)
 	if (atomic_dec_and_test(&ns->count)) {
 		put_user_ns(ns->user_ns);
 		skb_queue_purge(&ns->queue);
+		skb_queue_purge(&ns->hold_queue);
 		kfree(ns);
 	}
 }
diff --git a/kernel/audit.c b/kernel/audit.c
index f3c2150..51ee701 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -134,8 +134,6 @@ static DEFINE_SPINLOCK(audit_freelist_lock);
 static int	   audit_freelist_count;
 static LIST_HEAD(audit_freelist);
 
-/* queue of skbs to send to auditd when/if it comes back */
-static struct sk_buff_head audit_skb_hold_queue;
 static struct task_struct *kauditd_task;
 static DECLARE_WAIT_QUEUE_HEAD(kauditd_wait);
 static DECLARE_WAIT_QUEUE_HEAD(audit_backlog_wait);
@@ -355,8 +353,8 @@ static int audit_set_failure(int state)
 static void audit_hold_skb(struct sk_buff *skb)
 {
 	if (audit_default &&
-	    skb_queue_len(&audit_skb_hold_queue) < audit_backlog_limit)
-		skb_queue_tail(&audit_skb_hold_queue, skb);
+	    skb_queue_len(&init_audit_ns.hold_queue) < audit_backlog_limit)
+		skb_queue_tail(&init_audit_ns.hold_queue, skb);
 	else
 		kfree_skb(skb);
 }
@@ -420,13 +418,13 @@ static void flush_hold_queue(void)
 	if (!audit_default || !audit_pid)
 		return;
 
-	skb = skb_dequeue(&audit_skb_hold_queue);
+	skb = skb_dequeue(&init_audit_ns.hold_queue);
 	if (likely(!skb))
 		return;
 
 	while (skb && audit_pid) {
 		kauditd_send_skb(skb);
-		skb = skb_dequeue(&audit_skb_hold_queue);
+		skb = skb_dequeue(&init_audit_ns.hold_queue);
 	}
 
 	/*
@@ -912,7 +910,7 @@ static int __init audit_init(void)
 		audit_sock->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
 
 	skb_queue_head_init(&init_audit_ns.queue);
-	skb_queue_head_init(&audit_skb_hold_queue);
+	skb_queue_head_init(&init_audit_ns.hold_queue);
 	audit_initialized = AUDIT_INITIALIZED;
 	audit_enabled = audit_default;
 	audit_ever_enabled |= !!audit_default;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 05/20] audit: make audit_pid per audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (3 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 04/20] audit: make audit_skb_hold_queue " Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 06/20] audit: make kauditd_task " Gao feng
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Every audit namespace has its own auditd process,
so audit_pid should be per audit namespace too.

Since some places such as audit_filter_syscall
use audit_pid to identify if the task is auditd,
so we should store auditd's pid which for init
pid namespace not for current pid ns.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h |  2 ++
 kernel/audit.c                  | 43 ++++++++++++++++++++++++++++++-----------
 kernel/audit.h                  |  5 ++---
 kernel/auditsc.c                |  6 +++---
 4 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index 9b7b4c5..fdbb6c1 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -9,6 +9,8 @@
 
 struct audit_namespace {
 	atomic_t count;
+	/* pid of the auditd process */
+	int pid;
 	struct user_namespace *user_ns;
 	struct sk_buff_head queue;
 	/* queue of skbs to send to auditd when/if it comes back */
diff --git a/kernel/audit.c b/kernel/audit.c
index 51ee701..e5e8cb1 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -94,7 +94,6 @@ static int	audit_failure = AUDIT_FAIL_PRINTK;
  * contains the pid of the auditd process and audit_nlk_portid contains
  * the portid to use to send netlink messages to that process.
  */
-int		audit_pid;
 static int	audit_nlk_portid;
 
 /* If audit_rate_limit is non-zero, limit the rate of sending audit records
@@ -187,7 +186,7 @@ void audit_panic(const char *message)
 		break;
 	case AUDIT_FAIL_PANIC:
 		/* test audit_pid since printk is always losey, why bother? */
-		if (audit_pid)
+		if (init_audit_ns.pid)
 			panic("audit: %s\n", message);
 		break;
 	}
@@ -386,9 +385,9 @@ static void kauditd_send_skb(struct sk_buff *skb)
 	err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 0);
 	if (err < 0) {
 		BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
-		printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", audit_pid);
+		printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", init_audit_ns.pid);
 		audit_log_lost("auditd disappeared\n");
-		audit_pid = 0;
+		init_audit_ns.pid = 0;
 		/* we might get lucky and get this in the next auditd */
 		audit_hold_skb(skb);
 	} else
@@ -415,14 +414,14 @@ static void flush_hold_queue(void)
 {
 	struct sk_buff *skb;
 
-	if (!audit_default || !audit_pid)
+	if (!audit_default || !init_audit_ns.pid)
 		return;
 
 	skb = skb_dequeue(&init_audit_ns.hold_queue);
 	if (likely(!skb))
 		return;
 
-	while (skb && audit_pid) {
+	while (skb && init_audit_ns.pid) {
 		kauditd_send_skb(skb);
 		skb = skb_dequeue(&init_audit_ns.hold_queue);
 	}
@@ -447,7 +446,7 @@ static int kauditd_thread(void *dummy)
 		skb = skb_dequeue(&init_audit_ns.queue);
 		wake_up(&audit_backlog_wait);
 		if (skb) {
-			if (audit_pid)
+			if (init_audit_ns.pid)
 				kauditd_send_skb(skb);
 			else
 				audit_printk_skb(skb);
@@ -659,11 +658,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	case AUDIT_GET:
 		status_set.enabled	 = audit_enabled;
 		status_set.failure	 = audit_failure;
-		status_set.pid		 = audit_pid;
+		status_set.pid		 = init_audit_ns.pid;
 		status_set.rate_limit	 = audit_rate_limit;
 		status_set.backlog_limit = audit_backlog_limit;
 		status_set.lost		 = atomic_read(&audit_lost);
 		status_set.backlog	 = skb_queue_len(&init_audit_ns.queue);
+
+		rcu_read_lock();
+		{
+			struct task_struct *task;
+
+			task = find_task_by_pid_ns(ns->pid, &init_pid_ns);
+
+			if (task)
+				status_set.pid = task_pid_vnr(task);
+		}
+		rcu_read_unlock();
 		audit_send_reply(NETLINK_CB(skb).portid, seq, AUDIT_GET, 0, 0,
 				 &status_set, sizeof(status_set));
 		break;
@@ -683,10 +693,20 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		}
 		if (status_get->mask & AUDIT_STATUS_PID) {
 			int new_pid = status_get->pid;
+			struct task_struct *task;
+
+			rcu_read_lock();
+			task = find_task_by_vpid(new_pid);
+
+			if (task)
+				new_pid = task_pid_nr_ns(task, &init_pid_ns);
 
 			if (audit_enabled != AUDIT_OFF)
-				audit_log_config_change("audit_pid", new_pid, audit_pid, 1);
-			audit_pid = new_pid;
+				audit_log_config_change("audit_pid", new_pid, init_audit_ns.pid, 1);
+
+			init_audit_ns.pid = new_pid;
+			rcu_read_unlock();
+
 			audit_nlk_portid = NETLINK_CB(skb).portid;
 		}
 		if (status_get->mask & AUDIT_STATUS_RATE_LIMIT) {
@@ -909,6 +929,7 @@ static int __init audit_init(void)
 	else
 		audit_sock->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
 
+	init_audit_ns.pid = 0;
 	skb_queue_head_init(&init_audit_ns.queue);
 	skb_queue_head_init(&init_audit_ns.hold_queue);
 	audit_initialized = AUDIT_INITIALIZED;
@@ -1674,7 +1695,7 @@ void audit_log_end(struct audit_buffer *ab)
 		struct nlmsghdr *nlh = nlmsg_hdr(ab->skb);
 		nlh->nlmsg_len = ab->skb->len - NLMSG_HDRLEN;
 
-		if (audit_pid) {
+		if (init_audit_ns.pid) {
 			skb_queue_tail(&init_audit_ns.queue, ab->skb);
 			wake_up_interruptible(&kauditd_wait);
 		} else {
diff --git a/kernel/audit.h b/kernel/audit.h
index 123c9b7..75362dd 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -23,6 +23,7 @@
 #include <linux/audit.h>
 #include <linux/skbuff.h>
 #include <uapi/linux/mqueue.h>
+#include <linux/audit_namespace.h>
 
 /* 0 = no checking
    1 = put_count checking
@@ -218,8 +219,6 @@ extern void audit_log_name(struct audit_context *context,
 			   struct audit_names *n, struct path *path,
 			   int record_num, int *call_panic);
 
-extern int audit_pid;
-
 #define AUDIT_INODE_BUCKETS	32
 extern struct list_head audit_inode_hash[AUDIT_INODE_BUCKETS];
 
@@ -310,7 +309,7 @@ extern u32 audit_sig_sid;
 extern int __audit_signal_info(int sig, struct task_struct *t);
 static inline int audit_signal_info(int sig, struct task_struct *t)
 {
-	if (unlikely((audit_pid && t->tgid == audit_pid) ||
+	if (unlikely((init_audit_ns.pid && t->tgid == init_audit_ns.pid) ||
 		     (audit_signals && !audit_dummy_context())))
 		return __audit_signal_info(sig, t);
 	return 0;
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 9845cb3..61a85d8 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -745,7 +745,7 @@ static enum audit_state audit_filter_syscall(struct task_struct *tsk,
 	struct audit_entry *e;
 	enum audit_state state;
 
-	if (audit_pid && tsk->tgid == audit_pid)
+	if (init_audit_ns.pid && tsk->tgid == init_audit_ns.pid)
 		return AUDIT_DISABLED;
 
 	rcu_read_lock();
@@ -806,7 +806,7 @@ void audit_filter_inodes(struct task_struct *tsk, struct audit_context *ctx)
 {
 	struct audit_names *n;
 
-	if (audit_pid && tsk->tgid == audit_pid)
+	if (init_audit_ns.pid && tsk->tgid == init_audit_ns.pid)
 		return;
 
 	rcu_read_lock();
@@ -2226,7 +2226,7 @@ int __audit_signal_info(int sig, struct task_struct *t)
 	struct audit_context *ctx = tsk->audit_context;
 	kuid_t uid = current_uid(), t_uid = task_uid(t);
 
-	if (audit_pid && t->tgid == audit_pid) {
+	if (init_audit_ns.pid && t->tgid == init_audit_ns.pid) {
 		if (sig == SIGTERM || sig == SIGHUP || sig == SIGUSR1 || sig == SIGUSR2) {
 			audit_sig_pid = tsk->pid;
 			if (uid_valid(tsk->loginuid))
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 06/20] audit: make kauditd_task per audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (4 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 05/20] audit: make audit_pid " Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 07/20] aduit: make audit_nlk_portid " Gao feng
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

kauditd_task is used to send audit netlink messages
to the user space auditd process. Because the netlink
messages are per audit namespace, we should make
kaudit_task per auditns to operate the right netlink
skb.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h | 12 +++++++++++
 kernel/audit.c                  | 47 +++++++++++++++++++++++++++--------------
 2 files changed, 43 insertions(+), 16 deletions(-)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index fdbb6c1..2c0eede 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -15,6 +15,7 @@ struct audit_namespace {
 	struct sk_buff_head queue;
 	/* queue of skbs to send to auditd when/if it comes back */
 	struct sk_buff_head hold_queue;
+	struct task_struct *kauditd_task;
 };
 
 extern struct audit_namespace init_audit_ns;
@@ -35,6 +36,17 @@ void put_audit_ns(struct audit_namespace *ns)
 		skb_queue_purge(&ns->queue);
 		skb_queue_purge(&ns->hold_queue);
 		kfree(ns);
+	} else if (atomic_read(&ns->count) == 1) {
+		/* If the last user of audit namespace is kauditd,
+		 * we should wake up kauditd and let it kill itself,
+		 * Then this audit namespace will be destroyed. */
+		struct task_struct *task;
+
+		rcu_read_lock();
+		task = ACCESS_ONCE(ns->kauditd_task);
+		if (task)
+			wake_up_process(task);
+		rcu_read_unlock();
 	}
 }
 #else
diff --git a/kernel/audit.c b/kernel/audit.c
index e5e8cb1..ceb1cbd 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -133,7 +133,6 @@ static DEFINE_SPINLOCK(audit_freelist_lock);
 static int	   audit_freelist_count;
 static LIST_HEAD(audit_freelist);
 
-static struct task_struct *kauditd_task;
 static DECLARE_WAIT_QUEUE_HEAD(kauditd_wait);
 static DECLARE_WAIT_QUEUE_HEAD(audit_backlog_wait);
 
@@ -410,20 +409,20 @@ static void kauditd_send_skb(struct sk_buff *skb)
  * in 5 years when I want to play with this again I'll see this
  * note and still have no friggin idea what i'm thinking today.
  */
-static void flush_hold_queue(void)
+static void flush_hold_queue(struct audit_namespace *ns)
 {
 	struct sk_buff *skb;
 
-	if (!audit_default || !init_audit_ns.pid)
+	if (!audit_default || !ns->pid)
 		return;
 
-	skb = skb_dequeue(&init_audit_ns.hold_queue);
+	skb = skb_dequeue(&ns->hold_queue);
 	if (likely(!skb))
 		return;
 
-	while (skb && init_audit_ns.pid) {
+	while (skb && ns->pid) {
 		kauditd_send_skb(skb);
-		skb = skb_dequeue(&init_audit_ns.hold_queue);
+		skb = skb_dequeue(&ns->hold_queue);
 	}
 
 	/*
@@ -436,17 +435,25 @@ static void flush_hold_queue(void)
 
 static int kauditd_thread(void *dummy)
 {
+	struct audit_namespace *ns = (struct audit_namespace *)dummy;
+
 	set_freezable();
 	while (!kthread_should_stop()) {
 		struct sk_buff *skb;
 		DECLARE_WAITQUEUE(wait, current);
 
-		flush_hold_queue();
+		/* Ok, We are the last user of this audit namespace,
+		 * it's time to go. Kill kauditd thread and release
+		 * the audit namespace. */
+		if (atomic_read(&ns->count) == 1)
+			break;
+
+		flush_hold_queue(ns);
 
-		skb = skb_dequeue(&init_audit_ns.queue);
+		skb = skb_dequeue(&ns->queue);
 		wake_up(&audit_backlog_wait);
 		if (skb) {
-			if (init_audit_ns.pid)
+			if (ns->pid)
 				kauditd_send_skb(skb);
 			else
 				audit_printk_skb(skb);
@@ -455,7 +462,7 @@ static int kauditd_thread(void *dummy)
 		set_current_state(TASK_INTERRUPTIBLE);
 		add_wait_queue(&kauditd_wait, &wait);
 
-		if (!skb_queue_len(&init_audit_ns.queue)) {
+		if (!skb_queue_len(&ns->queue)) {
 			try_to_freeze();
 			schedule();
 		}
@@ -463,6 +470,9 @@ static int kauditd_thread(void *dummy)
 		__set_current_state(TASK_RUNNING);
 		remove_wait_queue(&kauditd_wait, &wait);
 	}
+
+	ns->kauditd_task = NULL;
+	put_audit_ns(ns);
 	return 0;
 }
 
@@ -635,6 +645,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	u16			msg_type = nlh->nlmsg_type;
 	struct audit_sig_info   *sig_data;
 	char			*ctx = NULL;
+	struct audit_namespace	*ns = current_audit_ns();
 	u32			len;
 
 	err = audit_netlink_ok(skb, msg_type);
@@ -643,13 +654,16 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 
 	/* As soon as there's any sign of userspace auditd,
 	 * start kauditd to talk to it */
-	if (!kauditd_task) {
-		kauditd_task = kthread_run(kauditd_thread, NULL, "kauditd");
-		if (IS_ERR(kauditd_task)) {
-			err = PTR_ERR(kauditd_task);
-			kauditd_task = NULL;
-			return err;
+	if (!ns->kauditd_task) {
+		struct task_struct *tsk;
+		tsk = kthread_run(kauditd_thread,
+				  get_audit_ns(ns), "kauditd");
+		if (IS_ERR(tsk)) {
+			put_audit_ns(ns);
+			return PTR_ERR(tsk);
 		}
+
+		ns->kauditd_task = tsk;
 	}
 	seq  = nlh->nlmsg_seq;
 	data = nlmsg_data(nlh);
@@ -930,6 +944,7 @@ static int __init audit_init(void)
 		audit_sock->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
 
 	init_audit_ns.pid = 0;
+	init_audit_ns.kauditd_task = NULL;
 	skb_queue_head_init(&init_audit_ns.queue);
 	skb_queue_head_init(&init_audit_ns.hold_queue);
 	audit_initialized = AUDIT_INITIALIZED;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 07/20] aduit: make audit_nlk_portid per audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (5 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 06/20] audit: make kauditd_task " Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 08/20] audit: make kaudit_wait queue " Gao feng
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

We should use audit_nlk_portid to decide to send
audit netlink message to which auditd processes.
it should be per audit namespace too.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h |  2 ++
 kernel/audit.c                  | 14 ++++----------
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index 2c0eede..a9e6a40 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -11,6 +11,8 @@ struct audit_namespace {
 	atomic_t count;
 	/* pid of the auditd process */
 	int pid;
+	/* portid of the auditd process's netlink socket */
+	int portid;
 	struct user_namespace *user_ns;
 	struct sk_buff_head queue;
 	/* queue of skbs to send to auditd when/if it comes back */
diff --git a/kernel/audit.c b/kernel/audit.c
index ceb1cbd..37375fb 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -89,13 +89,6 @@ static int	audit_default;
 /* If auditing cannot proceed, audit_failure selects what happens. */
 static int	audit_failure = AUDIT_FAIL_PRINTK;
 
-/*
- * If audit records are to be written to the netlink socket, audit_pid
- * contains the pid of the auditd process and audit_nlk_portid contains
- * the portid to use to send netlink messages to that process.
- */
-static int	audit_nlk_portid;
-
 /* If audit_rate_limit is non-zero, limit the rate of sending audit records
  * to that number per second.  This prevents DoS attacks, but results in
  * audit records being dropped. */
@@ -381,7 +374,7 @@ static void kauditd_send_skb(struct sk_buff *skb)
 	int err;
 	/* take a reference in case we can't send it and we want to hold it */
 	skb_get(skb);
-	err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 0);
+	err = netlink_unicast(audit_sock, skb, init_audit_ns.portid, 0);
 	if (err < 0) {
 		BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
 		printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", init_audit_ns.pid);
@@ -645,7 +638,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	u16			msg_type = nlh->nlmsg_type;
 	struct audit_sig_info   *sig_data;
 	char			*ctx = NULL;
-	struct audit_namespace	*ns = current_audit_ns();
+	struct audit_namespace	*ns = current->nsproxy->audit_ns;
 	u32			len;
 
 	err = audit_netlink_ok(skb, msg_type);
@@ -721,7 +714,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 			init_audit_ns.pid = new_pid;
 			rcu_read_unlock();
 
-			audit_nlk_portid = NETLINK_CB(skb).portid;
+			init_audit_ns.portid = NETLINK_CB(skb).portid;
 		}
 		if (status_get->mask & AUDIT_STATUS_RATE_LIMIT) {
 			err = audit_set_rate_limit(status_get->rate_limit);
@@ -944,6 +937,7 @@ static int __init audit_init(void)
 		audit_sock->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
 
 	init_audit_ns.pid = 0;
+	init_audit_ns.portid = 0;
 	init_audit_ns.kauditd_task = NULL;
 	skb_queue_head_init(&init_audit_ns.queue);
 	skb_queue_head_init(&init_audit_ns.hold_queue);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 08/20] audit: make kaudit_wait queue per audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (6 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 07/20] aduit: make audit_nlk_portid " Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 09/20] audit: make audit_backlog_wait " Gao feng
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

kauditd_task is added to the wait queue kaudit_wait when
there is no audit message being generated in audit namespace,
so the kaudit_wait should be per audit namespace too.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h | 2 ++
 kernel/audit.c                  | 8 ++++----
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index a9e6a40..2888238 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -4,6 +4,7 @@
 #include <linux/audit.h>
 #include <linux/atomic.h>
 #include <linux/slab.h>
+#include <linux/wait.h>
 #include <linux/skbuff.h>
 #include <linux/user_namespace.h>
 
@@ -18,6 +19,7 @@ struct audit_namespace {
 	/* queue of skbs to send to auditd when/if it comes back */
 	struct sk_buff_head hold_queue;
 	struct task_struct *kauditd_task;
+	wait_queue_head_t kauditd_wait;
 };
 
 extern struct audit_namespace init_audit_ns;
diff --git a/kernel/audit.c b/kernel/audit.c
index 37375fb..a4bfd7f 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -126,7 +126,6 @@ static DEFINE_SPINLOCK(audit_freelist_lock);
 static int	   audit_freelist_count;
 static LIST_HEAD(audit_freelist);
 
-static DECLARE_WAIT_QUEUE_HEAD(kauditd_wait);
 static DECLARE_WAIT_QUEUE_HEAD(audit_backlog_wait);
 
 /* Serialize requests from userspace. */
@@ -453,7 +452,7 @@ static int kauditd_thread(void *dummy)
 			continue;
 		}
 		set_current_state(TASK_INTERRUPTIBLE);
-		add_wait_queue(&kauditd_wait, &wait);
+		add_wait_queue(&init_audit_ns.kauditd_wait, &wait);
 
 		if (!skb_queue_len(&ns->queue)) {
 			try_to_freeze();
@@ -461,7 +460,7 @@ static int kauditd_thread(void *dummy)
 		}
 
 		__set_current_state(TASK_RUNNING);
-		remove_wait_queue(&kauditd_wait, &wait);
+		remove_wait_queue(&init_audit_ns.kauditd_wait, &wait);
 	}
 
 	ns->kauditd_task = NULL;
@@ -941,6 +940,7 @@ static int __init audit_init(void)
 	init_audit_ns.kauditd_task = NULL;
 	skb_queue_head_init(&init_audit_ns.queue);
 	skb_queue_head_init(&init_audit_ns.hold_queue);
+	init_waitqueue_head(&init_audit_ns.kauditd_wait);
 	audit_initialized = AUDIT_INITIALIZED;
 	audit_enabled = audit_default;
 	audit_ever_enabled |= !!audit_default;
@@ -1706,7 +1706,7 @@ void audit_log_end(struct audit_buffer *ab)
 
 		if (init_audit_ns.pid) {
 			skb_queue_tail(&init_audit_ns.queue, ab->skb);
-			wake_up_interruptible(&kauditd_wait);
+			wake_up_interruptible(&init_audit_ns.kauditd_wait);
 		} else {
 			audit_printk_skb(ab->skb);
 		}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 09/20] audit: make audit_backlog_wait per audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (7 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 08/20] audit: make kaudit_wait queue " Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 10/20] audit: allow un-init audit ns to change pid and portid only Gao feng
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Tasks are added to audit_backlog_wait when the
audit_skb_queue of audit namespace is full, so
audit_backlog_wait should be per audit namespace too.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h |  1 +
 kernel/audit.c                  | 11 +++++------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index 2888238..79a9b78 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -20,6 +20,7 @@ struct audit_namespace {
 	struct sk_buff_head hold_queue;
 	struct task_struct *kauditd_task;
 	wait_queue_head_t kauditd_wait;
+	wait_queue_head_t backlog_wait;
 };
 
 extern struct audit_namespace init_audit_ns;
diff --git a/kernel/audit.c b/kernel/audit.c
index a4bfd7f..d7a0993 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -126,8 +126,6 @@ static DEFINE_SPINLOCK(audit_freelist_lock);
 static int	   audit_freelist_count;
 static LIST_HEAD(audit_freelist);
 
-static DECLARE_WAIT_QUEUE_HEAD(audit_backlog_wait);
-
 /* Serialize requests from userspace. */
 DEFINE_MUTEX(audit_cmd_mutex);
 
@@ -443,7 +441,7 @@ static int kauditd_thread(void *dummy)
 		flush_hold_queue(ns);
 
 		skb = skb_dequeue(&ns->queue);
-		wake_up(&audit_backlog_wait);
+		wake_up(&init_audit_ns.backlog_wait);
 		if (skb) {
 			if (ns->pid)
 				kauditd_send_skb(skb);
@@ -941,6 +939,7 @@ static int __init audit_init(void)
 	skb_queue_head_init(&init_audit_ns.queue);
 	skb_queue_head_init(&init_audit_ns.hold_queue);
 	init_waitqueue_head(&init_audit_ns.kauditd_wait);
+	init_waitqueue_head(&init_audit_ns.backlog_wait);
 	audit_initialized = AUDIT_INITIALIZED;
 	audit_enabled = audit_default;
 	audit_ever_enabled |= !!audit_default;
@@ -1091,14 +1090,14 @@ static void wait_for_auditd(unsigned long sleep_time)
 {
 	DECLARE_WAITQUEUE(wait, current);
 	set_current_state(TASK_UNINTERRUPTIBLE);
-	add_wait_queue(&audit_backlog_wait, &wait);
+	add_wait_queue(&init_audit_ns.backlog_wait, &wait);
 
 	if (audit_backlog_limit &&
 	    skb_queue_len(&init_audit_ns.queue) > audit_backlog_limit)
 		schedule_timeout(sleep_time);
 
 	__set_current_state(TASK_RUNNING);
-	remove_wait_queue(&audit_backlog_wait, &wait);
+	remove_wait_queue(&init_audit_ns.backlog_wait, &wait);
 }
 
 /* Obtain an audit buffer.  This routine does locking to obtain the
@@ -1164,7 +1163,7 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 			       audit_backlog_limit);
 		audit_log_lost("backlog limit exceeded");
 		audit_backlog_wait_time = audit_backlog_wait_overflow;
-		wake_up(&audit_backlog_wait);
+		wake_up(&init_audit_ns.backlog_wait);
 		return NULL;
 	}
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 10/20] audit: allow un-init audit ns to change pid and portid only
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (8 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 09/20] audit: make audit_backlog_wait " Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 11/20] audit: use proper audit namespace in audit_receive_msg Gao feng
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Only these two vars are namespace aware.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 kernel/audit.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index d7a0993..2132929 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -685,16 +685,6 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		if (nlh->nlmsg_len < sizeof(struct audit_status))
 			return -EINVAL;
 		status_get   = (struct audit_status *)data;
-		if (status_get->mask & AUDIT_STATUS_ENABLED) {
-			err = audit_set_enabled(status_get->enabled);
-			if (err < 0)
-				return err;
-		}
-		if (status_get->mask & AUDIT_STATUS_FAILURE) {
-			err = audit_set_failure(status_get->failure);
-			if (err < 0)
-				return err;
-		}
 		if (status_get->mask & AUDIT_STATUS_PID) {
 			int new_pid = status_get->pid;
 			struct task_struct *task;
@@ -713,6 +703,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 
 			init_audit_ns.portid = NETLINK_CB(skb).portid;
 		}
+
+		/* Right now, only audit_pid and audit_portid are namesapce
+		 * aware. */
+		if (ns != &init_audit_ns)
+			return -EPERM;
+
+		if (status_get->mask & AUDIT_STATUS_ENABLED) {
+			err = audit_set_enabled(status_get->enabled);
+			if (err < 0)
+				return err;
+		}
+		if (status_get->mask & AUDIT_STATUS_FAILURE) {
+			err = audit_set_failure(status_get->failure);
+			if (err < 0)
+				return err;
+		}
 		if (status_get->mask & AUDIT_STATUS_RATE_LIMIT) {
 			err = audit_set_rate_limit(status_get->rate_limit);
 			if (err < 0)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 11/20] audit: use proper audit namespace in audit_receive_msg
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (9 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 10/20] audit: allow un-init audit ns to change pid and portid only Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 12/20] audit: use proper audit_namespace in kauditd_thread Gao feng
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 kernel/audit.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 2132929..5524deb 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -662,11 +662,11 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	case AUDIT_GET:
 		status_set.enabled	 = audit_enabled;
 		status_set.failure	 = audit_failure;
-		status_set.pid		 = init_audit_ns.pid;
+		status_set.pid		 = ns->pid;
 		status_set.rate_limit	 = audit_rate_limit;
 		status_set.backlog_limit = audit_backlog_limit;
 		status_set.lost		 = atomic_read(&audit_lost);
-		status_set.backlog	 = skb_queue_len(&init_audit_ns.queue);
+		status_set.backlog	 = skb_queue_len(&ns->queue);
 
 		rcu_read_lock();
 		{
@@ -696,12 +696,12 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 				new_pid = task_pid_nr_ns(task, &init_pid_ns);
 
 			if (audit_enabled != AUDIT_OFF)
-				audit_log_config_change("audit_pid", new_pid, init_audit_ns.pid, 1);
+				audit_log_config_change("audit_pid", new_pid, ns->pid, 1);
 
-			init_audit_ns.pid = new_pid;
+			ns->pid = new_pid;
 			rcu_read_unlock();
 
-			init_audit_ns.portid = NETLINK_CB(skb).portid;
+			ns->portid = NETLINK_CB(skb).portid;
 		}
 
 		/* Right now, only audit_pid and audit_portid are namesapce
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 12/20] audit: use proper audit_namespace in kauditd_thread
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (10 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 11/20] audit: use proper audit namespace in audit_receive_msg Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 13/20] audit: introduce new audit logging interface for audit namespace Gao feng
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 kernel/audit.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 5524deb..b203017 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -338,11 +338,11 @@ static int audit_set_failure(int state)
  * This only holds messages is audit_default is set, aka booting with audit=1
  * or building your kernel that way.
  */
-static void audit_hold_skb(struct sk_buff *skb)
+static void audit_hold_skb(struct audit_namespace *ns, struct sk_buff *skb)
 {
 	if (audit_default &&
-	    skb_queue_len(&init_audit_ns.hold_queue) < audit_backlog_limit)
-		skb_queue_tail(&init_audit_ns.hold_queue, skb);
+	    skb_queue_len(&ns->hold_queue) < audit_backlog_limit)
+		skb_queue_tail(&ns->hold_queue, skb);
 	else
 		kfree_skb(skb);
 }
@@ -351,7 +351,7 @@ static void audit_hold_skb(struct sk_buff *skb)
  * For one reason or another this nlh isn't getting delivered to the userspace
  * audit daemon, just send it to printk.
  */
-static void audit_printk_skb(struct sk_buff *skb)
+static void audit_printk_skb(struct audit_namespace *ns, struct sk_buff *skb)
 {
 	struct nlmsghdr *nlh = nlmsg_hdr(skb);
 	char *data = nlmsg_data(nlh);
@@ -363,22 +363,22 @@ static void audit_printk_skb(struct sk_buff *skb)
 			audit_log_lost("printk limit exceeded\n");
 	}
 
-	audit_hold_skb(skb);
+	audit_hold_skb(ns, skb);
 }
 
-static void kauditd_send_skb(struct sk_buff *skb)
+static void kauditd_send_skb(struct audit_namespace *ns, struct sk_buff *skb)
 {
 	int err;
 	/* take a reference in case we can't send it and we want to hold it */
 	skb_get(skb);
-	err = netlink_unicast(audit_sock, skb, init_audit_ns.portid, 0);
+	err = netlink_unicast(audit_sock, skb, ns->portid, 0);
 	if (err < 0) {
 		BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
-		printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", init_audit_ns.pid);
+		printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", ns->pid);
 		audit_log_lost("auditd disappeared\n");
-		init_audit_ns.pid = 0;
+		ns->pid = 0;
 		/* we might get lucky and get this in the next auditd */
-		audit_hold_skb(skb);
+		audit_hold_skb(ns, skb);
 	} else
 		/* drop the extra reference if sent ok */
 		consume_skb(skb);
@@ -411,7 +411,7 @@ static void flush_hold_queue(struct audit_namespace *ns)
 		return;
 
 	while (skb && ns->pid) {
-		kauditd_send_skb(skb);
+		kauditd_send_skb(ns, skb);
 		skb = skb_dequeue(&ns->hold_queue);
 	}
 
@@ -441,16 +441,16 @@ static int kauditd_thread(void *dummy)
 		flush_hold_queue(ns);
 
 		skb = skb_dequeue(&ns->queue);
-		wake_up(&init_audit_ns.backlog_wait);
+		wake_up(&ns->backlog_wait);
 		if (skb) {
 			if (ns->pid)
-				kauditd_send_skb(skb);
+				kauditd_send_skb(ns, skb);
 			else
-				audit_printk_skb(skb);
+				audit_printk_skb(ns, skb);
 			continue;
 		}
 		set_current_state(TASK_INTERRUPTIBLE);
-		add_wait_queue(&init_audit_ns.kauditd_wait, &wait);
+		add_wait_queue(&ns->kauditd_wait, &wait);
 
 		if (!skb_queue_len(&ns->queue)) {
 			try_to_freeze();
@@ -458,7 +458,7 @@ static int kauditd_thread(void *dummy)
 		}
 
 		__set_current_state(TASK_RUNNING);
-		remove_wait_queue(&init_audit_ns.kauditd_wait, &wait);
+		remove_wait_queue(&ns->kauditd_wait, &wait);
 	}
 
 	ns->kauditd_task = NULL;
@@ -1713,7 +1713,7 @@ void audit_log_end(struct audit_buffer *ab)
 			skb_queue_tail(&init_audit_ns.queue, ab->skb);
 			wake_up_interruptible(&init_audit_ns.kauditd_wait);
 		} else {
-			audit_printk_skb(ab->skb);
+			audit_printk_skb(&init_audit_ns, ab->skb);
 		}
 		ab->skb = NULL;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 13/20] audit: introduce new audit logging interface for audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (11 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 12/20] audit: use proper audit_namespace in kauditd_thread Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:31 ` [PATCH 14/20] audit: pass proper audit namespace to audit_log_common_recv_msg Gao feng
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

This interface audit_log_start_ns and audit_log_end_ns
will be used for logging audit logs in audit namespace.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit.h | 26 +++++++++++++--
 kernel/audit.c        | 92 ++++++++++++++++++++++++++++++---------------------
 2 files changed, 77 insertions(+), 41 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 729a4d1..717e1d1 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -43,6 +43,7 @@ struct mq_attr;
 struct mqstat;
 struct audit_watch;
 struct audit_tree;
+struct audit_namespace;
 
 struct audit_krule {
 	int			vers_ops;
@@ -421,10 +422,19 @@ extern __printf(4, 5)
 void audit_log(struct audit_context *ctx, gfp_t gfp_mask, int type,
 	       const char *fmt, ...);
 
-extern struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, int type);
+extern struct audit_buffer *
+audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, int type);
+
+extern struct audit_buffer *
+audit_log_start_ns(struct audit_namespace *ns,
+		   struct audit_context *ctx,
+		   gfp_t gfp_mask, int type);
+
 extern __printf(2, 3)
 void audit_log_format(struct audit_buffer *ab, const char *fmt, ...);
 extern void		    audit_log_end(struct audit_buffer *ab);
+extern void		    audit_log_end_ns(struct audit_namespace *ns,
+					     struct audit_buffer *ab);
 extern int		    audit_string_contains_control(const char *string,
 							  size_t len);
 extern void		    audit_log_n_hex(struct audit_buffer *ab,
@@ -470,8 +480,15 @@ static inline __printf(4, 5)
 void audit_log(struct audit_context *ctx, gfp_t gfp_mask, int type,
 	       const char *fmt, ...)
 { }
-static inline struct audit_buffer *audit_log_start(struct audit_context *ctx,
-						   gfp_t gfp_mask, int type)
+static inline struct audit_buffer *
+audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, int type)
+{
+	return NULL;
+}
+static inline struct audit_buffer *
+audit_log_start_ns(struct audit_namespace *ns,
+		   struct audit_context *ctx,
+		   gfp_t gfp_mask, int type)
 {
 	return NULL;
 }
@@ -480,6 +497,9 @@ void audit_log_format(struct audit_buffer *ab, const char *fmt, ...)
 { }
 static inline void audit_log_end(struct audit_buffer *ab)
 { }
+static inline void audit_log_end_ns(struct audit_namespace *ns,
+				    struct audit_buffer *ab)
+{ }
 static inline void audit_log_n_hex(struct audit_buffer *ab,
 				   const unsigned char *buf, size_t len)
 { }
diff --git a/kernel/audit.c b/kernel/audit.c
index b203017..5ac7365 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1092,18 +1092,19 @@ static inline void audit_get_stamp(struct audit_context *ctx,
 /*
  * Wait for auditd to drain the queue a little
  */
-static void wait_for_auditd(unsigned long sleep_time)
+static void wait_for_auditd(struct audit_namespace *ns,
+			    unsigned long sleep_time)
 {
 	DECLARE_WAITQUEUE(wait, current);
 	set_current_state(TASK_UNINTERRUPTIBLE);
-	add_wait_queue(&init_audit_ns.backlog_wait, &wait);
+	add_wait_queue(&ns->backlog_wait, &wait);
 
 	if (audit_backlog_limit &&
-	    skb_queue_len(&init_audit_ns.queue) > audit_backlog_limit)
+	    skb_queue_len(&ns->queue) > audit_backlog_limit)
 		schedule_timeout(sleep_time);
 
 	__set_current_state(TASK_RUNNING);
-	remove_wait_queue(&init_audit_ns.backlog_wait, &wait);
+	remove_wait_queue(&ns->backlog_wait, &wait);
 }
 
 /* Obtain an audit buffer.  This routine does locking to obtain the
@@ -1113,23 +1114,10 @@ static void wait_for_auditd(unsigned long sleep_time)
  * will be written at syscall exit.  If there is no associated task, tsk
  * should be NULL. */
 
-/**
- * audit_log_start - obtain an audit buffer
- * @ctx: audit_context (may be NULL)
- * @gfp_mask: type of allocation
- * @type: audit message type
- *
- * Returns audit_buffer pointer on success or NULL on error.
- *
- * Obtain an audit buffer.  This routine does locking to obtain the
- * audit buffer, but then no locking is required for calls to
- * audit_log_*format.  If the task (ctx) is a task that is currently in a
- * syscall, then the syscall is marked as auditable and an audit record
- * will be written at syscall exit.  If there is no associated task, then
- * task context (ctx) should be NULL.
- */
-struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
-				     int type)
+struct audit_buffer *
+audit_log_start_ns(struct audit_namespace *ns,
+		   struct audit_context *ctx,
+		   gfp_t gfp_mask, int type)
 {
 	struct audit_buffer	*ab	= NULL;
 	struct timespec		t;
@@ -1150,14 +1138,14 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 				entries over the normal backlog limit */
 
 	while (audit_backlog_limit
-	       && skb_queue_len(&init_audit_ns.queue) > audit_backlog_limit + reserve) {
+	       && skb_queue_len(&ns->queue) > audit_backlog_limit + reserve) {
 		if (gfp_mask & __GFP_WAIT && audit_backlog_wait_time) {
 			unsigned long sleep_time;
 
 			sleep_time = timeout_start + audit_backlog_wait_time -
 					jiffies;
 			if ((long)sleep_time > 0) {
-				wait_for_auditd(sleep_time);
+				wait_for_auditd(ns, sleep_time);
 				continue;
 			}
 		}
@@ -1165,7 +1153,7 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 			printk(KERN_WARNING
 			       "audit: audit_backlog=%d > "
 			       "audit_backlog_limit=%d\n",
-			       skb_queue_len(&init_audit_ns.queue),
+			       skb_queue_len(&ns->queue),
 			       audit_backlog_limit);
 		audit_log_lost("backlog limit exceeded");
 		audit_backlog_wait_time = audit_backlog_wait_overflow;
@@ -1187,6 +1175,27 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 }
 
 /**
+ * audit_log_start - obtain an audit buffer
+ * @ctx: audit_context (may be NULL)
+ * @gfp_mask: type of allocation
+ * @type: audit message type
+ *
+ * Returns audit_buffer pointer on success or NULL on error.
+ *
+ * Obtain an audit buffer.  This routine does locking to obtain the
+ * audit buffer, but then no locking is required for calls to
+ * audit_log_*format.  If the task (ctx) is a task that is currently in a
+ * syscall, then the syscall is marked as auditable and an audit record
+ * will be written at syscall exit.  If there is no associated task, then
+ * task context (ctx) should be NULL.
+ */
+struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
+				     int type)
+{
+	return audit_log_start_ns(&init_audit_ns, ctx, gfp_mask, type);
+}
+
+/**
  * audit_expand - expand skb in the audit buffer
  * @ab: audit_buffer
  * @extra: space to add at tail of the skb
@@ -1690,16 +1699,7 @@ out:
 	kfree(name);
 }
 
-/**
- * audit_log_end - end one audit record
- * @ab: the audit_buffer
- *
- * The netlink_* functions cannot be called inside an irq context, so
- * the audit buffer is placed on a queue and a tasklet is scheduled to
- * remove them from the queue outside the irq context.  May be called in
- * any context.
- */
-void audit_log_end(struct audit_buffer *ab)
+void audit_log_end_ns(struct audit_namespace *ns, struct audit_buffer *ab)
 {
 	if (!ab)
 		return;
@@ -1709,11 +1709,11 @@ void audit_log_end(struct audit_buffer *ab)
 		struct nlmsghdr *nlh = nlmsg_hdr(ab->skb);
 		nlh->nlmsg_len = ab->skb->len - NLMSG_HDRLEN;
 
-		if (init_audit_ns.pid) {
-			skb_queue_tail(&init_audit_ns.queue, ab->skb);
-			wake_up_interruptible(&init_audit_ns.kauditd_wait);
+		if (ns->pid) {
+			skb_queue_tail(&ns->queue, ab->skb);
+			wake_up_interruptible(&ns->kauditd_wait);
 		} else {
-			audit_printk_skb(&init_audit_ns, ab->skb);
+			audit_printk_skb(ns, ab->skb);
 		}
 		ab->skb = NULL;
 	}
@@ -1721,6 +1721,20 @@ void audit_log_end(struct audit_buffer *ab)
 }
 
 /**
+ * audit_log_end - end one audit record
+ * @ab: the audit_buffer
+ *
+ * The netlink_* functions cannot be called inside an irq context, so
+ * the audit buffer is placed on a queue and a tasklet is scheduled to
+ * remove them from the queue outside the irq context.  May be called in
+ * any context.
+ */
+void audit_log_end(struct audit_buffer *ab)
+{
+	return audit_log_end_ns(&init_audit_ns, ab);
+}
+
+/**
  * audit_log - Log an audit record
  * @ctx: audit context
  * @gfp_mask: type of allocation
@@ -1774,6 +1788,8 @@ EXPORT_SYMBOL(audit_log_secctx);
 #endif
 
 EXPORT_SYMBOL(audit_log_start);
+EXPORT_SYMBOL(audit_log_start_ns);
 EXPORT_SYMBOL(audit_log_end);
+EXPORT_SYMBOL(audit_log_end_ns);
 EXPORT_SYMBOL(audit_log_format);
 EXPORT_SYMBOL(audit_log);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 14/20] audit: pass proper audit namespace to audit_log_common_recv_msg
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (12 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 13/20] audit: introduce new audit logging interface for audit namespace Gao feng
@ 2013-10-24  7:31 ` Gao feng
  2013-10-24  7:32 ` [PATCH 15/20] audit: Log audit pid config change in audit namespace Gao feng
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:31 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

The audit log that generated in audit namespace should be
received by the auditd running in this audit namespace.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 kernel/audit.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 5ac7365..92da21d 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -605,7 +605,8 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
 	return err;
 }
 
-static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type)
+static int audit_log_common_recv_msg(struct audit_namespace *ns,
+				     struct audit_buffer **ab, u16 msg_type)
 {
 	int rc = 0;
 	uid_t uid = from_kuid(&init_user_ns, current_uid());
@@ -615,7 +616,7 @@ static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type)
 		return rc;
 	}
 
-	*ab = audit_log_start(NULL, GFP_KERNEL, msg_type);
+	*ab = audit_log_start_ns(ns, NULL, GFP_KERNEL, msg_type);
 	if (unlikely(!*ab))
 		return rc;
 	audit_log_format(*ab, "pid=%d uid=%u", task_tgid_vnr(current), uid);
@@ -741,7 +742,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 				if (err)
 					break;
 			}
-			audit_log_common_recv_msg(&ab, msg_type);
+			audit_log_common_recv_msg(ns, &ab, msg_type);
 			if (msg_type != AUDIT_USER_TTY)
 				audit_log_format(ab, " msg='%.1024s'",
 						 (char *)data);
@@ -756,7 +757,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 				audit_log_n_untrustedstring(ab, data, size);
 			}
 			audit_set_pid(ab, NETLINK_CB(skb).portid);
-			audit_log_end(ab);
+			audit_log_end_ns(ns, ab);
 		}
 		break;
 	case AUDIT_ADD_RULE:
@@ -764,9 +765,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		if (nlmsg_len(nlh) < sizeof(struct audit_rule_data))
 			return -EINVAL;
 		if (audit_enabled == AUDIT_LOCKED) {
-			audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
+			audit_log_common_recv_msg(ns, &ab, AUDIT_CONFIG_CHANGE);
 			audit_log_format(ab, " audit_enabled=%d res=0", audit_enabled);
-			audit_log_end(ab);
+			audit_log_end_ns(ns, ab);
 			return -EPERM;
 		}
 		/* fallthrough */
@@ -776,9 +777,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		break;
 	case AUDIT_TRIM:
 		audit_trim_trees();
-		audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
+		audit_log_common_recv_msg(ns, &ab, AUDIT_CONFIG_CHANGE);
 		audit_log_format(ab, " op=trim res=1");
-		audit_log_end(ab);
+		audit_log_end_ns(ns, ab);
 		break;
 	case AUDIT_MAKE_EQUIV: {
 		void *bufp = data;
@@ -806,14 +807,14 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		/* OK, here comes... */
 		err = audit_tag_tree(old, new);
 
-		audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
+		audit_log_common_recv_msg(ns, &ab, AUDIT_CONFIG_CHANGE);
 
 		audit_log_format(ab, " op=make_equiv old=");
 		audit_log_untrustedstring(ab, old);
 		audit_log_format(ab, " new=");
 		audit_log_untrustedstring(ab, new);
 		audit_log_format(ab, " res=%d", !err);
-		audit_log_end(ab);
+		audit_log_end_ns(ns, ab);
 		kfree(old);
 		kfree(new);
 		break;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 15/20] audit: Log audit pid config change in audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (13 preceding siblings ...)
  2013-10-24  7:31 ` [PATCH 14/20] audit: pass proper audit namespace to audit_log_common_recv_msg Gao feng
@ 2013-10-24  7:32 ` Gao feng
  2013-10-24  7:32 ` [PATCH 16/20] audit: allow GET,SET,USER MSG operations " Gao feng
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:32 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

This patch allow to log audit config change in
audit namespace.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 kernel/audit.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 92da21d..095f54d 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -252,13 +252,15 @@ void audit_log_lost(const char *message)
 	}
 }
 
-static int audit_log_config_change(char *function_name, int new, int old,
+static int audit_log_config_change(struct audit_namespace *ns,
+				   char *function_name,
+				   int new, int old,
 				   int allow_changes)
 {
 	struct audit_buffer *ab;
 	int rc = 0;
 
-	ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
+	ab = audit_log_start_ns(ns, NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
 	if (unlikely(!ab))
 		return rc;
 	audit_log_format(ab, "%s=%d old=%d", function_name, new, old);
@@ -267,7 +269,7 @@ static int audit_log_config_change(char *function_name, int new, int old,
 	if (rc)
 		allow_changes = 0; /* Something weird, deny request */
 	audit_log_format(ab, " res=%d", allow_changes);
-	audit_log_end(ab);
+	audit_log_end_ns(ns, ab);
 	return rc;
 }
 
@@ -282,7 +284,10 @@ static int audit_do_config_change(char *function_name, int *to_change, int new)
 		allow_changes = 1;
 
 	if (audit_enabled != AUDIT_OFF) {
-		rc = audit_log_config_change(function_name, new, old, allow_changes);
+		rc = audit_log_config_change(&init_audit_ns,
+					     function_name,
+					     new, old,
+					     allow_changes);
 		if (rc)
 			allow_changes = 0;
 	}
@@ -697,7 +702,10 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 				new_pid = task_pid_nr_ns(task, &init_pid_ns);
 
 			if (audit_enabled != AUDIT_OFF)
-				audit_log_config_change("audit_pid", new_pid, ns->pid, 1);
+				audit_log_config_change(&init_audit_ns,
+							"audit_pid",
+							new_pid,
+							ns->pid, 1);
 
 			ns->pid = new_pid;
 			rcu_read_unlock();
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 16/20] audit: allow GET,SET,USER MSG operations in audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (14 preceding siblings ...)
  2013-10-24  7:32 ` [PATCH 15/20] audit: Log audit pid config change in audit namespace Gao feng
@ 2013-10-24  7:32 ` Gao feng
  2013-12-06 22:00   ` [PATCH 16/20] audit: allow GET, SET, USER " Serge E. Hallyn
  2013-10-24  7:32 ` [PATCH 17/20] nsproxy: don't make create_new_namespaces static Gao feng
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:32 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

1, remove the permission check of pid namespace. it's no reason
   to deny un-init pid namespace to operate audit subsystem.

2, only allow init user namespace and init audit namespace to
   operate list/add/del rule, tty set, trim, make equiv operations.

3, allow audit namespace to get/set audit configuration, send
   userspace audit message.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 kernel/audit.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 095f54d..c4d4291 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -573,11 +573,7 @@ out:
 static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
 {
 	int err = 0;
-
-	/* Only support the initial namespaces for now. */
-	if ((current_user_ns() != &init_user_ns) ||
-	    (task_active_pid_ns(current) != &init_pid_ns))
-		return -EPERM;
+	struct audit_namespace *ns = current->nsproxy->audit_ns;
 
 	switch (msg_type) {
 	case AUDIT_LIST:
@@ -586,6 +582,7 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
 		return -EOPNOTSUPP;
 	case AUDIT_GET:
 	case AUDIT_SET:
+		break;
 	case AUDIT_LIST_RULES:
 	case AUDIT_ADD_RULE:
 	case AUDIT_DEL_RULE:
@@ -594,13 +591,15 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
 	case AUDIT_TTY_SET:
 	case AUDIT_TRIM:
 	case AUDIT_MAKE_EQUIV:
-		if (!capable(CAP_AUDIT_CONTROL))
+		if ((current_user_ns() != &init_user_ns) ||
+		    (ns != &init_audit_ns) ||
+		    !capable(CAP_AUDIT_CONTROL))
 			err = -EPERM;
 		break;
 	case AUDIT_USER:
 	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
 	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
-		if (!capable(CAP_AUDIT_WRITE))
+		if (!ns_capable(ns->user_ns, CAP_AUDIT_WRITE))
 			err = -EPERM;
 		break;
 	default:  /* bad msg */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 17/20] nsproxy: don't make create_new_namespaces static
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (15 preceding siblings ...)
  2013-10-24  7:32 ` [PATCH 16/20] audit: allow GET,SET,USER MSG operations " Gao feng
@ 2013-10-24  7:32 ` Gao feng
  2013-10-24  7:32 ` [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS Gao feng
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:32 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

audit moudule will use create_new_namespaces to
create new nsproxy.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/nsproxy.h | 4 ++++
 kernel/nsproxy.c        | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index dc7af11..7ae15fc 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -73,6 +73,10 @@ void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
 void free_nsproxy(struct nsproxy *ns);
 int unshare_nsproxy_namespaces(unsigned long, struct nsproxy **,
 	struct cred *, struct fs_struct *);
+struct nsproxy *create_new_namespaces(unsigned long flags,
+				      struct task_struct *tsk,
+				      struct user_namespace *user_ns,
+				      struct fs_struct *new_fs);
 int __init nsproxy_cache_init(void);
 
 static inline void put_nsproxy(struct nsproxy *ns)
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index e8374aa..a8cf2f4 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -60,7 +60,7 @@ static inline struct nsproxy *create_nsproxy(void)
  * Return the newly created nsproxy.  Do not attach this to the task,
  * leave it to the caller to do proper locking and attach it to task.
  */
-static struct nsproxy *create_new_namespaces(unsigned long flags,
+struct nsproxy *create_new_namespaces(unsigned long flags,
 	struct task_struct *tsk, struct user_namespace *user_ns,
 	struct fs_struct *new_fs)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (16 preceding siblings ...)
  2013-10-24  7:32 ` [PATCH 17/20] nsproxy: don't make create_new_namespaces static Gao feng
@ 2013-10-24  7:32 ` Gao feng
  2013-12-06 22:10   ` Serge E. Hallyn
  2013-10-24  7:32 ` [PATCH 19/20] audit: make audit_backlog_limit per audit namespace Gao feng
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:32 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Since there is no more place for flags of clone system call.
we need to find a way to create audit namespace.

this patch add a new type of message AUDIT_CREATE_NS.
user space can create new audit namespace through
netlink.

Right now, The privileged user in user namespace is allowed
to create audit namespace. it means the unprivileged user can
create an user namespace and then create audit namespace.

Looks like it is not safe, but even the unprivileged user can
create audit namespace, it can do no harm to the host. un-init
audit namespace cann't effect the host.

In the follow patches, the audit_backlog_limit will be per
audit namesapace, but only the privileged user has rights to
modify it. and the default value of audit_backlog_limit for
uninit audit namespace will be set to 0.

And the audit_rate_limit will be limited too.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h |  7 +++++++
 include/uapi/linux/audit.h      |  1 +
 kernel/audit.c                  | 22 ++++++++++++++++++++++
 kernel/audit_namespace.c        | 29 +++++++++++++++++++++++++++++
 4 files changed, 59 insertions(+)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index 79a9b78..b17f052 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -54,6 +54,8 @@ void put_audit_ns(struct audit_namespace *ns)
 		rcu_read_unlock();
 	}
 }
+
+extern int unshare_audit_namespace(void);
 #else
 static inline
 struct audit_namespace *get_audit_ns(struct audit_namespace *ns)
@@ -66,6 +68,11 @@ void put_audit_ns(struct audit_namespace *ns)
 {
 
 }
+
+static inline int unshare_audit_namespace()
+{
+	return -EINVAL;
+}
 #endif
 
 static inline struct
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 75cef3f..877d509 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -68,6 +68,7 @@
 #define AUDIT_MAKE_EQUIV	1015	/* Append to watched tree */
 #define AUDIT_TTY_GET		1016	/* Get TTY auditing status */
 #define AUDIT_TTY_SET		1017	/* Set TTY auditing status */
+#define AUDIT_CREATE_NS		1018	/* Create new audit namespace */
 
 #define AUDIT_FIRST_USER_MSG	1100	/* Userspace messages mostly uninteresting to kernel */
 #define AUDIT_USER_AVC		1107	/* We filter this differently */
diff --git a/kernel/audit.c b/kernel/audit.c
index c4d4291..86212d3 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -596,6 +596,12 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
 		    !capable(CAP_AUDIT_CONTROL))
 			err = -EPERM;
 		break;
+	case AUDIT_CREATE_NS:
+		/* Allow privileged user in user namespace to
+		 * create audit namespace */
+		if (!ns_capable(current_user_ns(), CAP_AUDIT_CONTROL))
+			err = -EPERM;
+		break;
 	case AUDIT_USER:
 	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
 	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
@@ -735,6 +741,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		if (status_get->mask & AUDIT_STATUS_BACKLOG_LIMIT)
 			err = audit_set_backlog_limit(status_get->backlog_limit);
 		break;
+	case AUDIT_CREATE_NS:
+		err = unshare_audit_namespace();
+
+		if (audit_enabled == AUDIT_OFF)
+			break;
+
+		ab = audit_log_start_ns(ns, NULL, GFP_KERNEL, AUDIT_CREATE_NS);
+		if (ab) {
+			audit_log_format(ab, "Create audit namespace");
+			audit_log_session_info(ab);
+			audit_log_task_context(ab);
+			audit_log_format(ab, "res=%d", err ? 0 : 1);
+			audit_log_end_ns(ns, ab);
+		}
+
+		break;
 	case AUDIT_USER:
 	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
 	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
diff --git a/kernel/audit_namespace.c b/kernel/audit_namespace.c
index 6d9cb8f..28c608e 100644
--- a/kernel/audit_namespace.c
+++ b/kernel/audit_namespace.c
@@ -6,3 +6,32 @@ struct audit_namespace init_audit_ns = {
 	.user_ns = &init_user_ns,
 };
 EXPORT_SYMBOL_GPL(init_audit_ns);
+
+int unshare_audit_namespace(void)
+{
+	struct task_struct *tsk = current;
+	struct audit_namespace *new_audit = NULL;
+	struct nsproxy *new_nsp;
+
+	new_audit = kzalloc(sizeof(struct audit_namespace), GFP_KERNEL);
+	if (!new_audit)
+		return -ENOMEM;
+
+	skb_queue_head_init(&new_audit->queue);
+	skb_queue_head_init(&new_audit->hold_queue);
+	init_waitqueue_head(&new_audit->kauditd_wait);
+	init_waitqueue_head(&new_audit->backlog_wait);
+
+	new_nsp = create_new_namespaces(0, tsk, NULL, NULL);
+	if (IS_ERR(new_nsp)) {
+		kfree(new_audit);
+		return PTR_ERR(new_nsp);
+	}
+
+	new_audit->user_ns = get_user_ns(current_user_ns());
+	new_nsp->audit_ns = get_audit_ns(new_audit);
+
+	switch_task_namespaces(current, new_nsp);
+
+	return 0;
+}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 19/20] audit: make audit_backlog_limit per audit namespace
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (17 preceding siblings ...)
  2013-10-24  7:32 ` [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS Gao feng
@ 2013-10-24  7:32 ` Gao feng
  2013-10-24  7:32 ` [PATCH 20/20] audit: introduce /proc/<pid>/audit_backlog_limit Gao feng
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:32 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

This patch makes audit_backlog_limit per audit
namespace, so we can limit the memory usage of
audit namespace through limit the value of
auditns's audit_backlog_limit.

By default, the backlog_limit of new created
audit namespace is zero, so processes in this
auditns has no ability to generate audit log.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 include/linux/audit_namespace.h |  2 ++
 kernel/audit.c                  | 47 +++++++++++++++++++++++------------------
 2 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index b17f052..4648b4f 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -14,6 +14,8 @@ struct audit_namespace {
 	int pid;
 	/* portid of the auditd process's netlink socket */
 	int portid;
+	/* number of outstanding audit_buffers allowed */
+	int backlog_limit;
 	struct user_namespace *user_ns;
 	struct sk_buff_head queue;
 	/* queue of skbs to send to auditd when/if it comes back */
diff --git a/kernel/audit.c b/kernel/audit.c
index 86212d3..63797557 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -94,8 +94,6 @@ static int	audit_failure = AUDIT_FAIL_PRINTK;
  * audit records being dropped. */
 static int	audit_rate_limit;
 
-/* Number of outstanding audit_buffers allowed. */
-static int	audit_backlog_limit = 64;
 static int	audit_backlog_wait_time = 60 * HZ;
 static int	audit_backlog_wait_overflow = 0;
 
@@ -210,15 +208,7 @@ static inline int audit_rate_check(void)
 	return retval;
 }
 
-/**
- * audit_log_lost - conditionally log lost audit message event
- * @message: the message stating reason for lost audit message
- *
- * Emit at least 1 message per second, even if audit_rate_check is
- * throttling.
- * Always increment the lost messages counter.
-*/
-void audit_log_lost(const char *message)
+void audit_log_lost_ns(struct audit_namespace *ns, const char *message)
 {
 	static unsigned long	last_msg = 0;
 	static DEFINE_SPINLOCK(lock);
@@ -240,18 +230,31 @@ void audit_log_lost(const char *message)
 		spin_unlock_irqrestore(&lock, flags);
 	}
 
-	if (print) {
+	if (print && (ns == &init_audit_ns)) {
 		if (printk_ratelimit())
 			printk(KERN_WARNING
 				"audit: audit_lost=%d audit_rate_limit=%d "
 				"audit_backlog_limit=%d\n",
 				atomic_read(&audit_lost),
 				audit_rate_limit,
-				audit_backlog_limit);
+				ns->backlog_limit);
 		audit_panic(message);
 	}
 }
 
+/**
+ * audit_log_lost - conditionally log lost audit message event
+ * @message: the message stating reason for lost audit message
+ *
+ * Emit at least 1 message per second, even if audit_rate_check is
+ * throttling.
+ * Always increment the lost messages counter.
+*/
+void audit_log_lost(const char *message)
+{
+	return audit_log_lost_ns(&init_audit_ns, message);
+}
+
 static int audit_log_config_change(struct audit_namespace *ns,
 				   char *function_name,
 				   int new, int old,
@@ -308,7 +311,8 @@ static int audit_set_rate_limit(int limit)
 
 static int audit_set_backlog_limit(int limit)
 {
-	return audit_do_config_change("audit_backlog_limit", &audit_backlog_limit, limit);
+	return audit_do_config_change("audit_backlog_limit",
+				      &init_audit_ns.backlog_limit, limit);
 }
 
 static int audit_set_enabled(int state)
@@ -346,7 +350,7 @@ static int audit_set_failure(int state)
 static void audit_hold_skb(struct audit_namespace *ns, struct sk_buff *skb)
 {
 	if (audit_default &&
-	    skb_queue_len(&ns->hold_queue) < audit_backlog_limit)
+	    skb_queue_len(&ns->hold_queue) < ns->backlog_limit)
 		skb_queue_tail(&ns->hold_queue, skb);
 	else
 		kfree_skb(skb);
@@ -675,7 +679,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		status_set.failure	 = audit_failure;
 		status_set.pid		 = ns->pid;
 		status_set.rate_limit	 = audit_rate_limit;
-		status_set.backlog_limit = audit_backlog_limit;
+		status_set.backlog_limit = ns->backlog_limit;
 		status_set.lost		 = atomic_read(&audit_lost);
 		status_set.backlog	 = skb_queue_len(&ns->queue);
 
@@ -971,6 +975,7 @@ static int __init audit_init(void)
 
 	init_audit_ns.pid = 0;
 	init_audit_ns.portid = 0;
+	init_audit_ns.backlog_limit = 64;
 	init_audit_ns.kauditd_task = NULL;
 	skb_queue_head_init(&init_audit_ns.queue);
 	skb_queue_head_init(&init_audit_ns.hold_queue);
@@ -1129,8 +1134,8 @@ static void wait_for_auditd(struct audit_namespace *ns,
 	set_current_state(TASK_UNINTERRUPTIBLE);
 	add_wait_queue(&ns->backlog_wait, &wait);
 
-	if (audit_backlog_limit &&
-	    skb_queue_len(&ns->queue) > audit_backlog_limit)
+	if (ns->backlog_limit &&
+	    skb_queue_len(&ns->queue) > ns->backlog_limit)
 		schedule_timeout(sleep_time);
 
 	__set_current_state(TASK_RUNNING);
@@ -1167,8 +1172,8 @@ audit_log_start_ns(struct audit_namespace *ns,
 		reserve = 5; /* Allow atomic callers to go up to five
 				entries over the normal backlog limit */
 
-	while (audit_backlog_limit
-	       && skb_queue_len(&ns->queue) > audit_backlog_limit + reserve) {
+	while (ns->backlog_limit
+	       && skb_queue_len(&ns->queue) > ns->backlog_limit + reserve) {
 		if (gfp_mask & __GFP_WAIT && audit_backlog_wait_time) {
 			unsigned long sleep_time;
 
@@ -1184,7 +1189,7 @@ audit_log_start_ns(struct audit_namespace *ns,
 			       "audit: audit_backlog=%d > "
 			       "audit_backlog_limit=%d\n",
 			       skb_queue_len(&ns->queue),
-			       audit_backlog_limit);
+			       ns->backlog_limit);
 		audit_log_lost("backlog limit exceeded");
 		audit_backlog_wait_time = audit_backlog_wait_overflow;
 		wake_up(&init_audit_ns.backlog_wait);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 20/20] audit: introduce /proc/<pid>/audit_backlog_limit
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (18 preceding siblings ...)
  2013-10-24  7:32 ` [PATCH 19/20] audit: make audit_backlog_limit per audit namespace Gao feng
@ 2013-10-24  7:32 ` Gao feng
  2013-10-31  3:52 ` [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-10-24  7:32 UTC (permalink / raw)
  To: linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb,
	toshi.okajima, Gao feng

Since the backlog_limit of new created audit namespace
is zero, this audit namespace is unavailable, this patch
introdeces a proc file audit_backlog_limit. only privileged
user of host has rights to setup backlog_limit for audit
namespace. this prevents unprivileged user into costing
memory through create user namespace and then create audit
namespace.

Inder to keep the consistent behavior as before, for init
audit namespace, the backlog_limit can be changed only
through netlink interface.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 fs/proc/base.c                  | 53 +++++++++++++++++++++++++
 include/linux/audit_namespace.h |  7 ++++
 kernel/audit_namespace.c        | 86 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 146 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 1485e38..3553699 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -83,6 +83,7 @@
 #include <linux/elf.h>
 #include <linux/pid_namespace.h>
 #include <linux/user_namespace.h>
+#include <linux/audit_namespace.h>
 #include <linux/fs_struct.h>
 #include <linux/slab.h>
 #include <linux/flex_array.h>
@@ -2545,6 +2546,52 @@ static const struct file_operations proc_projid_map_operations = {
 };
 #endif /* CONFIG_USER_NS */
 
+#ifdef CONFIG_AUDIT_NS
+ssize_t proc_audit_backlog_read(struct file *file,
+				char __user *buf,
+				size_t count,
+				loff_t *ppos)
+{
+	struct task_struct *task = NULL;
+	ssize_t ret;
+
+	task = get_proc_task(file_inode(file));
+	if (!task)
+		return -ESRCH;
+
+	ret = audit_backlog_read(task, buf, count, ppos);
+
+	put_task_struct(task);
+
+	return ret;
+}
+
+ssize_t proc_audit_backlog_write(struct file *file,
+				 const char __user *buf,
+				 size_t size,
+				 loff_t *ppos)
+{
+	int ret;
+	struct task_struct *task;
+
+	task = get_proc_task(file_inode(file));
+	if (!task)
+		return -ESRCH;
+
+	ret = audit_backlog_write(task, buf, size, ppos);
+
+	put_task_struct(task);
+
+	return ret;
+}
+
+static const struct file_operations proc_audit_backlog_operations = {
+	.read		= proc_audit_backlog_read,
+	.write		= proc_audit_backlog_write,
+	.llseek		= generic_file_llseek,
+};
+#endif /* CONFIG_AUDIT_NS */
+
 static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns,
 				struct pid *pid, struct task_struct *task)
 {
@@ -2635,6 +2682,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("loginuid",   S_IWUSR|S_IRUGO, proc_loginuid_operations),
 	REG("sessionid",  S_IRUGO, proc_sessionid_operations),
 #endif
+#ifdef CONFIG_AUDIT_NS
+	REG("audit_backlog_limit", S_IRUGO|S_IWUSR, proc_audit_backlog_operations),
+#endif
 #ifdef CONFIG_FAULT_INJECTION
 	REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
 #endif
@@ -2973,6 +3023,9 @@ static const struct pid_entry tid_base_stuff[] = {
 	REG("loginuid",  S_IWUSR|S_IRUGO, proc_loginuid_operations),
 	REG("sessionid",  S_IRUGO, proc_sessionid_operations),
 #endif
+#ifdef CONFIG_AUDIT_NS
+	REG("audit_backlog_limit", S_IRUGO|S_IWUSR, proc_audit_backlog_operations),
+#endif
 #ifdef CONFIG_FAULT_INJECTION
 	REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
 #endif
diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
index 4648b4f..c913fe8 100644
--- a/include/linux/audit_namespace.h
+++ b/include/linux/audit_namespace.h
@@ -58,6 +58,13 @@ void put_audit_ns(struct audit_namespace *ns)
 }
 
 extern int unshare_audit_namespace(void);
+extern ssize_t audit_backlog_read(struct task_struct *,
+				  char __user *,
+				  size_t, loff_t *);
+extern ssize_t audit_backlog_write(struct task_struct *,
+				   const char __user *,
+				   size_t size, loff_t *);
+
 #else
 static inline
 struct audit_namespace *get_audit_ns(struct audit_namespace *ns)
diff --git a/kernel/audit_namespace.c b/kernel/audit_namespace.c
index 28c608e..e2319b8 100644
--- a/kernel/audit_namespace.c
+++ b/kernel/audit_namespace.c
@@ -1,5 +1,11 @@
 #include <linux/audit_namespace.h>
 #include <linux/export.h>
+#include <linux/proc_ns.h>
+#include <linux/seq_file.h>
+#include <linux/pid.h>
+#include <linux/fs.h>
+#include <linux/nsproxy.h>
+#include "audit.h"
 
 struct audit_namespace init_audit_ns = {
 	.count = ATOMIC_INIT(1),
@@ -35,3 +41,83 @@ int unshare_audit_namespace(void)
 
 	return 0;
 }
+
+#define TMPBUFLEN 21
+ssize_t audit_backlog_read(struct task_struct *tsk,
+			   char __user *buf,
+			   size_t count,
+			   loff_t *ppos)
+{
+	ssize_t len;
+	char tmpbuf[TMPBUFLEN];
+	struct audit_namespace *ns = NULL;
+
+	rcu_read_lock();
+	ns = get_audit_ns(tsk->nsproxy->audit_ns);
+	rcu_read_unlock();
+
+	mutex_lock(&audit_cmd_mutex);
+	len = scnprintf(tmpbuf, TMPBUFLEN, "%d", ns->backlog_limit);
+	mutex_unlock(&audit_cmd_mutex);
+
+	put_audit_ns(ns);
+
+	return simple_read_from_buffer(buf, count, ppos, tmpbuf, len);
+}
+
+ssize_t audit_backlog_write(struct task_struct *tsk,
+			    const char __user *buf,
+			    size_t size,
+			    loff_t *ppos)
+{
+	ssize_t ret = 0;
+	char *page = NULL, *tmp;
+	int audit_log_limit = 0;
+	struct audit_namespace *ns = NULL;
+
+	if (!capable(CAP_AUDIT_CONTROL))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	if (size >= PAGE_SIZE)
+		size = PAGE_SIZE - 1;
+
+	page = (char *)__get_free_page(GFP_TEMPORARY);
+	if (!page)
+		return -ENOMEM;
+
+	rcu_read_lock();
+	ns = get_audit_ns(tsk->nsproxy->audit_ns);
+	rcu_read_unlock();
+
+	if (!ns)
+		goto out_free_page;
+
+	/* Disallow to change init_audit_ns through proc interface */
+	ret = -EPERM;
+	if (ns == &init_audit_ns)
+		goto out_put_audit;
+
+	ret = -EFAULT;
+	if (copy_from_user(page, buf, size))
+		goto out_put_audit;
+
+	ret = -EINVAL;
+	page[size] = '\0';
+	audit_log_limit = simple_strtoul(page, &tmp, 10);
+	if (tmp == page)
+		goto out_put_audit;
+
+	mutex_lock(&audit_cmd_mutex);
+	ns->backlog_limit = audit_log_limit;
+	mutex_unlock(&audit_cmd_mutex);
+
+	ret = size;
+out_put_audit:
+	put_audit_ns(ns);
+out_free_page:
+	free_page((unsigned long) page);
+	return ret;
+}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (19 preceding siblings ...)
  2013-10-24  7:32 ` [PATCH 20/20] audit: introduce /proc/<pid>/audit_backlog_limit Gao feng
@ 2013-10-31  3:52 ` Gao feng
  2013-11-05  7:51   ` Gao feng
  2013-12-04  8:31 ` Gao feng
  2013-12-06 21:31 ` Serge E. Hallyn
  22 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-10-31  3:52 UTC (permalink / raw)
  To: eparis
  Cc: Gao feng, linux-kernel, linux-audit, containers, ebiederm,
	serge.hallyn, sgrubb, toshi.okajima, Richard Guy Briggs

Hi Eric Paris,

Can you give me some comments?

You think the tying audit namespace to user namespace is a bad idea,
so this patchset doesn't assign auditns to userns and introduce an
new audit netlink type to help to create audit namespace.

and this patchset also introduces an new proc interface to make
sure container can't influence the whole system.

and the audit rules are not namespace aware, all of audit namespaces
should comply with the rules. in next step, if we find it's need to
make audit rules per audit namespace, then it's the time to do that
job.

This patchset also makes all of net namespaces have ability to send/
receive audit netlink message.

I may miss some points, if you find there are some shortage or loophole,
please let me know.

Thanks!

On 10/24/2013 03:31 PM, Gao feng wrote:
> Here is the v1 patchset: http://lwn.net/Articles/549546/
> 
> The main target of this patchset is allowing user in audit
> namespace to generate the USER_MSG type of audit message,
> some userspace tools need to generate audit message, or
> these tools will broken.
> 
> And the login process in container may want to setup
> /proc/<pid>/loginuid, right now this value is unalterable
> once it being set. this will also broke the login problem
> in container. After this patchset, we can reset this loginuid
> to zero if task is running in a new audit namespace.
> 
> Same with v1 patchset, in this patchset, only the privileged
> user in init_audit_ns and init_user_ns has rights to
> add/del audit rules. and these rules are gloabl. all
> audit namespace will comply with the rules.
> 
> Compared with v1, v2 patch has some big changes.
> 1, the audit namespace is not assigned to user namespace.
>    since there is no available bit of flags for clone, we
>    create audit namespace through netlink, patch[18/20]
>    introduces a new audit netlink type AUDIT_CREATE_NS.
>    the privileged user in userns has rights to create a
>    audit namespace, it means the unprivileged user can
>    create auditns through create userns first. In order
>    to prevent them from doing harm to host, the default
>    audit_backlog_limit of un-init-audit-ns is zero(means
>    audit is unavailable in audit namespace). and it can't
>    be changed in auditns through netlink.
> 
> 2, introduce /proc/<pid>/audit_log_limit
>    this interface is used to setup log_limit of audit
>    namespace.  we need this interface to make audit
>    available in un-init-audit-ns. Only the privileged user
>    has right to set this value, it means only the root user
>    of host can change it.
> 
> 3, make audit namespace don't depend on net namespace.
>    patch[1/20] add a compare function audit_compare for
>    audit netlink, it always return true, it means the
>    netlink subsystem will find out the netlink socket
>    only through portid and netlink type. So we needn't
>    to create kernel side audit netlink socket for per
>    net namespace, all userspace audit netlink socket
>    can find out the audit_sock, and audit_sock can
>    communicate with them through the proper portid.
>    it's just like the behavior we don't have net
>    namespace before.
> 
> 
> This patchset still need some work, such as allow changing
> audit_enabled in audit namespace, auditd wants this feature.
> 
> I send this patchset now in order to get more comments, so
> I can keep on improving namespace support for audit.
> 
> Gao feng (20):
>   Audit: make audit netlink socket net namespace unaware
>   audit: introduce configure option CONFIG_AUDIT_NS
>   audit: make audit_skb_queue per audit namespace
>   audit: make audit_skb_hold_queue per audit namespace
>   audit: make audit_pid per audit namespace
>   audit: make kauditd_task per audit namespace
>   aduit: make audit_nlk_portid per audit namespace
>   audit: make kaudit_wait queue per audit namespace
>   audit: make audit_backlog_wait per audit namespace
>   audit: allow un-init audit ns to change pid and portid only
>   audit: use proper audit namespace in audit_receive_msg
>   audit: use proper audit_namespace in kauditd_thread
>   audit: introduce new audit logging interface for audit namespace
>   audit: pass proper audit namespace to audit_log_common_recv_msg
>   audit: Log audit pid config change in audit namespace
>   audit: allow GET,SET,USER MSG operations in audit namespace
>   nsproxy: don't make create_new_namespaces static
>   audit: add new message type AUDIT_CREATE_NS
>   audit: make audit_backlog_limit per audit namespace
>   audit: introduce /proc/<pid>/audit_backlog_limit
> 
>  fs/proc/base.c                  |  53 ++++++
>  include/linux/audit.h           |  26 ++-
>  include/linux/audit_namespace.h |  92 ++++++++++
>  include/linux/nsproxy.h         |  15 +-
>  include/uapi/linux/audit.h      |   1 +
>  init/Kconfig                    |  10 ++
>  kernel/Makefile                 |   2 +-
>  kernel/audit.c                  | 364 +++++++++++++++++++++++++---------------
>  kernel/audit.h                  |   5 +-
>  kernel/audit_namespace.c        | 123 ++++++++++++++
>  kernel/auditsc.c                |   6 +-
>  kernel/nsproxy.c                |  18 +-
>  12 files changed, 561 insertions(+), 154 deletions(-)
>  create mode 100644 include/linux/audit_namespace.h
>  create mode 100644 kernel/audit_namespace.c
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-10-31  3:52 ` [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
@ 2013-11-05  7:51   ` Gao feng
  2013-11-05  7:52     ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-11-05  7:51 UTC (permalink / raw)
  To: Gao feng
  Cc: eparis, linux-kernel, linux-audit, containers, ebiederm,
	serge.hallyn, sgrubb, toshi.okajima, Richard Guy Briggs

Ping...

On 10/31/2013 11:52 AM, Gao feng wrote:
> Hi Eric Paris,
> 
> Can you give me some comments?
> 
> You think the tying audit namespace to user namespace is a bad idea,
> so this patchset doesn't assign auditns to userns and introduce an
> new audit netlink type to help to create audit namespace.
> 
> and this patchset also introduces an new proc interface to make
> sure container can't influence the whole system.
> 
> and the audit rules are not namespace aware, all of audit namespaces
> should comply with the rules. in next step, if we find it's need to
> make audit rules per audit namespace, then it's the time to do that
> job.
> 
> This patchset also makes all of net namespaces have ability to send/
> receive audit netlink message.
> 
> I may miss some points, if you find there are some shortage or loophole,
> please let me know.
> 
> Thanks!
> 
> On 10/24/2013 03:31 PM, Gao feng wrote:
>> Here is the v1 patchset: http://lwn.net/Articles/549546/
>>
>> The main target of this patchset is allowing user in audit
>> namespace to generate the USER_MSG type of audit message,
>> some userspace tools need to generate audit message, or
>> these tools will broken.
>>
>> And the login process in container may want to setup
>> /proc/<pid>/loginuid, right now this value is unalterable
>> once it being set. this will also broke the login problem
>> in container. After this patchset, we can reset this loginuid
>> to zero if task is running in a new audit namespace.
>>
>> Same with v1 patchset, in this patchset, only the privileged
>> user in init_audit_ns and init_user_ns has rights to
>> add/del audit rules. and these rules are gloabl. all
>> audit namespace will comply with the rules.
>>
>> Compared with v1, v2 patch has some big changes.
>> 1, the audit namespace is not assigned to user namespace.
>>    since there is no available bit of flags for clone, we
>>    create audit namespace through netlink, patch[18/20]
>>    introduces a new audit netlink type AUDIT_CREATE_NS.
>>    the privileged user in userns has rights to create a
>>    audit namespace, it means the unprivileged user can
>>    create auditns through create userns first. In order
>>    to prevent them from doing harm to host, the default
>>    audit_backlog_limit of un-init-audit-ns is zero(means
>>    audit is unavailable in audit namespace). and it can't
>>    be changed in auditns through netlink.
>>
>> 2, introduce /proc/<pid>/audit_log_limit
>>    this interface is used to setup log_limit of audit
>>    namespace.  we need this interface to make audit
>>    available in un-init-audit-ns. Only the privileged user
>>    has right to set this value, it means only the root user
>>    of host can change it.
>>
>> 3, make audit namespace don't depend on net namespace.
>>    patch[1/20] add a compare function audit_compare for
>>    audit netlink, it always return true, it means the
>>    netlink subsystem will find out the netlink socket
>>    only through portid and netlink type. So we needn't
>>    to create kernel side audit netlink socket for per
>>    net namespace, all userspace audit netlink socket
>>    can find out the audit_sock, and audit_sock can
>>    communicate with them through the proper portid.
>>    it's just like the behavior we don't have net
>>    namespace before.
>>
>>
>> This patchset still need some work, such as allow changing
>> audit_enabled in audit namespace, auditd wants this feature.
>>
>> I send this patchset now in order to get more comments, so
>> I can keep on improving namespace support for audit.
>>
>> Gao feng (20):
>>   Audit: make audit netlink socket net namespace unaware
>>   audit: introduce configure option CONFIG_AUDIT_NS
>>   audit: make audit_skb_queue per audit namespace
>>   audit: make audit_skb_hold_queue per audit namespace
>>   audit: make audit_pid per audit namespace
>>   audit: make kauditd_task per audit namespace
>>   aduit: make audit_nlk_portid per audit namespace
>>   audit: make kaudit_wait queue per audit namespace
>>   audit: make audit_backlog_wait per audit namespace
>>   audit: allow un-init audit ns to change pid and portid only
>>   audit: use proper audit namespace in audit_receive_msg
>>   audit: use proper audit_namespace in kauditd_thread
>>   audit: introduce new audit logging interface for audit namespace
>>   audit: pass proper audit namespace to audit_log_common_recv_msg
>>   audit: Log audit pid config change in audit namespace
>>   audit: allow GET,SET,USER MSG operations in audit namespace
>>   nsproxy: don't make create_new_namespaces static
>>   audit: add new message type AUDIT_CREATE_NS
>>   audit: make audit_backlog_limit per audit namespace
>>   audit: introduce /proc/<pid>/audit_backlog_limit
>>
>>  fs/proc/base.c                  |  53 ++++++
>>  include/linux/audit.h           |  26 ++-
>>  include/linux/audit_namespace.h |  92 ++++++++++
>>  include/linux/nsproxy.h         |  15 +-
>>  include/uapi/linux/audit.h      |   1 +
>>  init/Kconfig                    |  10 ++
>>  kernel/Makefile                 |   2 +-
>>  kernel/audit.c                  | 364 +++++++++++++++++++++++++---------------
>>  kernel/audit.h                  |   5 +-
>>  kernel/audit_namespace.c        | 123 ++++++++++++++
>>  kernel/auditsc.c                |   6 +-
>>  kernel/nsproxy.c                |  18 +-
>>  12 files changed, 561 insertions(+), 154 deletions(-)
>>  create mode 100644 include/linux/audit_namespace.h
>>  create mode 100644 kernel/audit_namespace.c
>>
> 
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-11-05  7:51   ` Gao feng
@ 2013-11-05  7:52     ` Gao feng
  2013-11-05  8:11       ` Li Zefan
  0 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-11-05  7:52 UTC (permalink / raw)
  To: Gao feng
  Cc: eparis, linux-kernel, linux-audit, containers, ebiederm,
	serge.hallyn, sgrubb, toshi.okajima, Richard Guy Briggs

On 11/05/2013 03:51 PM, Gao feng wrote:
> Ping...
> 

I want to catch up the merge window..

> On 10/31/2013 11:52 AM, Gao feng wrote:
>> Hi Eric Paris,
>>
>> Can you give me some comments?
>>
>> You think the tying audit namespace to user namespace is a bad idea,
>> so this patchset doesn't assign auditns to userns and introduce an
>> new audit netlink type to help to create audit namespace.
>>
>> and this patchset also introduces an new proc interface to make
>> sure container can't influence the whole system.
>>
>> and the audit rules are not namespace aware, all of audit namespaces
>> should comply with the rules. in next step, if we find it's need to
>> make audit rules per audit namespace, then it's the time to do that
>> job.
>>
>> This patchset also makes all of net namespaces have ability to send/
>> receive audit netlink message.
>>
>> I may miss some points, if you find there are some shortage or loophole,
>> please let me know.
>>
>> Thanks!
>>
>> On 10/24/2013 03:31 PM, Gao feng wrote:
>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
>>>
>>> The main target of this patchset is allowing user in audit
>>> namespace to generate the USER_MSG type of audit message,
>>> some userspace tools need to generate audit message, or
>>> these tools will broken.
>>>
>>> And the login process in container may want to setup
>>> /proc/<pid>/loginuid, right now this value is unalterable
>>> once it being set. this will also broke the login problem
>>> in container. After this patchset, we can reset this loginuid
>>> to zero if task is running in a new audit namespace.
>>>
>>> Same with v1 patchset, in this patchset, only the privileged
>>> user in init_audit_ns and init_user_ns has rights to
>>> add/del audit rules. and these rules are gloabl. all
>>> audit namespace will comply with the rules.
>>>
>>> Compared with v1, v2 patch has some big changes.
>>> 1, the audit namespace is not assigned to user namespace.
>>>    since there is no available bit of flags for clone, we
>>>    create audit namespace through netlink, patch[18/20]
>>>    introduces a new audit netlink type AUDIT_CREATE_NS.
>>>    the privileged user in userns has rights to create a
>>>    audit namespace, it means the unprivileged user can
>>>    create auditns through create userns first. In order
>>>    to prevent them from doing harm to host, the default
>>>    audit_backlog_limit of un-init-audit-ns is zero(means
>>>    audit is unavailable in audit namespace). and it can't
>>>    be changed in auditns through netlink.
>>>
>>> 2, introduce /proc/<pid>/audit_log_limit
>>>    this interface is used to setup log_limit of audit
>>>    namespace.  we need this interface to make audit
>>>    available in un-init-audit-ns. Only the privileged user
>>>    has right to set this value, it means only the root user
>>>    of host can change it.
>>>
>>> 3, make audit namespace don't depend on net namespace.
>>>    patch[1/20] add a compare function audit_compare for
>>>    audit netlink, it always return true, it means the
>>>    netlink subsystem will find out the netlink socket
>>>    only through portid and netlink type. So we needn't
>>>    to create kernel side audit netlink socket for per
>>>    net namespace, all userspace audit netlink socket
>>>    can find out the audit_sock, and audit_sock can
>>>    communicate with them through the proper portid.
>>>    it's just like the behavior we don't have net
>>>    namespace before.
>>>
>>>
>>> This patchset still need some work, such as allow changing
>>> audit_enabled in audit namespace, auditd wants this feature.
>>>
>>> I send this patchset now in order to get more comments, so
>>> I can keep on improving namespace support for audit.
>>>
>>> Gao feng (20):
>>>   Audit: make audit netlink socket net namespace unaware
>>>   audit: introduce configure option CONFIG_AUDIT_NS
>>>   audit: make audit_skb_queue per audit namespace
>>>   audit: make audit_skb_hold_queue per audit namespace
>>>   audit: make audit_pid per audit namespace
>>>   audit: make kauditd_task per audit namespace
>>>   aduit: make audit_nlk_portid per audit namespace
>>>   audit: make kaudit_wait queue per audit namespace
>>>   audit: make audit_backlog_wait per audit namespace
>>>   audit: allow un-init audit ns to change pid and portid only
>>>   audit: use proper audit namespace in audit_receive_msg
>>>   audit: use proper audit_namespace in kauditd_thread
>>>   audit: introduce new audit logging interface for audit namespace
>>>   audit: pass proper audit namespace to audit_log_common_recv_msg
>>>   audit: Log audit pid config change in audit namespace
>>>   audit: allow GET,SET,USER MSG operations in audit namespace
>>>   nsproxy: don't make create_new_namespaces static
>>>   audit: add new message type AUDIT_CREATE_NS
>>>   audit: make audit_backlog_limit per audit namespace
>>>   audit: introduce /proc/<pid>/audit_backlog_limit
>>>
>>>  fs/proc/base.c                  |  53 ++++++
>>>  include/linux/audit.h           |  26 ++-
>>>  include/linux/audit_namespace.h |  92 ++++++++++
>>>  include/linux/nsproxy.h         |  15 +-
>>>  include/uapi/linux/audit.h      |   1 +
>>>  init/Kconfig                    |  10 ++
>>>  kernel/Makefile                 |   2 +-
>>>  kernel/audit.c                  | 364 +++++++++++++++++++++++++---------------
>>>  kernel/audit.h                  |   5 +-
>>>  kernel/audit_namespace.c        | 123 ++++++++++++++
>>>  kernel/auditsc.c                |   6 +-
>>>  kernel/nsproxy.c                |  18 +-
>>>  12 files changed, 561 insertions(+), 154 deletions(-)
>>>  create mode 100644 include/linux/audit_namespace.h
>>>  create mode 100644 kernel/audit_namespace.c
>>>
>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-11-05  7:52     ` Gao feng
@ 2013-11-05  8:11       ` Li Zefan
  2013-11-05  8:56         ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Li Zefan @ 2013-11-05  8:11 UTC (permalink / raw)
  To: Gao feng
  Cc: toshi.okajima, containers, serge.hallyn, linux-kernel, eparis,
	Richard Guy Briggs, linux-audit, ebiederm, sgrubb

On 2013/11/5 15:52, Gao feng wrote:
> On 11/05/2013 03:51 PM, Gao feng wrote:
>> Ping...
>>
> 
> I want to catch up the merge window..
> 

Even if your patches are accepted by a certain maintainer immediately,
he will in no doubt queue them for 3.14.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-11-05  8:11       ` Li Zefan
@ 2013-11-05  8:56         ` Gao feng
  2013-11-05 19:14           ` Richard Guy Briggs
  0 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-11-05  8:56 UTC (permalink / raw)
  To: Li Zefan
  Cc: toshi.okajima, containers, serge.hallyn, linux-kernel, eparis,
	Richard Guy Briggs, linux-audit, ebiederm, sgrubb

On 11/05/2013 04:11 PM, Li Zefan wrote:
> On 2013/11/5 15:52, Gao feng wrote:
>> On 11/05/2013 03:51 PM, Gao feng wrote:
>>> Ping...
>>>
>>
>> I want to catch up the merge window..
>>
> 
> Even if your patches are accepted by a certain maintainer immediately,
> he will in no doubt queue them for 3.14.
> 

Yes, you are right...
But I still want to get some comments..


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-11-05  8:56         ` Gao feng
@ 2013-11-05 19:14           ` Richard Guy Briggs
  2013-11-07  5:51             ` Gao feng
  2013-11-21  7:57             ` Gao feng
  0 siblings, 2 replies; 50+ messages in thread
From: Richard Guy Briggs @ 2013-11-05 19:14 UTC (permalink / raw)
  To: Gao feng
  Cc: Li Zefan, toshi.okajima, containers, serge.hallyn, linux-kernel,
	eparis, linux-audit, ebiederm, sgrubb

On Tue, Nov 05, 2013 at 04:56:55PM +0800, Gao feng wrote:
> On 11/05/2013 04:11 PM, Li Zefan wrote:
> > On 2013/11/5 15:52, Gao feng wrote:
> >> On 11/05/2013 03:51 PM, Gao feng wrote:
> >>> Ping...
> >>
> >> I want to catch up the merge window..
> > 
> > Even if your patches are accepted by a certain maintainer immediately,
> > he will in no doubt queue them for 3.14.
> 
> Yes, you are right...
> But I still want to get some comments..

Definitely won't make it in in this merge window.  I agree it needs
discussion, but I won't start that for at least another week (I'm
currently at IETF in Vancouver.).


- RGB

--
Richard Guy Briggs <rbriggs@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-11-05 19:14           ` Richard Guy Briggs
@ 2013-11-07  5:51             ` Gao feng
  2013-11-21  7:57             ` Gao feng
  1 sibling, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-11-07  5:51 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Li Zefan, toshi.okajima, containers, serge.hallyn, linux-kernel,
	eparis, linux-audit, ebiederm, sgrubb

On 11/06/2013 03:14 AM, Richard Guy Briggs wrote:
> On Tue, Nov 05, 2013 at 04:56:55PM +0800, Gao feng wrote:
>> On 11/05/2013 04:11 PM, Li Zefan wrote:
>>> On 2013/11/5 15:52, Gao feng wrote:
>>>> On 11/05/2013 03:51 PM, Gao feng wrote:
>>>>> Ping...
>>>>
>>>> I want to catch up the merge window..
>>>
>>> Even if your patches are accepted by a certain maintainer immediately,
>>> he will in no doubt queue them for 3.14.
>>
>> Yes, you are right...
>> But I still want to get some comments..
> 
> Definitely won't make it in in this merge window.  I agree it needs
> discussion, but I won't start that for at least another week (I'm
> currently at IETF in Vancouver.).
> 

Get it,hope we can start discussion ASAP.

Thanks

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-11-05 19:14           ` Richard Guy Briggs
  2013-11-07  5:51             ` Gao feng
@ 2013-11-21  7:57             ` Gao feng
  1 sibling, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-11-21  7:57 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Li Zefan, toshi.okajima, containers, serge.hallyn, linux-kernel,
	eparis, linux-audit, ebiederm, sgrubb

On 11/06/2013 03:14 AM, Richard Guy Briggs wrote:
> On Tue, Nov 05, 2013 at 04:56:55PM +0800, Gao feng wrote:
>> On 11/05/2013 04:11 PM, Li Zefan wrote:
>>> On 2013/11/5 15:52, Gao feng wrote:
>>>> On 11/05/2013 03:51 PM, Gao feng wrote:
>>>>> Ping...
>>>>
>>>> I want to catch up the merge window..
>>>
>>> Even if your patches are accepted by a certain maintainer immediately,
>>> he will in no doubt queue them for 3.14.
>>
>> Yes, you are right...
>> But I still want to get some comments..
> 
> Definitely won't make it in in this merge window.  I agree it needs
> discussion, but I won't start that for at least another week (I'm
> currently at IETF in Vancouver.).
> 

Hi guys,

Any comments?

Thanks


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (20 preceding siblings ...)
  2013-10-31  3:52 ` [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
@ 2013-12-04  8:31 ` Gao feng
  2013-12-06 22:12   ` Serge E. Hallyn
  2013-12-06 21:31 ` Serge E. Hallyn
  22 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-12-04  8:31 UTC (permalink / raw)
  To: Gao feng, linux-kernel, linux-audit
  Cc: containers, ebiederm, serge.hallyn, eparis, sgrubb, Richard Guy Briggs

Hi

On 10/24/2013 03:31 PM, Gao feng wrote:
> Here is the v1 patchset: http://lwn.net/Articles/549546/
> 
> The main target of this patchset is allowing user in audit
> namespace to generate the USER_MSG type of audit message,
> some userspace tools need to generate audit message, or
> these tools will broken.
> 

I really need this feature, right now,some process such as
logind are broken in container becase we leak of this feature.

> And the login process in container may want to setup
> /proc/<pid>/loginuid, right now this value is unalterable
> once it being set. this will also broke the login problem
> in container. After this patchset, we can reset this loginuid
> to zero if task is running in a new audit namespace.
> 

I notice this problem has been fixed in audit tree, so may be
I don't need to clean up loginuid when we create audit namespace.

And in audit tree, audit socket is per net namespace, hmm, I want
to know if anybody has a solution to allow processes to generate
USER_MSG type of audit message in un init pid/user namepsace.

Any idea? Eric,Steve,Richard?

Thanks
Gao

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
                   ` (21 preceding siblings ...)
  2013-12-04  8:31 ` Gao feng
@ 2013-12-06 21:31 ` Serge E. Hallyn
  2013-12-09  2:29   ` Gao feng
  22 siblings, 1 reply; 50+ messages in thread
From: Serge E. Hallyn @ 2013-12-06 21:31 UTC (permalink / raw)
  To: Gao feng
  Cc: linux-kernel, linux-audit, toshi.okajima, containers,
	serge.hallyn, eparis, ebiederm, sgrubb

Quoting Gao feng (gaofeng@cn.fujitsu.com):
> Here is the v1 patchset: http://lwn.net/Articles/549546/
> 
> The main target of this patchset is allowing user in audit
> namespace to generate the USER_MSG type of audit message,
> some userspace tools need to generate audit message, or
> these tools will broken.
> 
> And the login process in container may want to setup
> /proc/<pid>/loginuid, right now this value is unalterable
> once it being set. this will also broke the login problem
> in container. After this patchset, we can reset this loginuid
> to zero if task is running in a new audit namespace.
> 
> Same with v1 patchset, in this patchset, only the privileged
> user in init_audit_ns and init_user_ns has rights to
> add/del audit rules. and these rules are gloabl. all
> audit namespace will comply with the rules.
> 
> Compared with v1, v2 patch has some big changes.
> 1, the audit namespace is not assigned to user namespace.
>    since there is no available bit of flags for clone, we
>    create audit namespace through netlink, patch[18/20]
>    introduces a new audit netlink type AUDIT_CREATE_NS.
>    the privileged user in userns has rights to create a
>    audit namespace, it means the unprivileged user can
>    create auditns through create userns first. In order
>    to prevent them from doing harm to host, the default
>    audit_backlog_limit of un-init-audit-ns is zero(means
>    audit is unavailable in audit namespace). and it can't
>    be changed in auditns through netlink.

So the unprivileged user can create an audit-ns, but can't
then actually send any messages there?  I guess setting it
to something small would just be hacky?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 16/20] audit: allow GET, SET, USER MSG operations in audit namespace
  2013-10-24  7:32 ` [PATCH 16/20] audit: allow GET,SET,USER MSG operations " Gao feng
@ 2013-12-06 22:00   ` Serge E. Hallyn
  2013-12-09  1:47     ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Serge E. Hallyn @ 2013-12-06 22:00 UTC (permalink / raw)
  To: Gao feng
  Cc: linux-kernel, linux-audit, toshi.okajima, containers,
	serge.hallyn, eparis, ebiederm, sgrubb

Quoting Gao feng (gaofeng@cn.fujitsu.com):
> 1, remove the permission check of pid namespace. it's no reason
>    to deny un-init pid namespace to operate audit subsystem.
> 
> 2, only allow init user namespace and init audit namespace to
>    operate list/add/del rule, tty set, trim, make equiv operations.
> 
> 3, allow audit namespace to get/set audit configuration, send
>    userspace audit message.
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> ---
>  kernel/audit.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 095f54d..c4d4291 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -573,11 +573,7 @@ out:
>  static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>  {
>  	int err = 0;
> -
> -	/* Only support the initial namespaces for now. */
> -	if ((current_user_ns() != &init_user_ns) ||
> -	    (task_active_pid_ns(current) != &init_pid_ns))
> -		return -EPERM;
> +	struct audit_namespace *ns = current->nsproxy->audit_ns;
>  
>  	switch (msg_type) {
>  	case AUDIT_LIST:
> @@ -586,6 +582,7 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>  		return -EOPNOTSUPP;
>  	case AUDIT_GET:
>  	case AUDIT_SET:
> +		break;

So, these AUDIT_SET and AUDIT_GET go from requiring CAP_AUDIT_CONTROL
to not needing any privs at all?

>  	case AUDIT_LIST_RULES:
>  	case AUDIT_ADD_RULE:
>  	case AUDIT_DEL_RULE:
> @@ -594,13 +591,15 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>  	case AUDIT_TTY_SET:
>  	case AUDIT_TRIM:
>  	case AUDIT_MAKE_EQUIV:
> -		if (!capable(CAP_AUDIT_CONTROL))
> +		if ((current_user_ns() != &init_user_ns) ||
> +		    (ns != &init_audit_ns) ||
> +		    !capable(CAP_AUDIT_CONTROL))
>  			err = -EPERM;
>  		break;
>  	case AUDIT_USER:
>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
> -		if (!capable(CAP_AUDIT_WRITE))
> +		if (!ns_capable(ns->user_ns, CAP_AUDIT_WRITE))
>  			err = -EPERM;
>  		break;
>  	default:  /* bad msg */
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Containers mailing list
> Containers@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS
  2013-10-24  7:32 ` [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS Gao feng
@ 2013-12-06 22:10   ` Serge E. Hallyn
  2013-12-09  1:59     ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Serge E. Hallyn @ 2013-12-06 22:10 UTC (permalink / raw)
  To: Gao feng
  Cc: linux-kernel, linux-audit, containers, ebiederm, serge.hallyn,
	eparis, sgrubb, toshi.okajima

Quoting Gao feng (gaofeng@cn.fujitsu.com):
> Since there is no more place for flags of clone system call.
> we need to find a way to create audit namespace.
> 
> this patch add a new type of message AUDIT_CREATE_NS.
> user space can create new audit namespace through
> netlink.
> 
> Right now, The privileged user in user namespace is allowed
> to create audit namespace. it means the unprivileged user can
> create an user namespace and then create audit namespace.
> 
> Looks like it is not safe, but even the unprivileged user can
> create audit namespace, it can do no harm to the host. un-init
> audit namespace cann't effect the host.
> 
> In the follow patches, the audit_backlog_limit will be per
> audit namesapace, but only the privileged user has rights to
> modify it. and the default value of audit_backlog_limit for
> uninit audit namespace will be set to 0.
> 
> And the audit_rate_limit will be limited too.
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> ---
>  include/linux/audit_namespace.h |  7 +++++++
>  include/uapi/linux/audit.h      |  1 +
>  kernel/audit.c                  | 22 ++++++++++++++++++++++
>  kernel/audit_namespace.c        | 29 +++++++++++++++++++++++++++++
>  4 files changed, 59 insertions(+)
> 
> diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
> index 79a9b78..b17f052 100644
> --- a/include/linux/audit_namespace.h
> +++ b/include/linux/audit_namespace.h
> @@ -54,6 +54,8 @@ void put_audit_ns(struct audit_namespace *ns)
>  		rcu_read_unlock();
>  	}
>  }
> +
> +extern int unshare_audit_namespace(void);
>  #else
>  static inline
>  struct audit_namespace *get_audit_ns(struct audit_namespace *ns)
> @@ -66,6 +68,11 @@ void put_audit_ns(struct audit_namespace *ns)
>  {
>  
>  }
> +
> +static inline int unshare_audit_namespace()
> +{
> +	return -EINVAL;
> +}
>  #endif
>  
>  static inline struct
> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> index 75cef3f..877d509 100644
> --- a/include/uapi/linux/audit.h
> +++ b/include/uapi/linux/audit.h
> @@ -68,6 +68,7 @@
>  #define AUDIT_MAKE_EQUIV	1015	/* Append to watched tree */
>  #define AUDIT_TTY_GET		1016	/* Get TTY auditing status */
>  #define AUDIT_TTY_SET		1017	/* Set TTY auditing status */
> +#define AUDIT_CREATE_NS		1018	/* Create new audit namespace */
>  
>  #define AUDIT_FIRST_USER_MSG	1100	/* Userspace messages mostly uninteresting to kernel */
>  #define AUDIT_USER_AVC		1107	/* We filter this differently */
> diff --git a/kernel/audit.c b/kernel/audit.c
> index c4d4291..86212d3 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -596,6 +596,12 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>  		    !capable(CAP_AUDIT_CONTROL))
>  			err = -EPERM;
>  		break;
> +	case AUDIT_CREATE_NS:
> +		/* Allow privileged user in user namespace to
> +		 * create audit namespace */
> +		if (!ns_capable(current_user_ns(), CAP_AUDIT_CONTROL))
> +			err = -EPERM;
> +		break;
>  	case AUDIT_USER:
>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
> @@ -735,6 +741,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>  		if (status_get->mask & AUDIT_STATUS_BACKLOG_LIMIT)
>  			err = audit_set_backlog_limit(status_get->backlog_limit);
>  		break;
> +	case AUDIT_CREATE_NS:
> +		err = unshare_audit_namespace();
> +
> +		if (audit_enabled == AUDIT_OFF)
> +			break;
> +
> +		ab = audit_log_start_ns(ns, NULL, GFP_KERNEL, AUDIT_CREATE_NS);
> +		if (ab) {
> +			audit_log_format(ab, "Create audit namespace");
> +			audit_log_session_info(ab);
> +			audit_log_task_context(ab);
> +			audit_log_format(ab, "res=%d", err ? 0 : 1);
> +			audit_log_end_ns(ns, ab);
> +		}
> +
> +		break;
>  	case AUDIT_USER:
>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
> diff --git a/kernel/audit_namespace.c b/kernel/audit_namespace.c
> index 6d9cb8f..28c608e 100644
> --- a/kernel/audit_namespace.c
> +++ b/kernel/audit_namespace.c
> @@ -6,3 +6,32 @@ struct audit_namespace init_audit_ns = {
>  	.user_ns = &init_user_ns,
>  };
>  EXPORT_SYMBOL_GPL(init_audit_ns);
> +
> +int unshare_audit_namespace(void)
> +{
> +	struct task_struct *tsk = current;
> +	struct audit_namespace *new_audit = NULL;
> +	struct nsproxy *new_nsp;
> +
> +	new_audit = kzalloc(sizeof(struct audit_namespace), GFP_KERNEL);
> +	if (!new_audit)
> +		return -ENOMEM;
> +
> +	skb_queue_head_init(&new_audit->queue);
> +	skb_queue_head_init(&new_audit->hold_queue);
> +	init_waitqueue_head(&new_audit->kauditd_wait);
> +	init_waitqueue_head(&new_audit->backlog_wait);
> +
> +	new_nsp = create_new_namespaces(0, tsk, NULL, NULL);
> +	if (IS_ERR(new_nsp)) {
> +		kfree(new_audit);
> +		return PTR_ERR(new_nsp);
> +	}
> +
> +	new_audit->user_ns = get_user_ns(current_user_ns());
> +	new_nsp->audit_ns = get_audit_ns(new_audit);
> +
> +	switch_task_namespaces(current, new_nsp);

Do you need to drop the old audit->ns refcount?

> +
> +	return 0;
> +}
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-04  8:31 ` Gao feng
@ 2013-12-06 22:12   ` Serge E. Hallyn
  2013-12-09  2:06     ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Serge E. Hallyn @ 2013-12-06 22:12 UTC (permalink / raw)
  To: Gao feng
  Cc: linux-kernel, linux-audit, Richard Guy Briggs, containers,
	serge.hallyn, eparis, ebiederm, sgrubb

Quoting Gao feng (gaofeng@cn.fujitsu.com):
> Hi
> 
> On 10/24/2013 03:31 PM, Gao feng wrote:
> > Here is the v1 patchset: http://lwn.net/Articles/549546/
> > 
> > The main target of this patchset is allowing user in audit
> > namespace to generate the USER_MSG type of audit message,
> > some userspace tools need to generate audit message, or
> > these tools will broken.
> > 
> 
> I really need this feature, right now,some process such as
> logind are broken in container becase we leak of this feature.

Your set doesn't address loginuid though right?  How exactly do you
expect to do that?  If user violates MAC policy and audit msg is
sent to init user ns by mac subsys, you need the loginuid from
init_audit_ns.  where will that be stored if you allow updates
of loginuid in auditns?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 16/20] audit: allow GET, SET, USER MSG operations in audit namespace
  2013-12-06 22:00   ` [PATCH 16/20] audit: allow GET, SET, USER " Serge E. Hallyn
@ 2013-12-09  1:47     ` Gao feng
  0 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-12-09  1:47 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-kernel, linux-audit, toshi.okajima, containers,
	serge.hallyn, eparis, ebiederm, sgrubb

On 12/07/2013 06:00 AM, Serge E. Hallyn wrote:
> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>> 1, remove the permission check of pid namespace. it's no reason
>>    to deny un-init pid namespace to operate audit subsystem.
>>
>> 2, only allow init user namespace and init audit namespace to
>>    operate list/add/del rule, tty set, trim, make equiv operations.
>>
>> 3, allow audit namespace to get/set audit configuration, send
>>    userspace audit message.
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  kernel/audit.c | 13 ++++++-------
>>  1 file changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/kernel/audit.c b/kernel/audit.c
>> index 095f54d..c4d4291 100644
>> --- a/kernel/audit.c
>> +++ b/kernel/audit.c
>> @@ -573,11 +573,7 @@ out:
>>  static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>>  {
>>  	int err = 0;
>> -
>> -	/* Only support the initial namespaces for now. */
>> -	if ((current_user_ns() != &init_user_ns) ||
>> -	    (task_active_pid_ns(current) != &init_pid_ns))
>> -		return -EPERM;
>> +	struct audit_namespace *ns = current->nsproxy->audit_ns;
>>  
>>  	switch (msg_type) {
>>  	case AUDIT_LIST:
>> @@ -586,6 +582,7 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>>  		return -EOPNOTSUPP;
>>  	case AUDIT_GET:
>>  	case AUDIT_SET:
>> +		break;
> 
> So, these AUDIT_SET and AUDIT_GET go from requiring CAP_AUDIT_CONTROL
> to not needing any privs at all?
> 

My mistake, there should be a check such as
ns_capable(ns, CAP_AUDIT_CONTROL).

will fix in next version.

Thanks!


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS
  2013-12-06 22:10   ` Serge E. Hallyn
@ 2013-12-09  1:59     ` Gao feng
  2013-12-09 17:53       ` Serge Hallyn
  0 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-12-09  1:59 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-kernel, linux-audit, containers, ebiederm, serge.hallyn,
	eparis, sgrubb, toshi.okajima

On 12/07/2013 06:10 AM, Serge E. Hallyn wrote:
> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>> Since there is no more place for flags of clone system call.
>> we need to find a way to create audit namespace.
>>
>> this patch add a new type of message AUDIT_CREATE_NS.
>> user space can create new audit namespace through
>> netlink.
>>
>> Right now, The privileged user in user namespace is allowed
>> to create audit namespace. it means the unprivileged user can
>> create an user namespace and then create audit namespace.
>>
>> Looks like it is not safe, but even the unprivileged user can
>> create audit namespace, it can do no harm to the host. un-init
>> audit namespace cann't effect the host.
>>
>> In the follow patches, the audit_backlog_limit will be per
>> audit namesapace, but only the privileged user has rights to
>> modify it. and the default value of audit_backlog_limit for
>> uninit audit namespace will be set to 0.
>>
>> And the audit_rate_limit will be limited too.
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  include/linux/audit_namespace.h |  7 +++++++
>>  include/uapi/linux/audit.h      |  1 +
>>  kernel/audit.c                  | 22 ++++++++++++++++++++++
>>  kernel/audit_namespace.c        | 29 +++++++++++++++++++++++++++++
>>  4 files changed, 59 insertions(+)
>>
>> diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
>> index 79a9b78..b17f052 100644
>> --- a/include/linux/audit_namespace.h
>> +++ b/include/linux/audit_namespace.h
>> @@ -54,6 +54,8 @@ void put_audit_ns(struct audit_namespace *ns)
>>  		rcu_read_unlock();
>>  	}
>>  }
>> +
>> +extern int unshare_audit_namespace(void);
>>  #else
>>  static inline
>>  struct audit_namespace *get_audit_ns(struct audit_namespace *ns)
>> @@ -66,6 +68,11 @@ void put_audit_ns(struct audit_namespace *ns)
>>  {
>>  
>>  }
>> +
>> +static inline int unshare_audit_namespace()
>> +{
>> +	return -EINVAL;
>> +}
>>  #endif
>>  
>>  static inline struct
>> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
>> index 75cef3f..877d509 100644
>> --- a/include/uapi/linux/audit.h
>> +++ b/include/uapi/linux/audit.h
>> @@ -68,6 +68,7 @@
>>  #define AUDIT_MAKE_EQUIV	1015	/* Append to watched tree */
>>  #define AUDIT_TTY_GET		1016	/* Get TTY auditing status */
>>  #define AUDIT_TTY_SET		1017	/* Set TTY auditing status */
>> +#define AUDIT_CREATE_NS		1018	/* Create new audit namespace */
>>  
>>  #define AUDIT_FIRST_USER_MSG	1100	/* Userspace messages mostly uninteresting to kernel */
>>  #define AUDIT_USER_AVC		1107	/* We filter this differently */
>> diff --git a/kernel/audit.c b/kernel/audit.c
>> index c4d4291..86212d3 100644
>> --- a/kernel/audit.c
>> +++ b/kernel/audit.c
>> @@ -596,6 +596,12 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>>  		    !capable(CAP_AUDIT_CONTROL))
>>  			err = -EPERM;
>>  		break;
>> +	case AUDIT_CREATE_NS:
>> +		/* Allow privileged user in user namespace to
>> +		 * create audit namespace */
>> +		if (!ns_capable(current_user_ns(), CAP_AUDIT_CONTROL))
>> +			err = -EPERM;
>> +		break;
>>  	case AUDIT_USER:
>>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
>>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
>> @@ -735,6 +741,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>>  		if (status_get->mask & AUDIT_STATUS_BACKLOG_LIMIT)
>>  			err = audit_set_backlog_limit(status_get->backlog_limit);
>>  		break;
>> +	case AUDIT_CREATE_NS:
>> +		err = unshare_audit_namespace();
>> +
>> +		if (audit_enabled == AUDIT_OFF)
>> +			break;
>> +
>> +		ab = audit_log_start_ns(ns, NULL, GFP_KERNEL, AUDIT_CREATE_NS);
>> +		if (ab) {
>> +			audit_log_format(ab, "Create audit namespace");
>> +			audit_log_session_info(ab);
>> +			audit_log_task_context(ab);
>> +			audit_log_format(ab, "res=%d", err ? 0 : 1);
>> +			audit_log_end_ns(ns, ab);
>> +		}
>> +
>> +		break;
>>  	case AUDIT_USER:
>>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
>>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
>> diff --git a/kernel/audit_namespace.c b/kernel/audit_namespace.c
>> index 6d9cb8f..28c608e 100644
>> --- a/kernel/audit_namespace.c
>> +++ b/kernel/audit_namespace.c
>> @@ -6,3 +6,32 @@ struct audit_namespace init_audit_ns = {
>>  	.user_ns = &init_user_ns,
>>  };
>>  EXPORT_SYMBOL_GPL(init_audit_ns);
>> +
>> +int unshare_audit_namespace(void)
>> +{
>> +	struct task_struct *tsk = current;
>> +	struct audit_namespace *new_audit = NULL;
>> +	struct nsproxy *new_nsp;
>> +
>> +	new_audit = kzalloc(sizeof(struct audit_namespace), GFP_KERNEL);
>> +	if (!new_audit)
>> +		return -ENOMEM;
>> +
>> +	skb_queue_head_init(&new_audit->queue);
>> +	skb_queue_head_init(&new_audit->hold_queue);
>> +	init_waitqueue_head(&new_audit->kauditd_wait);
>> +	init_waitqueue_head(&new_audit->backlog_wait);
>> +
>> +	new_nsp = create_new_namespaces(0, tsk, NULL, NULL);
>> +	if (IS_ERR(new_nsp)) {
>> +		kfree(new_audit);
>> +		return PTR_ERR(new_nsp);
>> +	}
>> +
>> +	new_audit->user_ns = get_user_ns(current_user_ns());
>> +	new_nsp->audit_ns = get_audit_ns(new_audit);
>> +
>> +	switch_task_namespaces(current, new_nsp);
> 
> Do you need to drop the old audit->ns refcount?
> 

task doesn't hold namespace's refcount directly. it hold ns's refcount
through nsproxy. switch_task_namespaces will release the refcount of old
nsproxy. so I think the refcount here is no problem.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-06 22:12   ` Serge E. Hallyn
@ 2013-12-09  2:06     ` Gao feng
  2013-12-09 18:26       ` Serge Hallyn
  0 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-12-09  2:06 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-kernel, linux-audit, Richard Guy Briggs, containers,
	serge.hallyn, eparis, ebiederm, sgrubb

On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>> Hi
>>
>> On 10/24/2013 03:31 PM, Gao feng wrote:
>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
>>>
>>> The main target of this patchset is allowing user in audit
>>> namespace to generate the USER_MSG type of audit message,
>>> some userspace tools need to generate audit message, or
>>> these tools will broken.
>>>
>>
>> I really need this feature, right now,some process such as
>> logind are broken in container becase we leak of this feature.
> 
> Your set doesn't address loginuid though right?  How exactly do you
> expect to do that?  If user violates MAC policy and audit msg is
> sent to init user ns by mac subsys, you need the loginuid from
> init_audit_ns.  where will that be stored if you allow updates
> of loginuid in auditns?
> 
This patchset doesn't include the loginuid part.

the loginuid is stored in task as before.
In my opinion, when task creates a new audit namespace, this task's
loginuid will be reset to zero, so the children tasks can set their
loginuid. Does this change break the MAC?

Thanks!

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-06 21:31 ` Serge E. Hallyn
@ 2013-12-09  2:29   ` Gao feng
  2013-12-23 23:47     ` Richard Guy Briggs
  0 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-12-09  2:29 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-kernel, linux-audit, toshi.okajima, containers,
	serge.hallyn, eparis, ebiederm, sgrubb

Hi Serge,

Thanks for your comments!

On 12/07/2013 05:31 AM, Serge E. Hallyn wrote:
> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>> Here is the v1 patchset: http://lwn.net/Articles/549546/
>>
>> The main target of this patchset is allowing user in audit
>> namespace to generate the USER_MSG type of audit message,
>> some userspace tools need to generate audit message, or
>> these tools will broken.
>>
>> And the login process in container may want to setup
>> /proc/<pid>/loginuid, right now this value is unalterable
>> once it being set. this will also broke the login problem
>> in container. After this patchset, we can reset this loginuid
>> to zero if task is running in a new audit namespace.
>>
>> Same with v1 patchset, in this patchset, only the privileged
>> user in init_audit_ns and init_user_ns has rights to
>> add/del audit rules. and these rules are gloabl. all
>> audit namespace will comply with the rules.
>>
>> Compared with v1, v2 patch has some big changes.
>> 1, the audit namespace is not assigned to user namespace.
>>    since there is no available bit of flags for clone, we
>>    create audit namespace through netlink, patch[18/20]
>>    introduces a new audit netlink type AUDIT_CREATE_NS.
>>    the privileged user in userns has rights to create a
>>    audit namespace, it means the unprivileged user can
>>    create auditns through create userns first. In order
>>    to prevent them from doing harm to host, the default
>>    audit_backlog_limit of un-init-audit-ns is zero(means
>>    audit is unavailable in audit namespace). and it can't
>>    be changed in auditns through netlink.
> 
> So the unprivileged user can create an audit-ns, but can't
> then actually send any messages there?  I guess setting it
> to something small would just be hacky?

Yes, if unprivileged user wants to send audit message, he should
ask privileged user to setup the audit_backlog_limit for him.

I know it's a little of hack, but I don't have good idea :(


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS
  2013-12-09  1:59     ` Gao feng
@ 2013-12-09 17:53       ` Serge Hallyn
  2013-12-10  5:34         ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Serge Hallyn @ 2013-12-09 17:53 UTC (permalink / raw)
  To: Gao feng
  Cc: Serge E. Hallyn, toshi.okajima, containers, linux-kernel, eparis,
	linux-audit, ebiederm, sgrubb

Quoting Gao feng (gaofeng@cn.fujitsu.com):
> On 12/07/2013 06:10 AM, Serge E. Hallyn wrote:
> > Quoting Gao feng (gaofeng@cn.fujitsu.com):
> >> Since there is no more place for flags of clone system call.
> >> we need to find a way to create audit namespace.
> >>
> >> this patch add a new type of message AUDIT_CREATE_NS.
> >> user space can create new audit namespace through
> >> netlink.
> >>
> >> Right now, The privileged user in user namespace is allowed
> >> to create audit namespace. it means the unprivileged user can
> >> create an user namespace and then create audit namespace.
> >>
> >> Looks like it is not safe, but even the unprivileged user can
> >> create audit namespace, it can do no harm to the host. un-init
> >> audit namespace cann't effect the host.
> >>
> >> In the follow patches, the audit_backlog_limit will be per
> >> audit namesapace, but only the privileged user has rights to
> >> modify it. and the default value of audit_backlog_limit for
> >> uninit audit namespace will be set to 0.
> >>
> >> And the audit_rate_limit will be limited too.
> >>
> >> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> >> ---
> >>  include/linux/audit_namespace.h |  7 +++++++
> >>  include/uapi/linux/audit.h      |  1 +
> >>  kernel/audit.c                  | 22 ++++++++++++++++++++++
> >>  kernel/audit_namespace.c        | 29 +++++++++++++++++++++++++++++
> >>  4 files changed, 59 insertions(+)
> >>
> >> diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
> >> index 79a9b78..b17f052 100644
> >> --- a/include/linux/audit_namespace.h
> >> +++ b/include/linux/audit_namespace.h
> >> @@ -54,6 +54,8 @@ void put_audit_ns(struct audit_namespace *ns)
> >>  		rcu_read_unlock();
> >>  	}
> >>  }
> >> +
> >> +extern int unshare_audit_namespace(void);
> >>  #else
> >>  static inline
> >>  struct audit_namespace *get_audit_ns(struct audit_namespace *ns)
> >> @@ -66,6 +68,11 @@ void put_audit_ns(struct audit_namespace *ns)
> >>  {
> >>  
> >>  }
> >> +
> >> +static inline int unshare_audit_namespace()
> >> +{
> >> +	return -EINVAL;
> >> +}
> >>  #endif
> >>  
> >>  static inline struct
> >> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> >> index 75cef3f..877d509 100644
> >> --- a/include/uapi/linux/audit.h
> >> +++ b/include/uapi/linux/audit.h
> >> @@ -68,6 +68,7 @@
> >>  #define AUDIT_MAKE_EQUIV	1015	/* Append to watched tree */
> >>  #define AUDIT_TTY_GET		1016	/* Get TTY auditing status */
> >>  #define AUDIT_TTY_SET		1017	/* Set TTY auditing status */
> >> +#define AUDIT_CREATE_NS		1018	/* Create new audit namespace */
> >>  
> >>  #define AUDIT_FIRST_USER_MSG	1100	/* Userspace messages mostly uninteresting to kernel */
> >>  #define AUDIT_USER_AVC		1107	/* We filter this differently */
> >> diff --git a/kernel/audit.c b/kernel/audit.c
> >> index c4d4291..86212d3 100644
> >> --- a/kernel/audit.c
> >> +++ b/kernel/audit.c
> >> @@ -596,6 +596,12 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
> >>  		    !capable(CAP_AUDIT_CONTROL))
> >>  			err = -EPERM;
> >>  		break;
> >> +	case AUDIT_CREATE_NS:
> >> +		/* Allow privileged user in user namespace to
> >> +		 * create audit namespace */
> >> +		if (!ns_capable(current_user_ns(), CAP_AUDIT_CONTROL))
> >> +			err = -EPERM;
> >> +		break;
> >>  	case AUDIT_USER:
> >>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
> >>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
> >> @@ -735,6 +741,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> >>  		if (status_get->mask & AUDIT_STATUS_BACKLOG_LIMIT)
> >>  			err = audit_set_backlog_limit(status_get->backlog_limit);
> >>  		break;
> >> +	case AUDIT_CREATE_NS:
> >> +		err = unshare_audit_namespace();
> >> +
> >> +		if (audit_enabled == AUDIT_OFF)
> >> +			break;
> >> +
> >> +		ab = audit_log_start_ns(ns, NULL, GFP_KERNEL, AUDIT_CREATE_NS);
> >> +		if (ab) {
> >> +			audit_log_format(ab, "Create audit namespace");
> >> +			audit_log_session_info(ab);
> >> +			audit_log_task_context(ab);
> >> +			audit_log_format(ab, "res=%d", err ? 0 : 1);
> >> +			audit_log_end_ns(ns, ab);
> >> +		}
> >> +
> >> +		break;
> >>  	case AUDIT_USER:
> >>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
> >>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
> >> diff --git a/kernel/audit_namespace.c b/kernel/audit_namespace.c
> >> index 6d9cb8f..28c608e 100644
> >> --- a/kernel/audit_namespace.c
> >> +++ b/kernel/audit_namespace.c
> >> @@ -6,3 +6,32 @@ struct audit_namespace init_audit_ns = {
> >>  	.user_ns = &init_user_ns,
> >>  };
> >>  EXPORT_SYMBOL_GPL(init_audit_ns);
> >> +
> >> +int unshare_audit_namespace(void)
> >> +{
> >> +	struct task_struct *tsk = current;
> >> +	struct audit_namespace *new_audit = NULL;
> >> +	struct nsproxy *new_nsp;
> >> +
> >> +	new_audit = kzalloc(sizeof(struct audit_namespace), GFP_KERNEL);
> >> +	if (!new_audit)
> >> +		return -ENOMEM;
> >> +
> >> +	skb_queue_head_init(&new_audit->queue);
> >> +	skb_queue_head_init(&new_audit->hold_queue);
> >> +	init_waitqueue_head(&new_audit->kauditd_wait);
> >> +	init_waitqueue_head(&new_audit->backlog_wait);
> >> +
> >> +	new_nsp = create_new_namespaces(0, tsk, NULL, NULL);
> >> +	if (IS_ERR(new_nsp)) {
> >> +		kfree(new_audit);
> >> +		return PTR_ERR(new_nsp);
> >> +	}
> >> +
> >> +	new_audit->user_ns = get_user_ns(current_user_ns());
> >> +	new_nsp->audit_ns = get_audit_ns(new_audit);
> >> +
> >> +	switch_task_namespaces(current, new_nsp);
> > 
> > Do you need to drop the old audit->ns refcount?
> > 
> 
> task doesn't hold namespace's refcount directly. it hold ns's refcount

No but when you create the new_nsp with create_new_namespaces(),
that new_nsp bumps the refcount on the init_audit_ns.  Then you
point new_nsp to a new audit_ns, but you never drop the refcount
on init_audit_ns.

> through nsproxy. switch_task_namespaces will release the refcount of old
> nsproxy. so I think the refcount here is no problem.
> 
> _______________________________________________
> Containers mailing list
> Containers@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-09  2:06     ` Gao feng
@ 2013-12-09 18:26       ` Serge Hallyn
  2013-12-10  8:16         ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Serge Hallyn @ 2013-12-09 18:26 UTC (permalink / raw)
  To: Gao feng
  Cc: Serge E. Hallyn, linux-kernel, linux-audit, Richard Guy Briggs,
	containers, eparis, ebiederm, sgrubb

Quoting Gao feng (gaofeng@cn.fujitsu.com):
> On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
> > Quoting Gao feng (gaofeng@cn.fujitsu.com):
> >> Hi
> >>
> >> On 10/24/2013 03:31 PM, Gao feng wrote:
> >>> Here is the v1 patchset: http://lwn.net/Articles/549546/
> >>>
> >>> The main target of this patchset is allowing user in audit
> >>> namespace to generate the USER_MSG type of audit message,
> >>> some userspace tools need to generate audit message, or
> >>> these tools will broken.
> >>>
> >>
> >> I really need this feature, right now,some process such as
> >> logind are broken in container becase we leak of this feature.
> > 
> > Your set doesn't address loginuid though right?  How exactly do you
> > expect to do that?  If user violates MAC policy and audit msg is
> > sent to init user ns by mac subsys, you need the loginuid from
> > init_audit_ns.  where will that be stored if you allow updates
> > of loginuid in auditns?
> > 
> This patchset doesn't include the loginuid part.
> 
> the loginuid is stored in task as before.
> In my opinion, when task creates a new audit namespace, this task's
> loginuid will be reset to zero, so the children tasks can set their
> loginuid. Does this change break the MAC?

I think so, yes.  In an LSPP selinux environment, if the task
manages to trigger an selinux deny rule which is audited, then
the loginuid must make sense on the host.  Now presumably it
will get translated to the mapped host uid, and we can figure
out the host uid owning it through /etc/subuid.  But that adds
/etc/subuid as a new part of the TCB without any warning <shrug>
So in that sense, for LSPP, it breaks it.

-serge

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS
  2013-12-09 17:53       ` Serge Hallyn
@ 2013-12-10  5:34         ` Gao feng
  0 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-12-10  5:34 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Serge E. Hallyn, toshi.okajima, containers, linux-kernel, eparis,
	linux-audit, ebiederm, sgrubb

On 12/10/2013 01:53 AM, Serge Hallyn wrote:
> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>> On 12/07/2013 06:10 AM, Serge E. Hallyn wrote:
>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>>>> Since there is no more place for flags of clone system call.
>>>> we need to find a way to create audit namespace.
>>>>
>>>> this patch add a new type of message AUDIT_CREATE_NS.
>>>> user space can create new audit namespace through
>>>> netlink.
>>>>
>>>> Right now, The privileged user in user namespace is allowed
>>>> to create audit namespace. it means the unprivileged user can
>>>> create an user namespace and then create audit namespace.
>>>>
>>>> Looks like it is not safe, but even the unprivileged user can
>>>> create audit namespace, it can do no harm to the host. un-init
>>>> audit namespace cann't effect the host.
>>>>
>>>> In the follow patches, the audit_backlog_limit will be per
>>>> audit namesapace, but only the privileged user has rights to
>>>> modify it. and the default value of audit_backlog_limit for
>>>> uninit audit namespace will be set to 0.
>>>>
>>>> And the audit_rate_limit will be limited too.
>>>>
>>>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>>>> ---
>>>>  include/linux/audit_namespace.h |  7 +++++++
>>>>  include/uapi/linux/audit.h      |  1 +
>>>>  kernel/audit.c                  | 22 ++++++++++++++++++++++
>>>>  kernel/audit_namespace.c        | 29 +++++++++++++++++++++++++++++
>>>>  4 files changed, 59 insertions(+)
>>>>
>>>> diff --git a/include/linux/audit_namespace.h b/include/linux/audit_namespace.h
>>>> index 79a9b78..b17f052 100644
>>>> --- a/include/linux/audit_namespace.h
>>>> +++ b/include/linux/audit_namespace.h
>>>> @@ -54,6 +54,8 @@ void put_audit_ns(struct audit_namespace *ns)
>>>>  		rcu_read_unlock();
>>>>  	}
>>>>  }
>>>> +
>>>> +extern int unshare_audit_namespace(void);
>>>>  #else
>>>>  static inline
>>>>  struct audit_namespace *get_audit_ns(struct audit_namespace *ns)
>>>> @@ -66,6 +68,11 @@ void put_audit_ns(struct audit_namespace *ns)
>>>>  {
>>>>  
>>>>  }
>>>> +
>>>> +static inline int unshare_audit_namespace()
>>>> +{
>>>> +	return -EINVAL;
>>>> +}
>>>>  #endif
>>>>  
>>>>  static inline struct
>>>> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
>>>> index 75cef3f..877d509 100644
>>>> --- a/include/uapi/linux/audit.h
>>>> +++ b/include/uapi/linux/audit.h
>>>> @@ -68,6 +68,7 @@
>>>>  #define AUDIT_MAKE_EQUIV	1015	/* Append to watched tree */
>>>>  #define AUDIT_TTY_GET		1016	/* Get TTY auditing status */
>>>>  #define AUDIT_TTY_SET		1017	/* Set TTY auditing status */
>>>> +#define AUDIT_CREATE_NS		1018	/* Create new audit namespace */
>>>>  
>>>>  #define AUDIT_FIRST_USER_MSG	1100	/* Userspace messages mostly uninteresting to kernel */
>>>>  #define AUDIT_USER_AVC		1107	/* We filter this differently */
>>>> diff --git a/kernel/audit.c b/kernel/audit.c
>>>> index c4d4291..86212d3 100644
>>>> --- a/kernel/audit.c
>>>> +++ b/kernel/audit.c
>>>> @@ -596,6 +596,12 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
>>>>  		    !capable(CAP_AUDIT_CONTROL))
>>>>  			err = -EPERM;
>>>>  		break;
>>>> +	case AUDIT_CREATE_NS:
>>>> +		/* Allow privileged user in user namespace to
>>>> +		 * create audit namespace */
>>>> +		if (!ns_capable(current_user_ns(), CAP_AUDIT_CONTROL))
>>>> +			err = -EPERM;
>>>> +		break;
>>>>  	case AUDIT_USER:
>>>>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
>>>>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
>>>> @@ -735,6 +741,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>>>>  		if (status_get->mask & AUDIT_STATUS_BACKLOG_LIMIT)
>>>>  			err = audit_set_backlog_limit(status_get->backlog_limit);
>>>>  		break;
>>>> +	case AUDIT_CREATE_NS:
>>>> +		err = unshare_audit_namespace();
>>>> +
>>>> +		if (audit_enabled == AUDIT_OFF)
>>>> +			break;
>>>> +
>>>> +		ab = audit_log_start_ns(ns, NULL, GFP_KERNEL, AUDIT_CREATE_NS);
>>>> +		if (ab) {
>>>> +			audit_log_format(ab, "Create audit namespace");
>>>> +			audit_log_session_info(ab);
>>>> +			audit_log_task_context(ab);
>>>> +			audit_log_format(ab, "res=%d", err ? 0 : 1);
>>>> +			audit_log_end_ns(ns, ab);
>>>> +		}
>>>> +
>>>> +		break;
>>>>  	case AUDIT_USER:
>>>>  	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
>>>>  	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
>>>> diff --git a/kernel/audit_namespace.c b/kernel/audit_namespace.c
>>>> index 6d9cb8f..28c608e 100644
>>>> --- a/kernel/audit_namespace.c
>>>> +++ b/kernel/audit_namespace.c
>>>> @@ -6,3 +6,32 @@ struct audit_namespace init_audit_ns = {
>>>>  	.user_ns = &init_user_ns,
>>>>  };
>>>>  EXPORT_SYMBOL_GPL(init_audit_ns);
>>>> +
>>>> +int unshare_audit_namespace(void)
>>>> +{
>>>> +	struct task_struct *tsk = current;
>>>> +	struct audit_namespace *new_audit = NULL;
>>>> +	struct nsproxy *new_nsp;
>>>> +
>>>> +	new_audit = kzalloc(sizeof(struct audit_namespace), GFP_KERNEL);
>>>> +	if (!new_audit)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	skb_queue_head_init(&new_audit->queue);
>>>> +	skb_queue_head_init(&new_audit->hold_queue);
>>>> +	init_waitqueue_head(&new_audit->kauditd_wait);
>>>> +	init_waitqueue_head(&new_audit->backlog_wait);
>>>> +
>>>> +	new_nsp = create_new_namespaces(0, tsk, NULL, NULL);
>>>> +	if (IS_ERR(new_nsp)) {
>>>> +		kfree(new_audit);
>>>> +		return PTR_ERR(new_nsp);
>>>> +	}
>>>> +
>>>> +	new_audit->user_ns = get_user_ns(current_user_ns());
>>>> +	new_nsp->audit_ns = get_audit_ns(new_audit);
>>>> +
>>>> +	switch_task_namespaces(current, new_nsp);
>>>
>>> Do you need to drop the old audit->ns refcount?
>>>
>>
>> task doesn't hold namespace's refcount directly. it hold ns's refcount
> 
> No but when you create the new_nsp with create_new_namespaces(),
> that new_nsp bumps the refcount on the init_audit_ns.  Then you
> point new_nsp to a new audit_ns, but you never drop the refcount
> on init_audit_ns.
> 

Yes, I got it.
Thanks for your explantion, will fix this problem in next version.

Thanks!

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-09 18:26       ` Serge Hallyn
@ 2013-12-10  8:16         ` Gao feng
  2013-12-10 16:51           ` Serge Hallyn
  0 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-12-10  8:16 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Serge E. Hallyn, linux-kernel, linux-audit, Richard Guy Briggs,
	containers, eparis, ebiederm, sgrubb

On 12/10/2013 02:26 AM, Serge Hallyn wrote:
> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>> On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>>>> Hi
>>>>
>>>> On 10/24/2013 03:31 PM, Gao feng wrote:
>>>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
>>>>>
>>>>> The main target of this patchset is allowing user in audit
>>>>> namespace to generate the USER_MSG type of audit message,
>>>>> some userspace tools need to generate audit message, or
>>>>> these tools will broken.
>>>>>
>>>>
>>>> I really need this feature, right now,some process such as
>>>> logind are broken in container becase we leak of this feature.
>>>
>>> Your set doesn't address loginuid though right?  How exactly do you
>>> expect to do that?  If user violates MAC policy and audit msg is
>>> sent to init user ns by mac subsys, you need the loginuid from
>>> init_audit_ns.  where will that be stored if you allow updates
>>> of loginuid in auditns?
>>>
>> This patchset doesn't include the loginuid part.
>>
>> the loginuid is stored in task as before.
>> In my opinion, when task creates a new audit namespace, this task's
>> loginuid will be reset to zero, so the children tasks can set their
>> loginuid. Does this change break the MAC?
> 
> I think so, yes.  In an LSPP selinux environment, if the task
> manages to trigger an selinux deny rule which is audited, then
> the loginuid must make sense on the host.  Now presumably it
> will get translated to the mapped host uid, and we can figure
> out the host uid owning it through /etc/subuid.  But that adds
> /etc/subuid as a new part of the TCB without any warning <shrug>
> So in that sense, for LSPP, it breaks it.
> 

Looks like my opinion is incorrect.

In the audit-next tree, Eric added a new audit feature to allow privileged
user to disable AUDIT_LOGINUID_IMMUTABLE. after AUDIT_LOGINUID_IMMUTABLE
is disabled, the privileged user can reset/set the loginuid of task. I
think this way is safe since only privileged user can do the change.

So I will not change the loginuid part.

Thanks for your information Serge :)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-10  8:16         ` Gao feng
@ 2013-12-10 16:51           ` Serge Hallyn
  2013-12-10 19:50             ` Eric Paris
  0 siblings, 1 reply; 50+ messages in thread
From: Serge Hallyn @ 2013-12-10 16:51 UTC (permalink / raw)
  To: Gao feng
  Cc: Richard Guy Briggs, containers, linux-kernel, eparis,
	linux-audit, ebiederm, sgrubb

Quoting Gao feng (gaofeng@cn.fujitsu.com):
> On 12/10/2013 02:26 AM, Serge Hallyn wrote:
> > Quoting Gao feng (gaofeng@cn.fujitsu.com):
> >> On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
> >>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
> >>>> Hi
> >>>>
> >>>> On 10/24/2013 03:31 PM, Gao feng wrote:
> >>>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
> >>>>>
> >>>>> The main target of this patchset is allowing user in audit
> >>>>> namespace to generate the USER_MSG type of audit message,
> >>>>> some userspace tools need to generate audit message, or
> >>>>> these tools will broken.
> >>>>>
> >>>>
> >>>> I really need this feature, right now,some process such as
> >>>> logind are broken in container becase we leak of this feature.
> >>>
> >>> Your set doesn't address loginuid though right?  How exactly do you
> >>> expect to do that?  If user violates MAC policy and audit msg is
> >>> sent to init user ns by mac subsys, you need the loginuid from
> >>> init_audit_ns.  where will that be stored if you allow updates
> >>> of loginuid in auditns?
> >>>
> >> This patchset doesn't include the loginuid part.
> >>
> >> the loginuid is stored in task as before.
> >> In my opinion, when task creates a new audit namespace, this task's
> >> loginuid will be reset to zero, so the children tasks can set their
> >> loginuid. Does this change break the MAC?
> > 
> > I think so, yes.  In an LSPP selinux environment, if the task
> > manages to trigger an selinux deny rule which is audited, then
> > the loginuid must make sense on the host.  Now presumably it
> > will get translated to the mapped host uid, and we can figure
> > out the host uid owning it through /etc/subuid.  But that adds
> > /etc/subuid as a new part of the TCB without any warning <shrug>
> > So in that sense, for LSPP, it breaks it.
> > 
> 
> Looks like my opinion is incorrect.
> 
> In the audit-next tree, Eric added a new audit feature to allow privileged
> user to disable AUDIT_LOGINUID_IMMUTABLE. after AUDIT_LOGINUID_IMMUTABLE
> is disabled, the privileged user can reset/set the loginuid of task. I
> think this way is safe since only privileged user can do the change.
> 
> So I will not change the loginuid part.
> 
> Thanks for your information Serge :)

Unfortunately this makes the patchset much less compelling :)  The
problem I was looking into is that a container running in a user
namespace cannot (bc he has ns_capable(CAP_AUDIT_*) but not
capable(CAP_AUDIT_*)) set loginuids at all.

Which from an LSPP pov is correct;  which is why I was hoping you were
going to have the audit namespaces be hierarchical, with a task in a
level 2 audit ns having two loginuids - one in his own auditns, and
one in the initial one.

-serge

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-10 16:51           ` Serge Hallyn
@ 2013-12-10 19:50             ` Eric Paris
  2013-12-10 20:36               ` Serge E. Hallyn
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Paris @ 2013-12-10 19:50 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Gao feng, Richard Guy Briggs, containers, linux-kernel,
	linux-audit, ebiederm, sgrubb

On Tue, 2013-12-10 at 10:51 -0600, Serge Hallyn wrote:
> Quoting Gao feng (gaofeng@cn.fujitsu.com):
> > On 12/10/2013 02:26 AM, Serge Hallyn wrote:
> > > Quoting Gao feng (gaofeng@cn.fujitsu.com):
> > >> On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
> > >>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
> > >>>> Hi
> > >>>>
> > >>>> On 10/24/2013 03:31 PM, Gao feng wrote:
> > >>>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
> > >>>>>
> > >>>>> The main target of this patchset is allowing user in audit
> > >>>>> namespace to generate the USER_MSG type of audit message,
> > >>>>> some userspace tools need to generate audit message, or
> > >>>>> these tools will broken.
> > >>>>>
> > >>>>
> > >>>> I really need this feature, right now,some process such as
> > >>>> logind are broken in container becase we leak of this feature.
> > >>>
> > >>> Your set doesn't address loginuid though right?  How exactly do you
> > >>> expect to do that?  If user violates MAC policy and audit msg is
> > >>> sent to init user ns by mac subsys, you need the loginuid from
> > >>> init_audit_ns.  where will that be stored if you allow updates
> > >>> of loginuid in auditns?
> > >>>
> > >> This patchset doesn't include the loginuid part.
> > >>
> > >> the loginuid is stored in task as before.
> > >> In my opinion, when task creates a new audit namespace, this task's
> > >> loginuid will be reset to zero, so the children tasks can set their
> > >> loginuid. Does this change break the MAC?
> > > 
> > > I think so, yes.  In an LSPP selinux environment, if the task
> > > manages to trigger an selinux deny rule which is audited, then
> > > the loginuid must make sense on the host.  Now presumably it
> > > will get translated to the mapped host uid, and we can figure
> > > out the host uid owning it through /etc/subuid.  But that adds
> > > /etc/subuid as a new part of the TCB without any warning <shrug>
> > > So in that sense, for LSPP, it breaks it.
> > > 
> > 
> > Looks like my opinion is incorrect.
> > 
> > In the audit-next tree, Eric added a new audit feature to allow privileged
> > user to disable AUDIT_LOGINUID_IMMUTABLE. after AUDIT_LOGINUID_IMMUTABLE
> > is disabled, the privileged user can reset/set the loginuid of task. I
> > think this way is safe since only privileged user can do the change.
> > 
> > So I will not change the loginuid part.
> > 
> > Thanks for your information Serge :)
> 
> Unfortunately this makes the patchset much less compelling :)  The
> problem I was looking into is that a container running in a user
> namespace cannot (bc he has ns_capable(CAP_AUDIT_*) but not
> capable(CAP_AUDIT_*)) set loginuids at all.
> 
> Which from an LSPP pov is correct;  which is why I was hoping you were
> going to have the audit namespaces be hierarchical, with a task in a
> level 2 audit ns having two loginuids - one in his own auditns, and
> one in the initial one.

Right now user namespace + audit is just total crud.  We all know
this...  (I'm not sure pid is must better, but I digress)   All thoughts
around loginuid in the kernel right this very moment only make sense in
the initial user namespace and all permission checks are in the initial
user namespace as well.

I think I'm a proponent of the hierarchical approach to audit
namespaces.  An audit namespace would hold a reference to the
pid/user/whatever namespace it was created in/with.  Each audit
namespace should have it's own set of filter rules, etc.  Instead of
just storing 'loginuid' we store 'loginuid+user namespace'.   When the
kernel creates a record it should translate the loginuid to the
namespace of the audit namespace and send the record.

It's a pretty major rewrite, but at least it makes sense.  Things like
AVC's might show up in multiple audit logs, but in every log they would
make sense to the admin of that namespace...

But what the hell do I know...  Namespaces scare me...


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-10 19:50             ` Eric Paris
@ 2013-12-10 20:36               ` Serge E. Hallyn
  2013-12-16  3:39                 ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Serge E. Hallyn @ 2013-12-10 20:36 UTC (permalink / raw)
  To: Eric Paris
  Cc: Serge Hallyn, Richard Guy Briggs, containers, linux-kernel,
	linux-audit, ebiederm, sgrubb

Quoting Eric Paris (eparis@redhat.com):
> On Tue, 2013-12-10 at 10:51 -0600, Serge Hallyn wrote:
> > Quoting Gao feng (gaofeng@cn.fujitsu.com):
> > > On 12/10/2013 02:26 AM, Serge Hallyn wrote:
> > > > Quoting Gao feng (gaofeng@cn.fujitsu.com):
> > > >> On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
> > > >>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
> > > >>>> Hi
> > > >>>>
> > > >>>> On 10/24/2013 03:31 PM, Gao feng wrote:
> > > >>>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
> > > >>>>>
> > > >>>>> The main target of this patchset is allowing user in audit
> > > >>>>> namespace to generate the USER_MSG type of audit message,
> > > >>>>> some userspace tools need to generate audit message, or
> > > >>>>> these tools will broken.
> > > >>>>>
> > > >>>>
> > > >>>> I really need this feature, right now,some process such as
> > > >>>> logind are broken in container becase we leak of this feature.
> > > >>>
> > > >>> Your set doesn't address loginuid though right?  How exactly do you
> > > >>> expect to do that?  If user violates MAC policy and audit msg is
> > > >>> sent to init user ns by mac subsys, you need the loginuid from
> > > >>> init_audit_ns.  where will that be stored if you allow updates
> > > >>> of loginuid in auditns?
> > > >>>
> > > >> This patchset doesn't include the loginuid part.
> > > >>
> > > >> the loginuid is stored in task as before.
> > > >> In my opinion, when task creates a new audit namespace, this task's
> > > >> loginuid will be reset to zero, so the children tasks can set their
> > > >> loginuid. Does this change break the MAC?
> > > > 
> > > > I think so, yes.  In an LSPP selinux environment, if the task
> > > > manages to trigger an selinux deny rule which is audited, then
> > > > the loginuid must make sense on the host.  Now presumably it
> > > > will get translated to the mapped host uid, and we can figure
> > > > out the host uid owning it through /etc/subuid.  But that adds
> > > > /etc/subuid as a new part of the TCB without any warning <shrug>
> > > > So in that sense, for LSPP, it breaks it.
> > > > 
> > > 
> > > Looks like my opinion is incorrect.
> > > 
> > > In the audit-next tree, Eric added a new audit feature to allow privileged
> > > user to disable AUDIT_LOGINUID_IMMUTABLE. after AUDIT_LOGINUID_IMMUTABLE
> > > is disabled, the privileged user can reset/set the loginuid of task. I
> > > think this way is safe since only privileged user can do the change.
> > > 
> > > So I will not change the loginuid part.
> > > 
> > > Thanks for your information Serge :)
> > 
> > Unfortunately this makes the patchset much less compelling :)  The
> > problem I was looking into is that a container running in a user
> > namespace cannot (bc he has ns_capable(CAP_AUDIT_*) but not
> > capable(CAP_AUDIT_*)) set loginuids at all.
> > 
> > Which from an LSPP pov is correct;  which is why I was hoping you were
> > going to have the audit namespaces be hierarchical, with a task in a
> > level 2 audit ns having two loginuids - one in his own auditns, and
> > one in the initial one.
> 
> Right now user namespace + audit is just total crud.  We all know
> this...  (I'm not sure pid is must better, but I digress)   All thoughts
> around loginuid in the kernel right this very moment only make sense in
> the initial user namespace and all permission checks are in the initial
> user namespace as well.
> 
> I think I'm a proponent of the hierarchical approach to audit
> namespaces.  An audit namespace would hold a reference to the
> pid/user/whatever namespace it was created in/with.  Each audit
> namespace should have it's own set of filter rules, etc.  Instead of
> just storing 'loginuid' we store 'loginuid+user namespace'.   When the

So long as the kernel stores the kuid_t (which the only sane thing to
do) that is a non-issue.

> kernel creates a record it should translate the loginuid to the
> namespace of the audit namespace and send the record.

Yup, that should go without saying.  Use kuid_t in kernel and translate
at the kernel-user boundary.

> It's a pretty major rewrite, but at least it makes sense.  Things like
> AVC's might show up in multiple audit logs, but in every log they would
> make sense to the admin of that namespace...
> 
> But what the hell do I know...

Exactly how it would all affect selinux.  I'm happy it seems we agree.

> Namespaces scare me...

-serge

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-10 20:36               ` Serge E. Hallyn
@ 2013-12-16  3:39                 ` Gao feng
  2013-12-20 21:15                   ` Serge E. Hallyn
  0 siblings, 1 reply; 50+ messages in thread
From: Gao feng @ 2013-12-16  3:39 UTC (permalink / raw)
  To: Serge E. Hallyn, Eric Paris
  Cc: Richard Guy Briggs, containers, Serge Hallyn, linux-kernel,
	linux-audit, ebiederm

On 12/11/2013 04:36 AM, Serge E. Hallyn wrote:
> Quoting Eric Paris (eparis@redhat.com):
>> On Tue, 2013-12-10 at 10:51 -0600, Serge Hallyn wrote:
>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>>>> On 12/10/2013 02:26 AM, Serge Hallyn wrote:
>>>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>>>>>> On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
>>>>>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> On 10/24/2013 03:31 PM, Gao feng wrote:
>>>>>>>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
>>>>>>>>>
>>>>>>>>> The main target of this patchset is allowing user in audit
>>>>>>>>> namespace to generate the USER_MSG type of audit message,
>>>>>>>>> some userspace tools need to generate audit message, or
>>>>>>>>> these tools will broken.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I really need this feature, right now,some process such as
>>>>>>>> logind are broken in container becase we leak of this feature.
>>>>>>>
>>>>>>> Your set doesn't address loginuid though right?  How exactly do you
>>>>>>> expect to do that?  If user violates MAC policy and audit msg is
>>>>>>> sent to init user ns by mac subsys, you need the loginuid from
>>>>>>> init_audit_ns.  where will that be stored if you allow updates
>>>>>>> of loginuid in auditns?
>>>>>>>
>>>>>> This patchset doesn't include the loginuid part.
>>>>>>
>>>>>> the loginuid is stored in task as before.
>>>>>> In my opinion, when task creates a new audit namespace, this task's
>>>>>> loginuid will be reset to zero, so the children tasks can set their
>>>>>> loginuid. Does this change break the MAC?
>>>>>
>>>>> I think so, yes.  In an LSPP selinux environment, if the task
>>>>> manages to trigger an selinux deny rule which is audited, then
>>>>> the loginuid must make sense on the host.  Now presumably it
>>>>> will get translated to the mapped host uid, and we can figure
>>>>> out the host uid owning it through /etc/subuid.  But that adds
>>>>> /etc/subuid as a new part of the TCB without any warning <shrug>
>>>>> So in that sense, for LSPP, it breaks it.
>>>>>
>>>>
>>>> Looks like my opinion is incorrect.
>>>>
>>>> In the audit-next tree, Eric added a new audit feature to allow privileged
>>>> user to disable AUDIT_LOGINUID_IMMUTABLE. after AUDIT_LOGINUID_IMMUTABLE
>>>> is disabled, the privileged user can reset/set the loginuid of task. I
>>>> think this way is safe since only privileged user can do the change.
>>>>
>>>> So I will not change the loginuid part.
>>>>
>>>> Thanks for your information Serge :)
>>>
>>> Unfortunately this makes the patchset much less compelling :)  The
>>> problem I was looking into is that a container running in a user
>>> namespace cannot (bc he has ns_capable(CAP_AUDIT_*) but not
>>> capable(CAP_AUDIT_*)) set loginuids at all.
>>>
>>> Which from an LSPP pov is correct;  which is why I was hoping you were
>>> going to have the audit namespaces be hierarchical, with a task in a
>>> level 2 audit ns having two loginuids - one in his own auditns, and
>>> one in the initial one.
>>
>> Right now user namespace + audit is just total crud.  We all know
>> this...  (I'm not sure pid is must better, but I digress)   All thoughts
>> around loginuid in the kernel right this very moment only make sense in
>> the initial user namespace and all permission checks are in the initial
>> user namespace as well.
>>
>> I think I'm a proponent of the hierarchical approach to audit
>> namespaces.  An audit namespace would hold a reference to the
>> pid/user/whatever namespace it was created in/with.  Each audit
>> namespace should have it's own set of filter rules, etc.  Instead of
>> just storing 'loginuid' we store 'loginuid+user namespace'.   When the
> 
> So long as the kernel stores the kuid_t (which the only sane thing to
> do) that is a non-issue.
> 
>> kernel creates a record it should translate the loginuid to the
>> namespace of the audit namespace and send the record.
> 
> Yup, that should go without saying.  Use kuid_t in kernel and translate
> at the kernel-user boundary.
> 

I can implement audit namespace as a hierarchy, give per auditns a level value
and a pointer which point to parent auditns.

but for the loginuid part, I think we can implement it after we push the audit
ns into the upstream.

Is this ok?
>> It's a pretty major rewrite, but at least it makes sense.  Things like
>> AVC's might show up in multiple audit logs, but in every log they would
>> make sense to the admin of that namespace...
>>
>> But what the hell do I know...
> 
> Exactly how it would all affect selinux.  I'm happy it seems we agree.

This idea looks good to me, I will Investigate this. :)

Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-16  3:39                 ` Gao feng
@ 2013-12-20 21:15                   ` Serge E. Hallyn
  2013-12-24  9:32                     ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Serge E. Hallyn @ 2013-12-20 21:15 UTC (permalink / raw)
  To: Gao feng
  Cc: Serge E. Hallyn, Eric Paris, Richard Guy Briggs, containers,
	Serge Hallyn, linux-kernel, linux-audit, ebiederm

Quoting Gao feng (gaofeng@cn.fujitsu.com):
> On 12/11/2013 04:36 AM, Serge E. Hallyn wrote:
> > Quoting Eric Paris (eparis@redhat.com):
> >> On Tue, 2013-12-10 at 10:51 -0600, Serge Hallyn wrote:
> >>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
> >>>> On 12/10/2013 02:26 AM, Serge Hallyn wrote:
> >>>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
> >>>>>> On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
> >>>>>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
> >>>>>>>> Hi
> >>>>>>>>
> >>>>>>>> On 10/24/2013 03:31 PM, Gao feng wrote:
> >>>>>>>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
> >>>>>>>>>
> >>>>>>>>> The main target of this patchset is allowing user in audit
> >>>>>>>>> namespace to generate the USER_MSG type of audit message,
> >>>>>>>>> some userspace tools need to generate audit message, or
> >>>>>>>>> these tools will broken.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I really need this feature, right now,some process such as
> >>>>>>>> logind are broken in container becase we leak of this feature.
> >>>>>>>
> >>>>>>> Your set doesn't address loginuid though right?  How exactly do you
> >>>>>>> expect to do that?  If user violates MAC policy and audit msg is
> >>>>>>> sent to init user ns by mac subsys, you need the loginuid from
> >>>>>>> init_audit_ns.  where will that be stored if you allow updates
> >>>>>>> of loginuid in auditns?
> >>>>>>>
> >>>>>> This patchset doesn't include the loginuid part.
> >>>>>>
> >>>>>> the loginuid is stored in task as before.
> >>>>>> In my opinion, when task creates a new audit namespace, this task's
> >>>>>> loginuid will be reset to zero, so the children tasks can set their
> >>>>>> loginuid. Does this change break the MAC?
> >>>>>
> >>>>> I think so, yes.  In an LSPP selinux environment, if the task
> >>>>> manages to trigger an selinux deny rule which is audited, then
> >>>>> the loginuid must make sense on the host.  Now presumably it
> >>>>> will get translated to the mapped host uid, and we can figure
> >>>>> out the host uid owning it through /etc/subuid.  But that adds
> >>>>> /etc/subuid as a new part of the TCB without any warning <shrug>
> >>>>> So in that sense, for LSPP, it breaks it.
> >>>>>
> >>>>
> >>>> Looks like my opinion is incorrect.
> >>>>
> >>>> In the audit-next tree, Eric added a new audit feature to allow privileged
> >>>> user to disable AUDIT_LOGINUID_IMMUTABLE. after AUDIT_LOGINUID_IMMUTABLE
> >>>> is disabled, the privileged user can reset/set the loginuid of task. I
> >>>> think this way is safe since only privileged user can do the change.
> >>>>
> >>>> So I will not change the loginuid part.
> >>>>
> >>>> Thanks for your information Serge :)
> >>>
> >>> Unfortunately this makes the patchset much less compelling :)  The
> >>> problem I was looking into is that a container running in a user
> >>> namespace cannot (bc he has ns_capable(CAP_AUDIT_*) but not
> >>> capable(CAP_AUDIT_*)) set loginuids at all.
> >>>
> >>> Which from an LSPP pov is correct;  which is why I was hoping you were
> >>> going to have the audit namespaces be hierarchical, with a task in a
> >>> level 2 audit ns having two loginuids - one in his own auditns, and
> >>> one in the initial one.
> >>
> >> Right now user namespace + audit is just total crud.  We all know
> >> this...  (I'm not sure pid is must better, but I digress)   All thoughts
> >> around loginuid in the kernel right this very moment only make sense in
> >> the initial user namespace and all permission checks are in the initial
> >> user namespace as well.
> >>
> >> I think I'm a proponent of the hierarchical approach to audit
> >> namespaces.  An audit namespace would hold a reference to the
> >> pid/user/whatever namespace it was created in/with.  Each audit
> >> namespace should have it's own set of filter rules, etc.  Instead of
> >> just storing 'loginuid' we store 'loginuid+user namespace'.   When the
> > 
> > So long as the kernel stores the kuid_t (which the only sane thing to
> > do) that is a non-issue.
> > 
> >> kernel creates a record it should translate the loginuid to the
> >> namespace of the audit namespace and send the record.
> > 
> > Yup, that should go without saying.  Use kuid_t in kernel and translate
> > at the kernel-user boundary.
> > 
> 
> I can implement audit namespace as a hierarchy, give per auditns a level value
> and a pointer which point to parent auditns.
> 
> but for the loginuid part, I think we can implement it after we push the audit
> ns into the upstream.
> 
> Is this ok?

Well as I"ve said the loginuid part is the only one that interests
me.  I'll be out most of the rest of the year, but I'll review any
patchset you send for what seems to me to be correctness :)

> >> It's a pretty major rewrite, but at least it makes sense.  Things like
> >> AVC's might show up in multiple audit logs, but in every log they would
> >> make sense to the admin of that namespace...
> >>
> >> But what the hell do I know...
> > 
> > Exactly how it would all affect selinux.  I'm happy it seems we agree.
> 
> This idea looks good to me, I will Investigate this. :)
> 
> Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-09  2:29   ` Gao feng
@ 2013-12-23 23:47     ` Richard Guy Briggs
  2013-12-24  9:53       ` Gao feng
  0 siblings, 1 reply; 50+ messages in thread
From: Richard Guy Briggs @ 2013-12-23 23:47 UTC (permalink / raw)
  To: Gao feng
  Cc: Serge E. Hallyn, containers, serge.hallyn, linux-kernel,
	linux-audit, ebiederm

On 13/12/09, Gao feng wrote:
> On 12/07/2013 05:31 AM, Serge E. Hallyn wrote:
> > Quoting Gao feng (gaofeng@cn.fujitsu.com):

> >> The main target of this patchset is allowing user in audit
> >> namespace to generate the USER_MSG type of audit message,
> >> some userspace tools need to generate audit message, or
> >> these tools will broken.
> >>
> >> And the login process in container may want to setup
> >> /proc/<pid>/loginuid, right now this value is unalterable
> >> once it being set. this will also broke the login problem
> >> in container. After this patchset, we can reset this loginuid
> >> to zero if task is running in a new audit namespace.
> >>
> >> Same with v1 patchset, in this patchset, only the privileged
> >> user in init_audit_ns and init_user_ns has rights to
> >> add/del audit rules. and these rules are gloabl. all
> >> audit namespace will comply with the rules.
> >>
> >> Compared with v1, v2 patch has some big changes.
> >> 1, the audit namespace is not assigned to user namespace.
> >>    since there is no available bit of flags for clone, we
> >>    create audit namespace through netlink, patch[18/20]
> >>    introduces a new audit netlink type AUDIT_CREATE_NS.
> >>    the privileged user in userns has rights to create a
> >>    audit namespace, it means the unprivileged user can
> >>    create auditns through create userns first. In order
> >>    to prevent them from doing harm to host, the default
> >>    audit_backlog_limit of un-init-audit-ns is zero(means
> >>    audit is unavailable in audit namespace). and it can't
> >>    be changed in auditns through netlink.
> > 
> > So the unprivileged user can create an audit-ns, but can't
> > then actually send any messages there?  I guess setting it
> > to something small would just be hacky?
> 
> Yes, if unprivileged user wants to send audit message, he should
> ask privileged user to setup the audit_backlog_limit for him.
> 
> I know it's a little of hack, but I don't have good idea :(

There's a recent patch that actually clarifies the way
audit_backlog_limit works, since different parts of the code seemed to
think different things.  It now means unlimited, since there are other
places to shut off logging.
	audit: allow unlimited backlog queue

At first, I'd say each audit_ns should be able to set its own
audit_backlog_limit.  The most obvious argument against this would be
the vulnerability of a DoS.  Could there be some automatic metrics to
set sub audit_ns backlog limits, such as default to the same as
init_audit_ns and have the init_audit_ns override those defaults?
Could this be done per user like ulimiit?

- RGB

--
Richard Guy Briggs <rbriggs@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-20 21:15                   ` Serge E. Hallyn
@ 2013-12-24  9:32                     ` Gao feng
  0 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-12-24  9:32 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Eric Paris, Richard Guy Briggs, containers, Serge Hallyn,
	linux-kernel, linux-audit, ebiederm

On 12/21/2013 05:15 AM, Serge E. Hallyn wrote:
> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>> On 12/11/2013 04:36 AM, Serge E. Hallyn wrote:
>>> Quoting Eric Paris (eparis@redhat.com):
>>>> On Tue, 2013-12-10 at 10:51 -0600, Serge Hallyn wrote:
>>>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>>>>>> On 12/10/2013 02:26 AM, Serge Hallyn wrote:
>>>>>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>>>>>>>> On 12/07/2013 06:12 AM, Serge E. Hallyn wrote:
>>>>>>>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
>>>>>>>>>> Hi
>>>>>>>>>>
>>>>>>>>>> On 10/24/2013 03:31 PM, Gao feng wrote:
>>>>>>>>>>> Here is the v1 patchset: http://lwn.net/Articles/549546/
>>>>>>>>>>>
>>>>>>>>>>> The main target of this patchset is allowing user in audit
>>>>>>>>>>> namespace to generate the USER_MSG type of audit message,
>>>>>>>>>>> some userspace tools need to generate audit message, or
>>>>>>>>>>> these tools will broken.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I really need this feature, right now,some process such as
>>>>>>>>>> logind are broken in container becase we leak of this feature.
>>>>>>>>>
>>>>>>>>> Your set doesn't address loginuid though right?  How exactly do you
>>>>>>>>> expect to do that?  If user violates MAC policy and audit msg is
>>>>>>>>> sent to init user ns by mac subsys, you need the loginuid from
>>>>>>>>> init_audit_ns.  where will that be stored if you allow updates
>>>>>>>>> of loginuid in auditns?
>>>>>>>>>
>>>>>>>> This patchset doesn't include the loginuid part.
>>>>>>>>
>>>>>>>> the loginuid is stored in task as before.
>>>>>>>> In my opinion, when task creates a new audit namespace, this task's
>>>>>>>> loginuid will be reset to zero, so the children tasks can set their
>>>>>>>> loginuid. Does this change break the MAC?
>>>>>>>
>>>>>>> I think so, yes.  In an LSPP selinux environment, if the task
>>>>>>> manages to trigger an selinux deny rule which is audited, then
>>>>>>> the loginuid must make sense on the host.  Now presumably it
>>>>>>> will get translated to the mapped host uid, and we can figure
>>>>>>> out the host uid owning it through /etc/subuid.  But that adds
>>>>>>> /etc/subuid as a new part of the TCB without any warning <shrug>
>>>>>>> So in that sense, for LSPP, it breaks it.
>>>>>>>
>>>>>>
>>>>>> Looks like my opinion is incorrect.
>>>>>>
>>>>>> In the audit-next tree, Eric added a new audit feature to allow privileged
>>>>>> user to disable AUDIT_LOGINUID_IMMUTABLE. after AUDIT_LOGINUID_IMMUTABLE
>>>>>> is disabled, the privileged user can reset/set the loginuid of task. I
>>>>>> think this way is safe since only privileged user can do the change.
>>>>>>
>>>>>> So I will not change the loginuid part.
>>>>>>
>>>>>> Thanks for your information Serge :)
>>>>>
>>>>> Unfortunately this makes the patchset much less compelling :)  The
>>>>> problem I was looking into is that a container running in a user
>>>>> namespace cannot (bc he has ns_capable(CAP_AUDIT_*) but not
>>>>> capable(CAP_AUDIT_*)) set loginuids at all.
>>>>>
>>>>> Which from an LSPP pov is correct;  which is why I was hoping you were
>>>>> going to have the audit namespaces be hierarchical, with a task in a
>>>>> level 2 audit ns having two loginuids - one in his own auditns, and
>>>>> one in the initial one.
>>>>
>>>> Right now user namespace + audit is just total crud.  We all know
>>>> this...  (I'm not sure pid is must better, but I digress)   All thoughts
>>>> around loginuid in the kernel right this very moment only make sense in
>>>> the initial user namespace and all permission checks are in the initial
>>>> user namespace as well.
>>>>
>>>> I think I'm a proponent of the hierarchical approach to audit
>>>> namespaces.  An audit namespace would hold a reference to the
>>>> pid/user/whatever namespace it was created in/with.  Each audit
>>>> namespace should have it's own set of filter rules, etc.  Instead of
>>>> just storing 'loginuid' we store 'loginuid+user namespace'.   When the
>>>
>>> So long as the kernel stores the kuid_t (which the only sane thing to
>>> do) that is a non-issue.
>>>
>>>> kernel creates a record it should translate the loginuid to the
>>>> namespace of the audit namespace and send the record.
>>>
>>> Yup, that should go without saying.  Use kuid_t in kernel and translate
>>> at the kernel-user boundary.
>>>
>>
>> I can implement audit namespace as a hierarchy, give per auditns a level value
>> and a pointer which point to parent auditns.
>>
>> but for the loginuid part, I think we can implement it after we push the audit
>> ns into the upstream.
>>
>> Is this ok?
> 
> Well as I"ve said the loginuid part is the only one that interests
> me.  I'll be out most of the rest of the year, but I'll review any
> patchset you send for what seems to me to be correctness :)
> 

Thanks for your help!
As soon as the frame of auditns being accepted, I think it's easily
to push the loginuid part. :)


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC Part1 PATCH 00/20 v2] Add namespace support for audit
  2013-12-23 23:47     ` Richard Guy Briggs
@ 2013-12-24  9:53       ` Gao feng
  0 siblings, 0 replies; 50+ messages in thread
From: Gao feng @ 2013-12-24  9:53 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Serge E. Hallyn, containers, serge.hallyn, linux-kernel,
	linux-audit, ebiederm

On 12/24/2013 07:47 AM, Richard Guy Briggs wrote:
> On 13/12/09, Gao feng wrote:
>> On 12/07/2013 05:31 AM, Serge E. Hallyn wrote:
>>> Quoting Gao feng (gaofeng@cn.fujitsu.com):
> 
>>>> The main target of this patchset is allowing user in audit
>>>> namespace to generate the USER_MSG type of audit message,
>>>> some userspace tools need to generate audit message, or
>>>> these tools will broken.
>>>>
>>>> And the login process in container may want to setup
>>>> /proc/<pid>/loginuid, right now this value is unalterable
>>>> once it being set. this will also broke the login problem
>>>> in container. After this patchset, we can reset this loginuid
>>>> to zero if task is running in a new audit namespace.
>>>>
>>>> Same with v1 patchset, in this patchset, only the privileged
>>>> user in init_audit_ns and init_user_ns has rights to
>>>> add/del audit rules. and these rules are gloabl. all
>>>> audit namespace will comply with the rules.
>>>>
>>>> Compared with v1, v2 patch has some big changes.
>>>> 1, the audit namespace is not assigned to user namespace.
>>>>    since there is no available bit of flags for clone, we
>>>>    create audit namespace through netlink, patch[18/20]
>>>>    introduces a new audit netlink type AUDIT_CREATE_NS.
>>>>    the privileged user in userns has rights to create a
>>>>    audit namespace, it means the unprivileged user can
>>>>    create auditns through create userns first. In order
>>>>    to prevent them from doing harm to host, the default
>>>>    audit_backlog_limit of un-init-audit-ns is zero(means
>>>>    audit is unavailable in audit namespace). and it can't
>>>>    be changed in auditns through netlink.
>>>
>>> So the unprivileged user can create an audit-ns, but can't
>>> then actually send any messages there?  I guess setting it
>>> to something small would just be hacky?
>>
>> Yes, if unprivileged user wants to send audit message, he should
>> ask privileged user to setup the audit_backlog_limit for him.
>>
>> I know it's a little of hack, but I don't have good idea :(
> 
> There's a recent patch that actually clarifies the way
> audit_backlog_limit works, since different parts of the code seemed to
> think different things.  It now means unlimited, since there are other
> places to shut off logging.
> 	audit: allow unlimited backlog queue

Yep, thanks for your information, we can set a negative number to backlog_limit
to mark there is no available buff for this audit ns.

> 
> At first, I'd say each audit_ns should be able to set its own
> audit_backlog_limit.  The most obvious argument against this would be
> the vulnerability of a DoS. 

There are two problem we should conside, auditns costs lot's of memory by
setting large backlog_limit and costs lot's of cpu resources by generating
audit log all the time. So I think the privileged user should have the ability
to limit the backlog len.

And I think it's not very necessary to keep on allowing auditns to set its own
audit_backlog_limit. if you think this is necessary, we can add a field max_backlog_limit
for per audit namespace. and set this value when we create auditns.

And seem like the audit_rate_limit should not be change by unprivileged user.

I don't know if I really follow your request...

> Could there be some automatic metrics to
> set sub audit_ns backlog limits, such as default to the same as
> init_audit_ns and have the init_audit_ns override those defaults?
> Could this be done per user like ulimiit?
> 

I think something like ulimit cannot help us.
we should set sub-auditns's backlog_limit in parent auditns..
so maybe the proc file is the best way.

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2013-12-24  9:52 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-24  7:31 [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
2013-10-24  7:31 ` [PATCH 01/20] Audit: make audit netlink socket net namespace unaware Gao feng
2013-10-24  7:31 ` [PATCH 02/20] audit: introduce configure option CONFIG_AUDIT_NS Gao feng
2013-10-24  7:31 ` [PATCH 03/20] audit: make audit_skb_queue per audit namespace Gao feng
2013-10-24  7:31 ` [PATCH 04/20] audit: make audit_skb_hold_queue " Gao feng
2013-10-24  7:31 ` [PATCH 05/20] audit: make audit_pid " Gao feng
2013-10-24  7:31 ` [PATCH 06/20] audit: make kauditd_task " Gao feng
2013-10-24  7:31 ` [PATCH 07/20] aduit: make audit_nlk_portid " Gao feng
2013-10-24  7:31 ` [PATCH 08/20] audit: make kaudit_wait queue " Gao feng
2013-10-24  7:31 ` [PATCH 09/20] audit: make audit_backlog_wait " Gao feng
2013-10-24  7:31 ` [PATCH 10/20] audit: allow un-init audit ns to change pid and portid only Gao feng
2013-10-24  7:31 ` [PATCH 11/20] audit: use proper audit namespace in audit_receive_msg Gao feng
2013-10-24  7:31 ` [PATCH 12/20] audit: use proper audit_namespace in kauditd_thread Gao feng
2013-10-24  7:31 ` [PATCH 13/20] audit: introduce new audit logging interface for audit namespace Gao feng
2013-10-24  7:31 ` [PATCH 14/20] audit: pass proper audit namespace to audit_log_common_recv_msg Gao feng
2013-10-24  7:32 ` [PATCH 15/20] audit: Log audit pid config change in audit namespace Gao feng
2013-10-24  7:32 ` [PATCH 16/20] audit: allow GET,SET,USER MSG operations " Gao feng
2013-12-06 22:00   ` [PATCH 16/20] audit: allow GET, SET, USER " Serge E. Hallyn
2013-12-09  1:47     ` Gao feng
2013-10-24  7:32 ` [PATCH 17/20] nsproxy: don't make create_new_namespaces static Gao feng
2013-10-24  7:32 ` [PATCH 18/20] audit: add new message type AUDIT_CREATE_NS Gao feng
2013-12-06 22:10   ` Serge E. Hallyn
2013-12-09  1:59     ` Gao feng
2013-12-09 17:53       ` Serge Hallyn
2013-12-10  5:34         ` Gao feng
2013-10-24  7:32 ` [PATCH 19/20] audit: make audit_backlog_limit per audit namespace Gao feng
2013-10-24  7:32 ` [PATCH 20/20] audit: introduce /proc/<pid>/audit_backlog_limit Gao feng
2013-10-31  3:52 ` [RFC Part1 PATCH 00/20 v2] Add namespace support for audit Gao feng
2013-11-05  7:51   ` Gao feng
2013-11-05  7:52     ` Gao feng
2013-11-05  8:11       ` Li Zefan
2013-11-05  8:56         ` Gao feng
2013-11-05 19:14           ` Richard Guy Briggs
2013-11-07  5:51             ` Gao feng
2013-11-21  7:57             ` Gao feng
2013-12-04  8:31 ` Gao feng
2013-12-06 22:12   ` Serge E. Hallyn
2013-12-09  2:06     ` Gao feng
2013-12-09 18:26       ` Serge Hallyn
2013-12-10  8:16         ` Gao feng
2013-12-10 16:51           ` Serge Hallyn
2013-12-10 19:50             ` Eric Paris
2013-12-10 20:36               ` Serge E. Hallyn
2013-12-16  3:39                 ` Gao feng
2013-12-20 21:15                   ` Serge E. Hallyn
2013-12-24  9:32                     ` Gao feng
2013-12-06 21:31 ` Serge E. Hallyn
2013-12-09  2:29   ` Gao feng
2013-12-23 23:47     ` Richard Guy Briggs
2013-12-24  9:53       ` Gao feng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).