linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] Sysctl namespace support
       [not found] <4742C73C.3010904@openvz.org>
@ 2007-11-29 17:40 ` Eric W. Biederman
  2007-11-29 17:45   ` [PATCH 1/4] sysctl: Add register_sysctl_paths function Eric W. Biederman
  2007-11-30 12:56   ` [PATCH 0/4] Sysctl namespace support Herbert Xu
  0 siblings, 2 replies; 11+ messages in thread
From: Eric W. Biederman @ 2007-11-29 17:40 UTC (permalink / raw)
  To: Herbert Xu, Andrew Morton
  Cc: Serge Hallyn, Daniel Lezcano, Cedric Le Goater, Linux Containers,
	Pavel Emelyanov, netdev, linux-kernel, David Miller


Currently the network namespace work has gotten about as far as we can
without the ability to make sysctls that are per network namespace.

The techniques we have been using for other namespaces, examining
current and replacing the ctl_table.data field depending on the
namespace instance that current->nsproxy refers to, are both ugly
and do not work for the network sysctls.

The cases in the networking sysctls that do not work with
the existing ugly pointer munging techniques are directories like
/proc/sys/net/ipv4/conf/ and /proc/sys/net/ipv4/neigh/, whose contents
vary depending on the networking devices present in the network
namespace.

Adding support to the sysctl infrastructure for registering
a sysctl table for a particular instance of a particular namespace
removes the need for magic sysctl methods, and allows the use
of the techniques for managing dynamic sysctl tables that have been
used for years in the network stack.



Herbert, we need this infrastructure most in net-2.6.25 (as not having
it is a current bottleneck to further development of the network
namespace) so these patches are against net-2.6.25.

Andrew, we also need this infrastructure in -mm so that we can take
advantage of this new infrastructure when implementing other
namespaces.

So I expect the sane way to deal with this patchset is to merge it into
both net-2.6.25 and -mm, and then Andrew can drop or disable the
patches once he bases -mm on a version of net-2.6.25 with
the changes.

Eric


* [PATCH 1/4] sysctl: Add register_sysctl_paths function
  2007-11-29 17:40 ` [PATCH 0/4] Sysctl namespace support Eric W. Biederman
@ 2007-11-29 17:45   ` Eric W. Biederman
  2007-11-29 17:46     ` [PATCH 2/4] sysctl: Remember the ctl_table we passed to register_sysctl_paths Eric W. Biederman
  2007-11-30 12:56   ` [PATCH 0/4] Sysctl namespace support Herbert Xu
  1 sibling, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2007-11-29 17:45 UTC (permalink / raw)
  To: Herbert Xu, Andrew Morton
  Cc: Serge Hallyn, Daniel Lezcano, Cedric Le Goater, Linux Containers,
	Pavel Emelyanov, netdev, linux-kernel, David Miller, Olaf Kirch,
	Olaf Hering


There are a number of modules that register a sysctl table
somewhere deeply nested in the sysctl hierarchy, such as
fs/nfs, fs/xfs, dev/cdrom, etc.

They all specify several dummy ctl_tables for the path name.
This patch implements register_sysctl_paths, which takes
an additional path name and makes up dummy sysctl nodes
for each component.
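
As a rough illustration of the idea (not the kernel code itself), the
chaining of dummy directory nodes can be sketched in userspace C.  The
structures below are cut-down stand-ins that keep only the fields needed
here, calloc replaces kzalloc, and build_path_chain is a hypothetical
name chosen for this sketch:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures (assumption: only
 * the fields needed to show the chaining are kept). */
struct ctl_table {
	int ctl_name;
	const char *procname;
	int mode;
	struct ctl_table *child;
};

struct ctl_path {
	const char *procname;
	int ctl_name;
};

struct ctl_table_header {
	struct ctl_table *ctl_table;
};

/* Build one dummy directory entry per path component and hang
 * `table` below the last one.  Returns NULL on allocation failure. */
struct ctl_table_header *build_path_chain(const struct ctl_path *path,
					  struct ctl_table *table)
{
	struct ctl_table_header *header;
	struct ctl_table *new, **prevp;
	unsigned int n, npath;

	assert(path && table);

	/* Count the components; an all-zero entry terminates the path. */
	for (npath = 0; path[npath].ctl_name || path[npath].procname; ++npath)
		;

	/* Header plus two entries (directory + zeroed sentinel) per
	 * component, in one allocation so a single free() releases it. */
	header = calloc(1, sizeof(*header) +
			   2 * npath * sizeof(struct ctl_table));
	if (!header)
		return NULL;

	new = (struct ctl_table *)(header + 1);
	prevp = &header->ctl_table;
	for (n = 0; n < npath; ++n, ++path) {
		new->procname = path->procname;
		new->ctl_name = path->ctl_name;
		new->mode = 0555;	/* read/search-only directory */
		*prevp = new;		/* link from parent (or header) */
		prevp = &new->child;
		new += 2;		/* skip over the sentinel slot */
	}
	*prevp = table;			/* caller's table goes at the bottom */
	return header;
}
```

With an empty path the loop body never runs and header->ctl_table
points straight at the caller's table, which is how the plain
register_sysctl_table case falls out of the same code.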

This patch was originally written by Olaf Kirch and
brought to my attention and reworked some by Olaf Hering.
I have changed a few additional things so the bugs are mine.

After converting all of the easy callers, Olaf Hering observed that for
allyesconfig with ARCH=i386 the patch reduces the final binary size by
6111 bytes.

.text +897
.data -7008

    text      data      bss       dec      hex  filename
26959310   4045899  4718592  35723801  2211a19  ../vmlinux-vanilla
26960207   4038891  4718592  35717690  221023a  ../O-allyesconfig/vmlinux

So this change is both a space savings and a code simplification.

CC: Olaf Kirch <okir@suse.de>
CC: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 include/linux/sysctl.h |    9 +++++
 kernel/sysctl.c        |   90 ++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index e99171f..eb522bf 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -1065,7 +1065,16 @@ struct ctl_table_header
 	struct completion *unregistering;
 };
 
+/* struct ctl_path describes where in the hierarchy a table is added */
+struct ctl_path
+{
+	const char *procname;
+	int ctl_name;
+};
+
 struct ctl_table_header *register_sysctl_table(struct ctl_table * table);
+struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
+						struct ctl_table *table);
 
 void unregister_sysctl_table(struct ctl_table_header * table);
 int sysctl_check_table(struct ctl_table *table);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 0deed82..fa92e70 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1490,11 +1490,12 @@ static __init int sysctl_init(void)
 core_initcall(sysctl_init);
 
 /**
- * register_sysctl_table - register a sysctl hierarchy
+ * register_sysctl_paths - register a sysctl hierarchy
+ * @path: The path to the directory the sysctl table is in.
  * @table: the top-level table structure
  *
  * Register a sysctl table hierarchy. @table should be a filled in ctl_table
- * array. An entry with a ctl_name of 0 terminates the table. 
+ * array. A completely 0 filled entry terminates the table.
  *
  * The members of the &struct ctl_table structure are used as follows:
  *
@@ -1557,28 +1558,80 @@ core_initcall(sysctl_init);
  * This routine returns %NULL on a failure to register, and a pointer
  * to the table header on success.
  */
-struct ctl_table_header *register_sysctl_table(struct ctl_table * table)
+struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
+						struct ctl_table *table)
 {
-	struct ctl_table_header *tmp;
-	tmp = kmalloc(sizeof(struct ctl_table_header), GFP_KERNEL);
-	if (!tmp)
+	struct ctl_table_header *header;
+	struct ctl_table *new, **prevp;
+	unsigned int n, npath;
+
+	/* Count the path components */
+	for (npath = 0; path[npath].ctl_name || path[npath].procname; ++npath)
+		;
+
+	/*
+	 * For each path component, allocate a 2-element ctl_table array.
+	 * The first array element will be filled with the sysctl entry
+	 * for this, the second will be the sentinel (ctl_name == 0).
+	 *
+	 * We allocate everything in one go so that we don't have to
+	 * worry about freeing additional memory in unregister_sysctl_table.
+	 */
+	header = kzalloc(sizeof(struct ctl_table_header) +
+			 (2 * npath * sizeof(struct ctl_table)), GFP_KERNEL);
+	if (!header)
 		return NULL;
-	tmp->ctl_table = table;
-	INIT_LIST_HEAD(&tmp->ctl_entry);
-	tmp->used = 0;
-	tmp->unregistering = NULL;
-	sysctl_set_parent(NULL, table);
-	if (sysctl_check_table(tmp->ctl_table)) {
-		kfree(tmp);
+
+	new = (struct ctl_table *) (header + 1);
+
+	/* Now connect the dots */
+	prevp = &header->ctl_table;
+	for (n = 0; n < npath; ++n, ++path) {
+		/* Copy the procname */
+		new->procname = path->procname;
+		new->ctl_name = path->ctl_name;
+		new->mode     = 0555;
+
+		*prevp = new;
+		prevp = &new->child;
+
+		new += 2;
+	}
+	*prevp = table;
+
+	INIT_LIST_HEAD(&header->ctl_entry);
+	header->used = 0;
+	header->unregistering = NULL;
+	sysctl_set_parent(NULL, header->ctl_table);
+	if (sysctl_check_table(header->ctl_table)) {
+		kfree(header);
 		return NULL;
 	}
 	spin_lock(&sysctl_lock);
-	list_add_tail(&tmp->ctl_entry, &root_table_header.ctl_entry);
+	list_add_tail(&header->ctl_entry, &root_table_header.ctl_entry);
 	spin_unlock(&sysctl_lock);
-	return tmp;
+
+	return header;
 }
 
 /**
+ * register_sysctl_table - register a sysctl table hierarchy
+ * @table: the top-level table structure
+ *
+ * Register a sysctl table hierarchy. @table should be a filled in ctl_table
+ * array. A completely 0 filled entry terminates the table.
+ *
+ * See register_sysctl_paths for more details.
+ */
+struct ctl_table_header *register_sysctl_table(struct ctl_table * table)
+{
+	static const struct ctl_path null_path[] = { {} };
+
+	return register_sysctl_paths(null_path, table);
+}
+
+
+/**
  * unregister_sysctl_table - unregister a sysctl table hierarchy
  * @header: the header returned from register_sysctl_table
  *
@@ -1600,6 +1653,12 @@ struct ctl_table_header *register_sysctl_table(struct ctl_table * table)
 	return NULL;
 }
 
+struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
+						    struct ctl_table *table)
+{
+	return NULL;
+}
+
 void unregister_sysctl_table(struct ctl_table_header * table)
 {
 }
@@ -2658,6 +2717,7 @@ EXPORT_SYMBOL(proc_dostring);
 EXPORT_SYMBOL(proc_doulongvec_minmax);
 EXPORT_SYMBOL(proc_doulongvec_ms_jiffies_minmax);
 EXPORT_SYMBOL(register_sysctl_table);
+EXPORT_SYMBOL(register_sysctl_paths);
 EXPORT_SYMBOL(sysctl_intvec);
 EXPORT_SYMBOL(sysctl_jiffies);
 EXPORT_SYMBOL(sysctl_ms_jiffies);
-- 
1.5.3.rc6.17.g1911



* [PATCH 2/4] sysctl: Remember the ctl_table we passed to register_sysctl_paths
  2007-11-29 17:45   ` [PATCH 1/4] sysctl: Add register_sysctl_paths function Eric W. Biederman
@ 2007-11-29 17:46     ` Eric W. Biederman
  2007-11-29 17:51       ` [PATCH 3/4] sysctl: Infrastructure for per namespace sysctls Eric W. Biederman
  0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2007-11-29 17:46 UTC (permalink / raw)
  To: Herbert Xu, Andrew Morton
  Cc: Serge Hallyn, Daniel Lezcano, Cedric Le Goater, Linux Containers,
	Pavel Emelyanov, netdev, linux-kernel, David Miller, Olaf Kirch,
	Olaf Hering


By doing this we allow users of register_sysctl_paths that build
and dynamically allocate their ctl_table to be simpler.  This allows
them to just remember the ctl_table_header returned from
register_sysctl_paths from which they can now find the
ctl_table array they need to free.
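
A minimal userspace sketch of the pattern this enables, under stated
assumptions: mock_register stands in for register_sysctl_paths,
free(header) stands in for unregister_sysctl_table, and the
example_dev_* functions and the cut-down structures are hypothetical
names used only for illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down stand-ins for the kernel structures (assumption). */
struct ctl_table {
	int ctl_name;
	const char *procname;
	void *data;
	struct ctl_table *child;
};

struct ctl_table_header {
	struct ctl_table *ctl_table;	 /* head of the registered chain */
	struct ctl_table *ctl_table_arg; /* table exactly as the caller passed it */
};

/* Hypothetical stand-in for register_sysctl_paths: puts one dummy
 * directory above the caller's table, so ctl_table != ctl_table_arg. */
struct ctl_table_header *mock_register(struct ctl_table *table)
{
	struct ctl_table_header *header;
	struct ctl_table *dir;

	header = calloc(1, sizeof(*header) + 2 * sizeof(struct ctl_table));
	if (!header)
		return NULL;
	dir = (struct ctl_table *)(header + 1);
	dir->procname = "dir";
	dir->child = table;
	header->ctl_table = dir;
	header->ctl_table_arg = table;
	return header;
}

/* Caller side: build a table dynamically and keep only the header. */
struct ctl_table_header *example_dev_register(int *value)
{
	struct ctl_table *table = calloc(2, sizeof(*table)); /* entry + sentinel */
	struct ctl_table_header *header;

	if (!table)
		return NULL;
	table[0].ctl_name = 1;
	table[0].procname = "example";
	table[0].data = value;

	header = mock_register(table);
	if (!header)
		free(table);
	return header;
}

void example_dev_unregister(struct ctl_table_header *header)
{
	/* Fetch the table first: the header is gone after unregistering. */
	struct ctl_table *table = header->ctl_table_arg;

	free(header);	/* stands in for unregister_sysctl_table() */
	free(table);
}
```

The caller never stores the table pointer separately; the header is the
only handle it has to remember.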

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 include/linux/sysctl.h |    1 +
 kernel/sysctl.c        |    1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index eb522bf..8b2e9e0 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -1063,6 +1063,7 @@ struct ctl_table_header
 	struct list_head ctl_entry;
 	int used;
 	struct completion *unregistering;
+	struct ctl_table *ctl_table_arg;
 };
 
 /* struct ctl_path describes where in the hierarchy a table is added */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index fa92e70..effae87 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1598,6 +1598,7 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
 		new += 2;
 	}
 	*prevp = table;
+	header->ctl_table_arg = table;
 
 	INIT_LIST_HEAD(&header->ctl_entry);
 	header->used = 0;
-- 
1.5.3.rc6.17.g1911



* [PATCH 3/4] sysctl: Infrastructure for per namespace sysctls
  2007-11-29 17:46     ` [PATCH 2/4] sysctl: Remember the ctl_table we passed to register_sysctl_paths Eric W. Biederman
@ 2007-11-29 17:51       ` Eric W. Biederman
  2007-11-29 17:53         ` [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure Eric W. Biederman
  0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2007-11-29 17:51 UTC (permalink / raw)
  To: Herbert Xu, Andrew Morton
  Cc: Serge Hallyn, Daniel Lezcano, Cedric Le Goater, Linux Containers,
	Pavel Emelyanov, netdev, linux-kernel, David Miller


This patch implements the basic infrastructure for per namespace sysctls.

A list of lists of sysctl headers is added, allowing each namespace to have
its own list of sysctl headers.

Each list of sysctl headers has a lookup function to find the first
sysctl header in the list, allowing the lists to have a per namespace
instance.

register_sysctl_root is added to tell sysctl.c about additional
lists of sysctl headers.  As all of the users are expected to be
in-kernel, no unregister function is provided.

sysctl_head_next is updated to walk through the list of lists.

__register_sysctl_paths is added to add a new sysctl table on
a non-default sysctl list.

The only intrusive part of this patch is propagating the information
needed to decide which list of sysctls to use for sysctl_check_table.
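
The list-of-lists idea can be sketched in userspace C roughly as
follows.  This is a simplification, not the kernel code: singly linked
lists replace list_head, locking is omitted, and struct nsproxy_mock,
add_header, count_visible and net_lookup are hypothetical names
introduced only for this sketch:

```c
#include <assert.h>
#include <stddef.h>

struct header {
	const char *name;
	struct header *next;
};

struct header_list {
	struct header *first, *last;
};

/* Hypothetical stand-in for struct nsproxy: one per-namespace list. */
struct nsproxy_mock {
	struct header_list net_headers;
};

struct root {
	struct root *next;		/* next root in the list of roots */
	struct header_list header_list;	/* default list, used when lookup is NULL */
	struct header_list *(*lookup)(struct root *root,
				      struct nsproxy_mock *ns);
};

/* Pick the header list of this root as seen from namespace set ns. */
struct header_list *lookup_header_list(struct root *root,
				       struct nsproxy_mock *ns)
{
	return root->lookup ? root->lookup(root, ns) : &root->header_list;
}

/* Append a header to whichever list the root selects for ns. */
void add_header(struct root *root, struct nsproxy_mock *ns, struct header *h)
{
	struct header_list *list = lookup_header_list(root, ns);

	h->next = NULL;
	if (list->last)
		list->last->next = h;
	else
		list->first = h;
	list->last = h;
}

/* Count the headers visible from ns by walking every root's list. */
size_t count_visible(struct root *roots, struct nsproxy_mock *ns)
{
	size_t n = 0;

	for (struct root *r = roots; r; r = r->next)
		for (struct header *h = lookup_header_list(r, ns)->first;
		     h; h = h->next)
			n++;
	return n;
}

/* A per-namespace root: the list lives in the namespace, not the root. */
struct header_list *net_lookup(struct root *root, struct nsproxy_mock *ns)
{
	(void)root;
	return &ns->net_headers;
}
```

A root without a lookup function behaves like the existing global list,
while a root with one shows different headers depending on which
namespace set is passed in.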

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 include/linux/sysctl.h |   16 ++++++++-
 kernel/sysctl.c        |   93 ++++++++++++++++++++++++++++++++++++++++++------
 kernel/sysctl_check.c  |   25 +++++++------
 3 files changed, 111 insertions(+), 23 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 8b2e9e0..cd1da5c 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -951,7 +951,9 @@ enum
 
 /* For the /proc/sys support */
 struct ctl_table;
+struct nsproxy;
 extern struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev);
+extern struct ctl_table_header *__sysctl_head_next(struct nsproxy *namespaces, struct ctl_table_header *prev);
 extern void sysctl_head_finish(struct ctl_table_header *prev);
 extern int sysctl_perm(struct ctl_table *table, int op);
 
@@ -1055,6 +1057,13 @@ struct ctl_table
 	void *extra2;
 };
 
+struct ctl_table_root {
+	struct list_head root_list;
+	struct list_head header_list;
+	struct list_head *(*lookup)(struct ctl_table_root *root,
+					   struct nsproxy *namespaces);
+};
+
 /* struct ctl_table_header is used to maintain dynamic lists of
    struct ctl_table trees. */
 struct ctl_table_header
@@ -1064,6 +1073,7 @@ struct ctl_table_header
 	int used;
 	struct completion *unregistering;
 	struct ctl_table *ctl_table_arg;
+	struct ctl_table_root *root;
 };
 
 /* struct ctl_path describes where in the hierarchy a table is added */
@@ -1073,12 +1083,16 @@ struct ctl_path
 	int ctl_name;
 };
 
+void register_sysctl_root(struct ctl_table_root *root);
+struct ctl_table_header *__register_sysctl_paths(
+	struct ctl_table_root *root, struct nsproxy *namespaces,
+	const struct ctl_path *path, struct ctl_table *table);
 struct ctl_table_header *register_sysctl_table(struct ctl_table * table);
 struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
 						struct ctl_table *table);
 
 void unregister_sysctl_table(struct ctl_table_header * table);
-int sysctl_check_table(struct ctl_table *table);
+int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table);
 
 #else /* __KERNEL__ */
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index effae87..ad4b709 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -156,8 +156,16 @@ static int proc_dointvec_taint(struct ctl_table *table, int write, struct file *
 #endif
 
 static struct ctl_table root_table[];
-static struct ctl_table_header root_table_header =
-	{ root_table, LIST_HEAD_INIT(root_table_header.ctl_entry) };
+static struct ctl_table_root sysctl_table_root;
+static struct ctl_table_header root_table_header = {
+	.ctl_table = root_table,
+	.ctl_entry = LIST_HEAD_INIT(sysctl_table_root.header_list),
+	.root = &sysctl_table_root,
+};
+static struct ctl_table_root sysctl_table_root = {
+	.root_list = LIST_HEAD_INIT(sysctl_table_root.root_list),
+	.header_list = LIST_HEAD_INIT(root_table_header.ctl_entry),
+};
 
 static struct ctl_table kern_table[];
 static struct ctl_table vm_table[];
@@ -1300,12 +1308,27 @@ void sysctl_head_finish(struct ctl_table_header *head)
 	spin_unlock(&sysctl_lock);
 }
 
-struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev)
+static struct list_head *
+lookup_header_list(struct ctl_table_root *root, struct nsproxy *namespaces)
 {
+	struct list_head *header_list;
+	header_list = &root->header_list;
+	if (root->lookup)
+		header_list = root->lookup(root, namespaces);
+	return header_list;
+}
+
+struct ctl_table_header *__sysctl_head_next(struct nsproxy *namespaces,
+					    struct ctl_table_header *prev)
+{
+	struct ctl_table_root *root;
+	struct list_head *header_list;
 	struct ctl_table_header *head;
 	struct list_head *tmp;
+
 	spin_lock(&sysctl_lock);
 	if (prev) {
+		head = prev;
 		tmp = &prev->ctl_entry;
 		unuse_table(prev);
 		goto next;
@@ -1319,14 +1342,38 @@ struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev)
 		spin_unlock(&sysctl_lock);
 		return head;
 	next:
+		root = head->root;
 		tmp = tmp->next;
-		if (tmp == &root_table_header.ctl_entry)
-			break;
+		header_list = lookup_header_list(root, namespaces);
+		if (tmp != header_list)
+			continue;
+
+		do {
+			root = list_entry(root->root_list.next,
+					struct ctl_table_root, root_list);
+			if (root == &sysctl_table_root)
+				goto out;
+			header_list = lookup_header_list(root, namespaces);
+		} while (list_empty(header_list));
+		tmp = header_list->next;
 	}
+out:
 	spin_unlock(&sysctl_lock);
 	return NULL;
 }
 
+struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev)
+{
+	return __sysctl_head_next(current->nsproxy, prev);
+}
+
+void register_sysctl_root(struct ctl_table_root *root)
+{
+	spin_lock(&sysctl_lock);
+	list_add_tail(&root->root_list, &sysctl_table_root.root_list);
+	spin_unlock(&sysctl_lock);
+}
+
 #ifdef CONFIG_SYSCTL_SYSCALL
 int do_sysctl(int __user *name, int nlen, void __user *oldval, size_t __user *oldlenp,
 	       void __user *newval, size_t newlen)
@@ -1483,14 +1530,16 @@ static __init int sysctl_init(void)
 {
 	int err;
 	sysctl_set_parent(NULL, root_table);
-	err = sysctl_check_table(root_table);
+	err = sysctl_check_table(current->nsproxy, root_table);
 	return 0;
 }
 
 core_initcall(sysctl_init);
 
 /**
- * register_sysctl_paths - register a sysctl hierarchy
+ * __register_sysctl_paths - register a sysctl hierarchy
+ * @root: List of sysctl headers to register on
+ * @namespaces: Data to compute which lists of sysctl entries are visible
  * @path: The path to the directory the sysctl table is in.
  * @table: the top-level table structure
  *
@@ -1558,9 +1607,12 @@ core_initcall(sysctl_init);
  * This routine returns %NULL on a failure to register, and a pointer
  * to the table header on success.
  */
-struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
-						struct ctl_table *table)
+struct ctl_table_header *__register_sysctl_paths(
+	struct ctl_table_root *root,
+	struct nsproxy *namespaces,
+	const struct ctl_path *path, struct ctl_table *table)
 {
+	struct list_head *header_list;
 	struct ctl_table_header *header;
 	struct ctl_table *new, **prevp;
 	unsigned int n, npath;
@@ -1603,19 +1655,38 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
 	INIT_LIST_HEAD(&header->ctl_entry);
 	header->used = 0;
 	header->unregistering = NULL;
+	header->root = root;
 	sysctl_set_parent(NULL, header->ctl_table);
-	if (sysctl_check_table(header->ctl_table)) {
+	if (sysctl_check_table(namespaces, header->ctl_table)) {
 		kfree(header);
 		return NULL;
 	}
 	spin_lock(&sysctl_lock);
-	list_add_tail(&header->ctl_entry, &root_table_header.ctl_entry);
+	header_list = lookup_header_list(root, namespaces);
+	list_add_tail(&header->ctl_entry, header_list);
 	spin_unlock(&sysctl_lock);
 
 	return header;
 }
 
 /**
+ * register_sysctl_paths - register a sysctl table hierarchy
+ * @path: The path to the directory the sysctl table is in.
+ * @table: the top-level table structure
+ *
+ * Register a sysctl table hierarchy. @table should be a filled in ctl_table
+ * array. A completely 0 filled entry terminates the table.
+ *
+ * See __register_sysctl_paths for more details.
+ */
+struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
+						struct ctl_table *table)
+{
+	return __register_sysctl_paths(&sysctl_table_root, current->nsproxy,
+					path, table);
+}
+
+/**
  * register_sysctl_table - register a sysctl table hierarchy
  * @table: the top-level table structure
  *
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index fdfca0d..2544852 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -1352,7 +1352,8 @@ static void sysctl_repair_table(struct ctl_table *table)
 	}
 }
 
-static struct ctl_table *sysctl_check_lookup(struct ctl_table *table)
+static struct ctl_table *sysctl_check_lookup(struct nsproxy *namespaces,
+						struct ctl_table *table)
 {
 	struct ctl_table_header *head;
 	struct ctl_table *ref, *test;
@@ -1360,8 +1361,8 @@ static struct ctl_table *sysctl_check_lookup(struct ctl_table *table)
 
 	depth = sysctl_depth(table);
 
-	for (head = sysctl_head_next(NULL); head;
-	     head = sysctl_head_next(head)) {
+	for (head = __sysctl_head_next(namespaces, NULL); head;
+	     head = __sysctl_head_next(namespaces, head)) {
 		cur_depth = depth;
 		ref = head->ctl_table;
 repeat:
@@ -1406,13 +1407,14 @@ static void set_fail(const char **fail, struct ctl_table *table, const char *str
 	*fail = str;
 }
 
-static int sysctl_check_dir(struct ctl_table *table)
+static int sysctl_check_dir(struct nsproxy *namespaces,
+				struct ctl_table *table)
 {
 	struct ctl_table *ref;
 	int error;
 
 	error = 0;
-	ref = sysctl_check_lookup(table);
+	ref = sysctl_check_lookup(namespaces, table);
 	if (ref) {
 		int match = 0;
 		if ((!table->procname && !ref->procname) ||
@@ -1437,11 +1439,12 @@ static int sysctl_check_dir(struct ctl_table *table)
 	return error;
 }
 
-static void sysctl_check_leaf(struct ctl_table *table, const char **fail)
+static void sysctl_check_leaf(struct nsproxy *namespaces,
+				struct ctl_table *table, const char **fail)
 {
 	struct ctl_table *ref;
 
-	ref = sysctl_check_lookup(table);
+	ref = sysctl_check_lookup(namespaces, table);
 	if (ref && (ref != table))
 		set_fail(fail, table, "Sysctl already exists");
 }
@@ -1465,7 +1468,7 @@ static void sysctl_check_bin_path(struct ctl_table *table, const char **fail)
 	}
 }
 
-int sysctl_check_table(struct ctl_table *table)
+int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
 {
 	int error = 0;
 	for (; table->ctl_name || table->procname; table++) {
@@ -1495,7 +1498,7 @@ int sysctl_check_table(struct ctl_table *table)
 				set_fail(&fail, table, "Directory with extra1");
 			if (table->extra2)
 				set_fail(&fail, table, "Directory with extra2");
-			if (sysctl_check_dir(table))
+			if (sysctl_check_dir(namespaces, table))
 				set_fail(&fail, table, "Inconsistent directory names");
 		} else {
 			if ((table->strategy == sysctl_data) ||
@@ -1544,7 +1547,7 @@ int sysctl_check_table(struct ctl_table *table)
 			if (!table->procname && table->proc_handler)
 				set_fail(&fail, table, "proc_handler without procname");
 #endif
-			sysctl_check_leaf(table, &fail);
+			sysctl_check_leaf(namespaces, table, &fail);
 		}
 		sysctl_check_bin_path(table, &fail);
 		if (fail) {
@@ -1552,7 +1555,7 @@ int sysctl_check_table(struct ctl_table *table)
 			error = -EINVAL;
 		}
 		if (table->child)
-			error |= sysctl_check_table(table->child);
+			error |= sysctl_check_table(namespaces, table->child);
 	}
 	return error;
 }
-- 
1.5.3.rc6.17.g1911



* [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure
  2007-11-29 17:51       ` [PATCH 3/4] sysctl: Infrastructure for per namespace sysctls Eric W. Biederman
@ 2007-11-29 17:53         ` Eric W. Biederman
  2007-11-30 16:18           ` Serge E. Hallyn
  0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2007-11-29 17:53 UTC (permalink / raw)
  To: Herbert Xu, Andrew Morton
  Cc: Serge Hallyn, Daniel Lezcano, Cedric Le Goater, Linux Containers,
	Pavel Emelyanov, netdev, linux-kernel, David Miller


The user interface is register_net_sysctl_table and
unregister_net_sysctl_table, which is very much like the current
interface except that there is a network namespace parameter.

With this, any sysctl registered with register_net_sysctl_table
will only show up for tasks in the same network namespace.

All other sysctls continue to be globally visible.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 include/net/net_namespace.h |    9 +++++++
 net/sysctl_net.c            |   57 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 4d0d634..235214c 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -25,6 +25,8 @@ struct net {
 	struct proc_dir_entry 	*proc_net_stat;
 	struct proc_dir_entry 	*proc_net_root;
 
+	struct list_head	sysctl_table_headers;
+
 	struct net_device       *loopback_dev;          /* The loopback */
 
 	struct list_head 	dev_base_head;
@@ -144,4 +146,11 @@ extern void unregister_pernet_subsys(struct pernet_operations *);
 extern int register_pernet_device(struct pernet_operations *);
 extern void unregister_pernet_device(struct pernet_operations *);
 
+struct ctl_path;
+struct ctl_table;
+struct ctl_table_header;
+extern struct ctl_table_header *register_net_sysctl_table(struct net *net,
+	const struct ctl_path *path, struct ctl_table *table);
+extern void unregister_net_sysctl_table(struct ctl_table_header *header);
+
 #endif /* __NET_NET_NAMESPACE_H */
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index cd4eafb..c50c793 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -14,6 +14,7 @@
 
 #include <linux/mm.h>
 #include <linux/sysctl.h>
+#include <linux/nsproxy.h>
 
 #include <net/sock.h>
 
@@ -54,3 +55,59 @@ struct ctl_table net_table[] = {
 #endif
 	{ 0 },
 };
+
+static struct list_head *
+net_ctl_header_lookup(struct ctl_table_root *root, struct nsproxy *namespaces)
+{
+	return &namespaces->net_ns->sysctl_table_headers;
+}
+
+static struct ctl_table_root net_sysctl_root = {
+	.lookup = net_ctl_header_lookup,
+};
+
+static int sysctl_net_init(struct net *net)
+{
+	INIT_LIST_HEAD(&net->sysctl_table_headers);
+	return 0;
+}
+
+static void sysctl_net_exit(struct net *net)
+{
+	WARN_ON(!list_empty(&net->sysctl_table_headers));
+	return;
+}
+
+static struct pernet_operations sysctl_pernet_ops = {
+	.init = sysctl_net_init,
+	.exit = sysctl_net_exit,
+};
+
+static __init int sysctl_init(void)
+{
+	int ret;
+	ret = register_pernet_subsys(&sysctl_pernet_ops);
+	if (ret)
+		goto out;
+	register_sysctl_root(&net_sysctl_root);
+out:
+	return ret;
+}
+subsys_initcall(sysctl_init);
+
+struct ctl_table_header *register_net_sysctl_table(struct net *net,
+	const struct ctl_path *path, struct ctl_table *table)
+{
+	struct nsproxy namespaces;
+	namespaces = *current->nsproxy;
+	namespaces.net_ns = net;
+	return __register_sysctl_paths(&net_sysctl_root,
+					&namespaces, path, table);
+}
+EXPORT_SYMBOL_GPL(register_net_sysctl_table);
+
+void unregister_net_sysctl_table(struct ctl_table_header *header)
+{
+	return unregister_sysctl_table(header);
+}
+EXPORT_SYMBOL_GPL(unregister_net_sysctl_table);
-- 
1.5.3.rc6.17.g1911



* Re: [PATCH 0/4] Sysctl namespace support
  2007-11-29 17:40 ` [PATCH 0/4] Sysctl namespace support Eric W. Biederman
  2007-11-29 17:45   ` [PATCH 1/4] sysctl: Add register_sysctl_paths function Eric W. Biederman
@ 2007-11-30 12:56   ` Herbert Xu
  2007-11-30 13:25     ` Eric W. Biederman
  1 sibling, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2007-11-30 12:56 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Serge Hallyn, Daniel Lezcano, Cedric Le Goater,
	Linux Containers, Pavel Emelyanov, netdev, linux-kernel,
	David Miller

On Thu, Nov 29, 2007 at 10:40:24AM -0700, Eric W. Biederman wrote:
> 
> Herbert we need this infrastructure most in net-2.6.25 (as not having
> it is a current bottleneck to further development of the network
> namespace) so these patches are against net-2.6.25.

I've applied them all to net-2.6.25 with Andrew's fixes included.
Thanks Eric.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/4] Sysctl namespace support
  2007-11-30 12:56   ` [PATCH 0/4] Sysctl namespace support Herbert Xu
@ 2007-11-30 13:25     ` Eric W. Biederman
  0 siblings, 0 replies; 11+ messages in thread
From: Eric W. Biederman @ 2007-11-30 13:25 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Andrew Morton, Serge Hallyn, Daniel Lezcano, Cedric Le Goater,
	Linux Containers, Pavel Emelyanov, netdev, linux-kernel,
	David Miller

Herbert Xu <herbert@gondor.apana.org.au> writes:

> On Thu, Nov 29, 2007 at 10:40:24AM -0700, Eric W. Biederman wrote:
>> 
>> Herbert we need this infrastructure most in net-2.6.25 (as not having
>> it is a current bottleneck to further development of the network
>> namespace) so these patches are against net-2.6.25.
>
> I've applied them all to net-2.6.25 with Andrew's fixes included.
> Thanks Eric.

Welcome, and thanks.

I will see about taking advantage of this shortly.

Eric


* Re: [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure
  2007-11-29 17:53         ` [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure Eric W. Biederman
@ 2007-11-30 16:18           ` Serge E. Hallyn
  2007-11-30 16:23             ` Pavel Emelyanov
  2007-11-30 21:49             ` Eric W. Biederman
  0 siblings, 2 replies; 11+ messages in thread
From: Serge E. Hallyn @ 2007-11-30 16:18 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Herbert Xu, Andrew Morton, Serge Hallyn, Daniel Lezcano,
	Cedric Le Goater, Linux Containers, Pavel Emelyanov, netdev,
	linux-kernel, David Miller

Quoting Eric W. Biederman (ebiederm@xmission.com):
> 
> The user interface is: register_net_sysctl_table and
> unregister_net_sysctl_table.  Very much like the current
> interface except there is a network namespace parameter.
> 
> With this any sysctl registered with register_net_sysctl_table
> will only show up to tasks in the same network namespace.
> 
> All other sysctls continue to be globally visible.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> ---
>  include/net/net_namespace.h |    9 +++++++
>  net/sysctl_net.c            |   57 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 66 insertions(+), 0 deletions(-)
> 
> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index 4d0d634..235214c 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -25,6 +25,8 @@ struct net {
>  	struct proc_dir_entry 	*proc_net_stat;
>  	struct proc_dir_entry 	*proc_net_root;
> 
> +	struct list_head	sysctl_table_headers;
> +
>  	struct net_device       *loopback_dev;          /* The loopback */
> 
>  	struct list_head 	dev_base_head;
> @@ -144,4 +146,11 @@ extern void unregister_pernet_subsys(struct pernet_operations *);
>  extern int register_pernet_device(struct pernet_operations *);
>  extern void unregister_pernet_device(struct pernet_operations *);
> 
> +struct ctl_path;
> +struct ctl_table;
> +struct ctl_table_header;
> +extern struct ctl_table_header *register_net_sysctl_table(struct net *net,
> +	const struct ctl_path *path, struct ctl_table *table);
> +extern void unregister_net_sysctl_table(struct ctl_table_header *header);
> +
>  #endif /* __NET_NET_NAMESPACE_H */
> diff --git a/net/sysctl_net.c b/net/sysctl_net.c
> index cd4eafb..c50c793 100644
> --- a/net/sysctl_net.c
> +++ b/net/sysctl_net.c
> @@ -14,6 +14,7 @@
> 
>  #include <linux/mm.h>
>  #include <linux/sysctl.h>
> +#include <linux/nsproxy.h>
> 
>  #include <net/sock.h>
> 
> @@ -54,3 +55,59 @@ struct ctl_table net_table[] = {
>  #endif
>  	{ 0 },
>  };
> +
> +static struct list_head *
> +net_ctl_header_lookup(struct ctl_table_root *root, struct nsproxy *namespaces)
> +{
> +	return &namespaces->net_ns->sysctl_table_headers;
> +}
> +
> +static struct ctl_table_root net_sysctl_root = {
> +	.lookup = net_ctl_header_lookup,
> +};
> +
> +static int sysctl_net_init(struct net *net)
> +{
> +	INIT_LIST_HEAD(&net->sysctl_table_headers);
> +	return 0;
> +}
> +
> +static void sysctl_net_exit(struct net *net)
> +{
> +	WARN_ON(!list_empty(&net->sysctl_table_headers));
> +	return;
> +}
> +
> +static struct pernet_operations sysctl_pernet_ops = {
> +	.init = sysctl_net_init,
> +	.exit = sysctl_net_exit,
> +};
> +
> +static __init int sysctl_init(void)
> +{
> +	int ret;
> +	ret = register_pernet_subsys(&sysctl_pernet_ops);
> +	if (ret)
> +		goto out;
> +	register_sysctl_root(&net_sysctl_root);
> +out:
> +	return ret;
> +}
> +subsys_initcall(sysctl_init);
> +
> +struct ctl_table_header *register_net_sysctl_table(struct net *net,
> +	const struct ctl_path *path, struct ctl_table *table)
> +{
> +	struct nsproxy namespaces;
> +	namespaces = *current->nsproxy;
> +	namespaces.net_ns = net;
> +	return __register_sysctl_paths(&net_sysctl_root,
> +					&namespaces, path, table);

Hey Eric,

the patches look nice.

The hand-forcing of the passed-in net_ns into a copy of current->nsproxy
does make it seem like nsproxy may not be the best choice of what to
pass in.  Doesn't only net_sysctl_root->lookup() look at the argument?

But I assume you don't want to be more general than sending in a
nsproxy so as to dissuade abuse of this interface for needlessly complex
sysctl interfaces?

(Well I expect that'll become clear once the patches using this
come out.)

Are you planning to use this infrastructure for the uts and ipc
sysctls as well?

thanks,
-serge

> +}
> +EXPORT_SYMBOL_GPL(register_net_sysctl_table);
> +
> +void unregister_net_sysctl_table(struct ctl_table_header *header)
> +{
> +	return unregister_sysctl_table(header);
> +}
> +EXPORT_SYMBOL_GPL(unregister_net_sysctl_table);
> -- 
> 1.5.3.rc6.17.g1911


* Re: [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure
  2007-11-30 16:18           ` Serge E. Hallyn
@ 2007-11-30 16:23             ` Pavel Emelyanov
  2007-11-30 21:49             ` Eric W. Biederman
  1 sibling, 0 replies; 11+ messages in thread
From: Pavel Emelyanov @ 2007-11-30 16:23 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Eric W. Biederman, Herbert Xu, Andrew Morton, Daniel Lezcano,
	Cedric Le Goater, Linux Containers, netdev, linux-kernel,
	David Miller

[snip]

>> +					&namespaces, path, table);
> 
> Hey Eric,
> 
> the patches look nice.

Agree ;)

> The hand-forcing of the passed-in net_ns into a copy of current->nsproxy
> does make it seem like nsproxy may not be the best choice of what to
> pass in.  Doesn't only net_sysctl_root->lookup() look at the argument?
> 
> But I assume you don't want to be more general than sending in a
> nsproxy so as to dissuade abuse of this interface for needlessly complex
> sysctl interfaces?
> 
> (Well I expect that'll become clear once the patches using this
> come out.)
> 
> Are you planning to use this infrastructure for the uts and ipc
> sysctls as well?

I have sent some patches concerning uts and ipc already.
I'd appreciate any feedback on it :)

> thanks,
> -serge

Thanks,
Pavel


* Re: [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure
  2007-11-30 16:18           ` Serge E. Hallyn
  2007-11-30 16:23             ` Pavel Emelyanov
@ 2007-11-30 21:49             ` Eric W. Biederman
  2007-12-01  0:01               ` Serge E. Hallyn
  1 sibling, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2007-11-30 21:49 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Herbert Xu, Andrew Morton, Daniel Lezcano, Cedric Le Goater,
	Linux Containers, Pavel Emelyanov, netdev, linux-kernel,
	David Miller

"Serge E. Hallyn" <serue@us.ibm.com> writes:

>
> Hey Eric,
>
> the patches look nice.
>
> The hand-forcing of the passed-in net_ns into a copy of current->nsproxy
> does make it seem like nsproxy may not be the best choice of what to
> pass in.  Doesn't only net_sysctl_root->lookup() look at the argument?

Yes.  Although I call it from __register_sysctl_paths.

> But I assume you don't want to be more general than sending in a
> nsproxy so as to dissuade abuse of this interface for needlessly complex
> sysctl interfaces?

A bit of that.  I would love to pass in a task_struct so you can use
anything from a task.  The trouble is I don't have any task_structs or
nsproxys with the proper value at the point where I am first setting
this up.  Further I have to have the full sysctl lookup working or I
could not call sysctl_check.

> (Well I expect that'll become clear once the patches using this
> come out.)
>
> Are you planning to use this infrastructure for the uts and ipc
> sysctls as well?

Yes.  Where it comes in especially useful is that I can move /proc/sys
to /proc/sys/<tgid>/task/<pid>/sys and get a particular process's
view of sysctl.

We also get a little more reuse of common functions.

Otherwise Pavel does have a point that using this for uts and ipc
does not save any lines of code.

After having seen Pavel's changes I am asking myself if there is a sane
way to remove the ctl_name argument from the ctl_path.

Anyway, where I am with the nsproxy question is that I don't
see an easy way to do better.  What I have works and gets the job
done, and doesn't have any module unload races or holes where a sloppy
programmer can mess up the sysctl tree.  We needed a solution.
Trying any harder to find something better would take ages.  So
I figured this implementation was good enough.

Eric
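
The nsproxy handling discussed above can be sketched in userspace C.
This is only an illustration, assuming simplified stand-ins for the
kernel types: struct list_head, struct net, struct nsproxy and
struct ctl_table_root are cut down to the fields the patch touches,
and headers_for_net() and self_check() are hypothetical helpers that
are not part of the patch.  It shows the one step Serge asked about:
copying the caller's nsproxy, overriding net_ns, and letting the
root's lookup() pick the per-namespace header list.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumptions, not kernel code). */
struct list_head { struct list_head *next, *prev; };

struct net { struct list_head sysctl_table_headers; };

struct nsproxy {
	void *uts_ns;		/* placeholder for the other namespaces */
	struct net *net_ns;
};

struct ctl_table_root {
	struct list_head *(*lookup)(struct ctl_table_root *root,
				    struct nsproxy *namespaces);
};

/* Same shape as the patch: only net_ns is examined. */
static struct list_head *
net_ctl_header_lookup(struct ctl_table_root *root, struct nsproxy *namespaces)
{
	(void)root;
	return &namespaces->net_ns->sysctl_table_headers;
}

static struct ctl_table_root net_sysctl_root = {
	.lookup = net_ctl_header_lookup,
};

/*
 * Hypothetical helper modelling the first step of
 * register_net_sysctl_table(): work on a private copy of the caller's
 * nsproxy so that the caller's own namespaces are left untouched.
 */
static struct list_head *
headers_for_net(const struct nsproxy *current_ns, struct net *net)
{
	struct nsproxy namespaces = *current_ns;

	namespaces.net_ns = net;
	return net_sysctl_root.lookup(&net_sysctl_root, &namespaces);
}

/* Hypothetical self-check: the lookup follows 'net', not the caller. */
static void self_check(void)
{
	struct net init_net, other_net;
	struct nsproxy cur = { .uts_ns = NULL, .net_ns = &init_net };

	assert(headers_for_net(&cur, &other_net) ==
	       &other_net.sysctl_table_headers);
	assert(cur.net_ns == &init_net);
}
```

Because the copy lives on the stack, nothing is needed from a
task_struct or an existing nsproxy that already points at the target
network namespace, which matches the constraint Eric describes.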


* Re: [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure
  2007-11-30 21:49             ` Eric W. Biederman
@ 2007-12-01  0:01               ` Serge E. Hallyn
  0 siblings, 0 replies; 11+ messages in thread
From: Serge E. Hallyn @ 2007-12-01  0:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, Herbert Xu, Andrew Morton, Daniel Lezcano,
	Cedric Le Goater, Linux Containers, Pavel Emelyanov, netdev,
	linux-kernel, David Miller

Quoting Eric W. Biederman (ebiederm@xmission.com):
> "Serge E. Hallyn" <serue@us.ibm.com> writes:
> 
> >
> > Hey Eric,
> >
> > the patches look nice.
> >
> > The hand-forcing of the passed-in net_ns into a copy of current->nsproxy
> > does make it seem like nsproxy may not be the best choice of what to
> > pass in.  Doesn't only net_sysctl_root->lookup() look at the argument?
> 
> Yes.  Although I call it from __register_sysctl_paths.
> 
> > But I assume you don't want to be more general than sending in a
> > nsproxy so as to dissuade abuse of this interface for needlessly complex
> > sysctl interfaces?
> 
> A bit of that.  I would love to pass in a task_struct so you can use
> anything from a task.  The trouble is I don't have any task_structs or
> nsproxys with the proper value at the point where I am first setting
> this up.  Further I have to have the full sysctl lookup working or I
> could not call sysctl_check.
> 
> > (Well I expect that'll become clear once the patches using this
> > come out.)
> >
> > Are you planning to use this infrastructure for the uts and ipc
> > sysctls as well?
> 
> Yes.  Where it comes in especially useful is that I can move /proc/sys
> to /proc/sys/<tgid>/task/<pid>/sys and get a particular process's
> view of sysctl.
> 
> We also get a little more reuse of common functions.
> 
> Otherwise Pavel does have a point that using this for uts and ipc
> does not save any lines of code.
> 
> After having seen Pavel's changes I am asking myself if there is a sane
> way to remove the ctl_name argument from the ctl_path.
> 
> Anyway, where I am with the nsproxy question is that I don't
> see an easy way to do better.  What I have works and gets the job
> done, and doesn't have any module unload races or holes where a sloppy
> programmer can mess up the sysctl tree.  We needed a solution.
> Trying any harder to find something better would take ages.  So
> I figured this implementation was good enough.

I agree.  So it's already in -mm but still

Acked-by: Serge Hallyn <serue@us.ibm.com>

thanks,
-serge



Thread overview: 11+ messages
     [not found] <4742C73C.3010904@openvz.org>
2007-11-29 17:40 ` [PATCH 0/4] Sysctl namespace support Eric W. Biederman
2007-11-29 17:45   ` [PATCH 1/4] sysctl: Add register_sysctl_paths function Eric W. Biederman
2007-11-29 17:46     ` [PATCH 2/4] sysctl: Remember the ctl_table we passed to register_sysctl_paths Eric W. Biederman
2007-11-29 17:51       ` [PATCH 3/4] sysctl: Infrastructure for per namespace sysctls Eric W. Biederman
2007-11-29 17:53         ` [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure Eric W. Biederman
2007-11-30 16:18           ` Serge E. Hallyn
2007-11-30 16:23             ` Pavel Emelyanov
2007-11-30 21:49             ` Eric W. Biederman
2007-12-01  0:01               ` Serge E. Hallyn
2007-11-30 12:56   ` [PATCH 0/4] Sysctl namespace support Herbert Xu
2007-11-30 13:25     ` Eric W. Biederman
