* [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore
@ 2017-11-17 18:27 Kirill Tkhai
  2017-11-17 18:27 ` [PATCH RFC 01/25] net: Assign net to net_namespace_list in setup_net() Kirill Tkhai
                   ` (25 more replies)
  0 siblings, 26 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:27 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Hi,

this is a continuation of the discussion from here:

https://lkml.org/lkml/2017/11/14/298

The plan has changed a little, so I'd be happy to hear
people's comments before I dive into all 400+ pernet subsys
and devices.

The patch set adds a pernet sys list ahead of subsys and device.
It is used for pernet_operations which may be executed in
parallel with any other pernet_operations methods. Also, some
high-priority ops are converted, in order of appearance (up to
those registered using postcore_initcall(), and some of those
registered using subsys_initcall()). The sequence in setup_net()
is the following:

1) execute all the callbacks from pernet_sys list
2) lock net_mutex
3) execute all the callbacks from pernet_subsys list
4) execute all the callbacks from pernet_device list
5) unlock net_mutex
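
The sequence above can be sketched in plain userspace C. This is only an
illustration under stated assumptions: setup_net_sketch(), struct op and
trace_add() are hypothetical names, the ops are assumed to be already
sorted as sys, subsys, device, and the lock is represented by LOCK/UNLOCK
tokens in a trace string rather than by a real mutex.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum sublist { SYS, SUBSYS, DEVICE };

/* Hypothetical stand-in for a registered pernet_operations entry. */
struct op {
	enum sublist kind;
	const char *name;
};

/* Append a token to the trace buffer, separated by single spaces. */
static void trace_add(char *trace, size_t size, const char *token)
{
	size_t used = strlen(trace);

	assert(used < size);
	snprintf(trace + used, size - used, "%s%s", used ? " " : "", token);
}

/*
 * Walk ops in list order. The (imaginary) net_mutex is taken right
 * before the first non-sys entry and dropped after the last entry,
 * so sys callbacks run without it.
 */
static void setup_net_sketch(const struct op *ops, size_t n,
			     char *trace, size_t size)
{
	int locked = 0;
	size_t i;

	assert(size > 0);
	trace[0] = '\0';
	for (i = 0; i < n; i++) {
		if (!locked && ops[i].kind != SYS) {
			trace_add(trace, size, "LOCK");
			locked = 1;
		}
		trace_add(trace, size, ops[i].name);
	}
	if (locked)
		trace_add(trace, size, "UNLOCK");
}
```

In this sketch the lock is never taken when only sys entries exist, which
is a simplification of the sequence listed above.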

So far no pernet_operations has required additional
synchronization, but I've bumped into another problem.
Some drivers may be compiled either as modules or as part
of the kernel image. They register pernet_operations from
device_initcall(), for example. When such a driver is a
module, its initcall executes at a different time than the
initcalls of built-in drivers.

Imagine we have a tristate driverA and a boolean driverB.
driverA registers pernet_subsys from subsys_initcall().
driverB registers pernet_subsys from fs_initcall().
So, here we have two cases:

driverA is module              driverA is built-in
--------------------           -------------------
register driverB ops           register driverA ops
register driverA ops           register driverB ops

So, the order is different. When converting drivers one by one,
it's impossible to keep the order correct for all .config
states, because of the above. So, bisect won't work.
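
The ordering difference can be reproduced with a small userspace model.
It is a sketch only: struct driver, effective_level() and
registration_order() are hypothetical names, and MODULE_LOAD is an
arbitrary large value standing in for "runs when the module is loaded,
after all built-in initcalls". The values 4 and 5 follow the built-in
levels of subsys_initcall() and fs_initcall().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum level { SUBSYS_INITCALL = 4, FS_INITCALL = 5, MODULE_LOAD = 100 };

struct driver {
	const char *name;
	enum level builtin_level;	/* level used when built in */
	int is_module;			/* 1 if configured as a module */
};

/* A module's init runs at load time, regardless of its initcall level. */
static int effective_level(const struct driver *d)
{
	return d->is_module ? MODULE_LOAD : d->builtin_level;
}

static int by_level(const void *a, const void *b)
{
	return effective_level(a) - effective_level(b);
}

/* Sort drivers into registration order and store their names in out[]. */
static void registration_order(struct driver *drv, size_t n, const char **out)
{
	size_t i;

	assert(drv && out);
	qsort(drv, n, sizeof(*drv), by_level);
	for (i = 0; i < n; i++)
		out[i] = drv[i].name;
}
```

With driverA built in, the model yields driverA then driverB; with
driverA as a module, it yields driverB then driverA, matching the table
above.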

And it seems it's just the same as converting pernet_operations
from all the files in alphabetical file order. What do you
think about this? (Note, the patches have no such problem
at the moment, as they all touch built-in early core drivers.)

Maybe there are other comments on the code.
---

Kirill Tkhai (25):
      net: Assign net to net_namespace_list in setup_net()
      net: Cleanup copy_net_ns()
      net: Introduce net_sem for protection of pernet_list
      net: Move mutex_unlock() in cleanup_net() up
      net: Add primitives to update heads of pernet_list sublists
      net: Add pernet sys and registration functions
      net: Make sys sublist pernet_operations executed out of net_mutex
      net: Move proc_net_ns_ops to pernet_sys list
      net: Move net_ns_ops to pernet_sys list
      net: Move sysctl_pernet_ops to pernet_sys list
      net: Move netfilter_net_ops to pernet_sys list
      net: Move nf_log_net_ops to pernet_sys list
      net: Move net_inuse_ops to pernet_sys list
      net: Move net_defaults_ops to pernet_sys list
      net: Move netlink_net_ops to pernet_sys list
      net: Move rtnetlink_net_ops to pernet_sys list
      net: Move audit_net_ops to pernet_sys list
      net: Move uevent_net_ops to pernet_sys list
      net: Move proto_net_ops to pernet_sys list
      net: Move pernet_subsys, registered via net_dev_init(), to pernet_sys list
      net: Move fib_* pernet_operations, registered via subsys_initcall(), to pernet_sys list
      net: Move subsys_initcall() registered pernet_operations from net/sched to pernet_sys list
      net: Move genl_pernet_ops to pernet_sys list
      net: Move wext_pernet_ops to pernet_sys list
      net: Move sysctl_core_ops to pernet_sys list


 fs/proc/proc_net.c          |    2 
 include/linux/rtnetlink.h   |    1 
 include/net/net_namespace.h |    2 
 kernel/audit.c              |    2 
 lib/kobject_uevent.c        |    2 
 net/core/dev.c              |    2 
 net/core/fib_notifier.c     |    2 
 net/core/fib_rules.c        |    2 
 net/core/net-procfs.c       |    4 -
 net/core/net_namespace.c    |  203 +++++++++++++++++++++++++++++++++----------
 net/core/rtnetlink.c        |    6 +
 net/core/sock.c             |    4 -
 net/core/sysctl_net_core.c  |    2 
 net/netfilter/core.c        |    2 
 net/netfilter/nf_log.c      |    2 
 net/netlink/af_netlink.c    |    2 
 net/netlink/genetlink.c     |    2 
 net/sched/act_api.c         |    2 
 net/sched/sch_api.c         |    2 
 net/sysctl_net.c            |    2 
 net/wireless/wext-core.c    |    2 
 21 files changed, 183 insertions(+), 67 deletions(-)

--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>


* [PATCH RFC 01/25] net: Assign net to net_namespace_list in setup_net()
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
@ 2017-11-17 18:27 ` Kirill Tkhai
  2017-11-17 18:27 ` [PATCH RFC 02/25] net: Cleanup copy_net_ns() Kirill Tkhai
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:27 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch merges two repeated pieces of code into one,
which now lives in setup_net().

It acts as a cleanup, even though the init_net_initialized
assignment is now reordered with the linking of net.
This variable is needed by proc_net_init(), called from:

start_kernel()->proc_root_init()->proc_net_init(),

which can't race with net_ns_init(), called from
an initcall.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |   13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index b797832565d3..7ecf71050ffa 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -296,6 +296,9 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 		if (error < 0)
 			goto out_undo;
 	}
+	rtnl_lock();
+	list_add_tail_rcu(&net->list, &net_namespace_list);
+	rtnl_unlock();
 out:
 	return error;
 
@@ -417,11 +420,6 @@ struct net *copy_net_ns(unsigned long flags,
 
 	net->ucounts = ucounts;
 	rv = setup_net(net, user_ns);
-	if (rv == 0) {
-		rtnl_lock();
-		list_add_tail_rcu(&net->list, &net_namespace_list);
-		rtnl_unlock();
-	}
 	mutex_unlock(&net_mutex);
 	if (rv < 0) {
 		dec_net_namespaces(ucounts);
@@ -847,11 +845,6 @@ static int __init net_ns_init(void)
 		panic("Could not setup the initial network namespace");
 
 	init_net_initialized = true;
-
-	rtnl_lock();
-	list_add_tail_rcu(&init_net.list, &net_namespace_list);
-	rtnl_unlock();
-
 	mutex_unlock(&net_mutex);
 
 	register_pernet_subsys(&net_ns_ops);


* [PATCH RFC 02/25] net: Cleanup copy_net_ns()
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
  2017-11-17 18:27 ` [PATCH RFC 01/25] net: Assign net to net_namespace_list in setup_net() Kirill Tkhai
@ 2017-11-17 18:27 ` Kirill Tkhai
  2017-11-17 18:27 ` [PATCH RFC 03/25] net: Introduce net_sem for protection of pernet_list Kirill Tkhai
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:27 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Line up the destructor actions in the reverse order
of the constructors. The next patches will add more actions,
and having this order in place will make that easier.
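
For readers less familiar with the idiom, here is a generic userspace
sketch of reverse-order unwinding with goto labels. It is not the code
from this patch: acquire(), release(), build() and log_buf are
hypothetical names, and the "resources" are just strings written to a
log so that the order of actions is visible.

```c
#include <assert.h>
#include <string.h>

static char log_buf[64];

/* Record an action; the assert guards the fixed-size log. */
static void note(const char *s)
{
	assert(strlen(log_buf) + strlen(s) < sizeof(log_buf));
	strcat(log_buf, s);
}

/* Pretend to construct a resource; fail on request. */
static int acquire(const char *name, int fail)
{
	if (fail)
		return -1;
	note("+");
	note(name);
	return 0;
}

static void release(const char *name)
{
	note("-");
	note(name);
}

/* Construct A, B, C in order; on failure undo in reverse order. */
static int build(int fail_at)
{
	int rv;

	log_buf[0] = '\0';
	rv = acquire("A", fail_at == 1);
	if (rv < 0)
		goto out;
	rv = acquire("B", fail_at == 2);
	if (rv < 0)
		goto undo_a;
	rv = acquire("C", fail_at == 3);
	if (rv < 0)
		goto undo_b;
	return 0;

undo_b:
	release("B");
undo_a:
	release("A");
out:
	return rv;
}
```

Because each label undoes exactly one earlier step and falls through to
the next, adding a new constructor step only requires one new label at
the matching position.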

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |   20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 7ecf71050ffa..2e512965bf42 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -404,27 +404,25 @@ struct net *copy_net_ns(unsigned long flags,
 
 	net = net_alloc();
 	if (!net) {
-		dec_net_namespaces(ucounts);
-		return ERR_PTR(-ENOMEM);
+		rv = -ENOMEM;
+		goto dec_ucounts;
 	}
-
+	refcount_set(&net->passive, 1);
+	net->ucounts = ucounts;
 	get_user_ns(user_ns);
 
 	rv = mutex_lock_killable(&net_mutex);
-	if (rv < 0) {
-		net_free(net);
-		dec_net_namespaces(ucounts);
-		put_user_ns(user_ns);
-		return ERR_PTR(rv);
-	}
+	if (rv < 0)
+		goto put_userns;
 
-	net->ucounts = ucounts;
 	rv = setup_net(net, user_ns);
 	mutex_unlock(&net_mutex);
 	if (rv < 0) {
-		dec_net_namespaces(ucounts);
+put_userns:
 		put_user_ns(user_ns);
 		net_drop_ns(net);
+dec_ucounts:
+		dec_net_namespaces(ucounts);
 		return ERR_PTR(rv);
 	}
 	return net;


* [PATCH RFC 03/25] net: Introduce net_sem for protection of pernet_list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
  2017-11-17 18:27 ` [PATCH RFC 01/25] net: Assign net to net_namespace_list in setup_net() Kirill Tkhai
  2017-11-17 18:27 ` [PATCH RFC 02/25] net: Cleanup copy_net_ns() Kirill Tkhai
@ 2017-11-17 18:27 ` Kirill Tkhai
  2017-11-17 18:27 ` [PATCH RFC 04/25] net: Move mutex_unlock() in cleanup_net() up Kirill Tkhai
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:27 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Currently a mutex is used to protect the pernet operations list. It makes
cleanup_net() execute the ->exit methods of the same set of operations
that was used at the time of ->init, even after the net namespace is
unlinked from net_namespace_list.

But the problem is that synchronize_rcu() is needed after net is removed
from net_namespace_list:

Destroy net_ns:
cleanup_net()
  mutex_lock(&net_mutex)
  list_del_rcu(&net->list)
  synchronize_rcu()                                  <--- Sleep there for ages
  list_for_each_entry_reverse(ops, &pernet_list, list)
    ops_exit_list(ops, &net_exit_list)
  list_for_each_entry_reverse(ops, &pernet_list, list)
    ops_free_list(ops, &net_exit_list)
  mutex_unlock(&net_mutex)

This primitive is not fast, especially on systems with many processors
and/or when preemptible RCU is enabled in the config. So, for the whole time
cleanup_net() is waiting for the RCU grace period, creation of new net
namespaces is not possible; the tasks doing it sleep on the same mutex:

Create net_ns:
copy_net_ns()
  mutex_lock_killable(&net_mutex)                    <--- Sleep there for ages

I observed 20-30 second hangs of "unshare -n" on an ordinary 8-cpu laptop
with preemptible RCU enabled.

The solution is to convert net_mutex to an rw_semaphore and add small locks
to the really small number of pernet_operations that actually need them. Then
pernet_operations::init/::exit methods, modifying the net-related data,
will require down_read() locking only, while down_write() will be used
for changing pernet_list.
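
The admission rules this relies on can be shown with a toy model. This is
not the kernel rw_semaphore and is not thread-safe; struct toy_rwsem and
the toy_* functions are hypothetical names that only encode who may hold
the lock at the same time: any number of readers (namespace setup and
cleanup), or exactly one writer (pernet_list changes).

```c
#include <assert.h>
#include <stdbool.h>

struct toy_rwsem {
	int readers;	/* number of holders in read mode */
	bool writer;	/* true while held in write mode */
};

/* Readers are admitted unless a writer holds the lock. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

/* A writer is admitted only when nobody else holds the lock. */
static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

static void toy_up_write(struct toy_rwsem *sem)
{
	assert(sem->writer);
	sem->writer = false;
}
```

In terms of this series, two tasks creating namespaces would both pass
the read-side check, while a register_pernet_*() caller would have to
wait until both are done.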

This gives a significant performance increase, as you may see here:
https://www.spinics.net/lists/netdev/msg467095.html

It's a 4.6 times performance increase in a single-threaded test.
The increase in multi-threaded tests may be close to 4.6 multiplied
by the number of threads.

This patch starts replacing net_mutex with net_sem. It adds the rw_semaphore,
describes the variables it protects, and uses it where appropriate.
net_mutex is still present, and the next patches will kick it out step by step.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/linux/rtnetlink.h |    1 +
 net/core/net_namespace.c  |   37 +++++++++++++++++++++++++------------
 net/core/rtnetlink.c      |    4 ++--
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 2032ce2eb20b..f640fc87fe1d 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -35,6 +35,7 @@ extern int rtnl_is_locked(void);
 
 extern wait_queue_head_t netdev_unregistering_wq;
 extern struct mutex net_mutex;
+extern struct rw_semaphore net_sem;
 
 #ifdef CONFIG_PROVE_LOCKING
 extern bool lockdep_rtnl_is_held(void);
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 2e512965bf42..2254b1639209 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -41,6 +41,11 @@ struct net init_net = {
 EXPORT_SYMBOL(init_net);
 
 static bool init_net_initialized;
+/*
+ * net_sem: protects: pernet_list, net_generic_ids,
+ * init_net_initialized and first_* pointers.
+ */
+DECLARE_RWSEM(net_sem);
 
 #define MIN_PERNET_OPS_ID	\
 	((sizeof(struct net_generic) + sizeof(void *) - 1) / sizeof(void *))
@@ -411,12 +416,16 @@ struct net *copy_net_ns(unsigned long flags,
 	net->ucounts = ucounts;
 	get_user_ns(user_ns);
 
-	rv = mutex_lock_killable(&net_mutex);
+	rv = down_read_killable(&net_sem);
 	if (rv < 0)
 		goto put_userns;
-
+	rv = mutex_lock_killable(&net_mutex);
+	if (rv < 0)
+		goto up_read;
 	rv = setup_net(net, user_ns);
 	mutex_unlock(&net_mutex);
+up_read:
+	up_read(&net_sem);
 	if (rv < 0) {
 put_userns:
 		put_user_ns(user_ns);
@@ -443,6 +452,7 @@ static void cleanup_net(struct work_struct *work)
 	list_replace_init(&cleanup_list, &net_kill_list);
 	spin_unlock_irq(&cleanup_list_lock);
 
+	down_read(&net_sem);
 	mutex_lock(&net_mutex);
 
 	/* Don't let anyone else find us. */
@@ -484,6 +494,7 @@ static void cleanup_net(struct work_struct *work)
 		ops_free_list(ops, &net_exit_list);
 
 	mutex_unlock(&net_mutex);
+	up_read(&net_sem);
 
 	/* Ensure there are no outstanding rcu callbacks using this
 	 * network namespace.
@@ -510,8 +521,10 @@ static void cleanup_net(struct work_struct *work)
  */
 void net_ns_barrier(void)
 {
+	down_write(&net_sem);
 	mutex_lock(&net_mutex);
 	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 }
 EXPORT_SYMBOL(net_ns_barrier);
 
@@ -838,12 +851,12 @@ static int __init net_ns_init(void)
 
 	rcu_assign_pointer(init_net.gen, ng);
 
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	if (setup_net(&init_net, &init_user_ns))
 		panic("Could not setup the initial network namespace");
 
 	init_net_initialized = true;
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 
 	register_pernet_subsys(&net_ns_ops);
 
@@ -983,9 +996,9 @@ static void unregister_pernet_operations(struct pernet_operations *ops)
 int register_pernet_subsys(struct pernet_operations *ops)
 {
 	int error;
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	error =  register_pernet_operations(first_device, ops);
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 	return error;
 }
 EXPORT_SYMBOL_GPL(register_pernet_subsys);
@@ -1001,9 +1014,9 @@ EXPORT_SYMBOL_GPL(register_pernet_subsys);
  */
 void unregister_pernet_subsys(struct pernet_operations *ops)
 {
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	unregister_pernet_operations(ops);
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 }
 EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
 
@@ -1029,11 +1042,11 @@ EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
 int register_pernet_device(struct pernet_operations *ops)
 {
 	int error;
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	error = register_pernet_operations(&pernet_list, ops);
 	if (!error && (first_device == &pernet_list))
 		first_device = &ops->list;
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 	return error;
 }
 EXPORT_SYMBOL_GPL(register_pernet_device);
@@ -1049,11 +1062,11 @@ EXPORT_SYMBOL_GPL(register_pernet_device);
  */
 void unregister_pernet_device(struct pernet_operations *ops)
 {
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	if (&ops->list == first_device)
 		first_device = first_device->next;
 	unregister_pernet_operations(ops);
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 }
 EXPORT_SYMBOL_GPL(unregister_pernet_device);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index dabba2a91fc8..cb06d43c4230 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -390,11 +390,11 @@ static void rtnl_lock_unregistering_all(void)
 void rtnl_link_unregister(struct rtnl_link_ops *ops)
 {
 	/* Close the race with cleanup_net() */
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	rtnl_lock_unregistering_all();
 	__rtnl_link_unregister(ops);
 	rtnl_unlock();
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 }
 EXPORT_SYMBOL_GPL(rtnl_link_unregister);
 


* [PATCH RFC 04/25] net: Move mutex_unlock() in cleanup_net() up
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (2 preceding siblings ...)
  2017-11-17 18:27 ` [PATCH RFC 03/25] net: Introduce net_sem for protection of pernet_list Kirill Tkhai
@ 2017-11-17 18:27 ` Kirill Tkhai
  2017-11-17 18:28 ` [PATCH RFC 05/25] net: Add primitives to update heads of pernet_list sublists Kirill Tkhai
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:27 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

net_sem protects pernet_list from changing, while
ops_free_list() does a simple kfree(), and it can't
race with other pernet_operations callbacks.

So we may release net_mutex earlier than before.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 2254b1639209..a8ea580885d9 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -489,11 +489,12 @@ static void cleanup_net(struct work_struct *work)
 	list_for_each_entry_reverse(ops, &pernet_list, list)
 		ops_exit_list(ops, &net_exit_list);
 
+	mutex_unlock(&net_mutex);
+
 	/* Free the net generic variables */
 	list_for_each_entry_reverse(ops, &pernet_list, list)
 		ops_free_list(ops, &net_exit_list);
 
-	mutex_unlock(&net_mutex);
 	up_read(&net_sem);
 
 	/* Ensure there are no outstanding rcu callbacks using this


* [PATCH RFC 05/25] net: Add primitives to update heads of pernet_list sublists
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (3 preceding siblings ...)
  2017-11-17 18:27 ` [PATCH RFC 04/25] net: Move mutex_unlock() in cleanup_net() up Kirill Tkhai
@ 2017-11-17 18:28 ` Kirill Tkhai
  2017-11-17 18:28 ` [PATCH RFC 06/25] net: Add pernet sys and registration functions Kirill Tkhai
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:28 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Currently we have first_device, and the device and subsys
sublists. The next patches introduce one more sublist.
So, move the functionality that would otherwise be repeated
into primitives.
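
A userspace sketch of the bookkeeping these primitives perform, for
illustration only. The minimal struct list_head, list_add_tail() and
list_del() below are simplified re-implementations, register_subsys(),
register_device() and unregister_device() are hypothetical wrappers
operating on bare list nodes, and the macros carry extra parentheses
compared with the ones in the diff.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Insert entry right before pos, as the kernel's list_add_tail() does. */
static void list_add_tail(struct list_head *entry, struct list_head *pos)
{
	entry->prev = pos->prev;
	entry->next = pos;
	pos->prev->next = entry;
	pos->prev = entry;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry->prev = NULL;
}

/* If the sublist was empty (first == delim), the new entry heads it. */
#define update_first_on_add(first, delim, added)	\
	do {						\
		if ((first) == (delim))			\
			(first) = (added);		\
	} while (0)

/* If the sublist head goes away, its successor becomes the head. */
#define update_first_on_del(first, to_delete)		\
	do {						\
		if ((first) == (to_delete))		\
			(first) = (to_delete)->next;	\
	} while (0)

static struct list_head pernet_list = { &pernet_list, &pernet_list };
static struct list_head *first_device = &pernet_list;

/* Subsys entries go right before the first device entry. */
static void register_subsys(struct list_head *ops)
{
	list_add_tail(ops, first_device);
}

/* Device entries go to the very end of pernet_list. */
static void register_device(struct list_head *ops)
{
	list_add_tail(ops, &pernet_list);
	update_first_on_add(first_device, &pernet_list, ops);
}

/* The head must be updated before the node is unlinked. */
static void unregister_device(struct list_head *ops)
{
	update_first_on_del(first_device, ops);
	list_del(ops);
}
```

Registering subsys and device entries in any interleaving keeps all
subsys entries ahead of all device entries, with first_device pointing at
the boundary or at &pernet_list when no device is registered.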

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |   19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index a8ea580885d9..1d9712973695 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -939,6 +939,18 @@ static void __unregister_pernet_operations(struct pernet_operations *ops)
 
 static DEFINE_IDA(net_generic_ids);
 
+#define update_first_on_add(first, delim, added) 	\
+	do {						\
+		if (first == delim)			\
+			first = added;			\
+	} while (0)
+
+#define update_first_on_del(first, to_delete)		\
+	do {						\
+		if (first == to_delete)			\
+			first = (to_delete)->next;	\
+	} while (0)
+
 static int register_pernet_operations(struct list_head *list,
 				      struct pernet_operations *ops)
 {
@@ -1045,8 +1057,8 @@ int register_pernet_device(struct pernet_operations *ops)
 	int error;
 	down_write(&net_sem);
 	error = register_pernet_operations(&pernet_list, ops);
-	if (!error && (first_device == &pernet_list))
-		first_device = &ops->list;
+	if (!error)
+		update_first_on_add(first_device, &pernet_list, &ops->list);
 	up_write(&net_sem);
 	return error;
 }
@@ -1064,8 +1076,7 @@ EXPORT_SYMBOL_GPL(register_pernet_device);
 void unregister_pernet_device(struct pernet_operations *ops)
 {
 	down_write(&net_sem);
-	if (&ops->list == first_device)
-		first_device = first_device->next;
+	update_first_on_del(first_device, &ops->list);
 	unregister_pernet_operations(ops);
 	up_write(&net_sem);
 }


* [PATCH RFC 06/25] net: Add pernet sys and registration functions
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (4 preceding siblings ...)
  2017-11-17 18:28 ` [PATCH RFC 05/25] net: Add primitives to update heads of pernet_list sublists Kirill Tkhai
@ 2017-11-17 18:28 ` Kirill Tkhai
  2017-11-17 18:28 ` [PATCH RFC 07/25] net: Make sys sublist pernet_operations executed out of net_mutex Kirill Tkhai
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:28 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This is a new sublist of pernet_list, which will live ahead
of the already existing ones:

sys, subsys, device.

It's aimed at subsystems whose pernet_operations may execute
in parallel with any other pernet_operations. Later,
step by step, we will move all subsys there, adding the necessary
small synchronization locks where needed. After all subsys
are moved to sys, we'll kill the subsys list, and all
current subsys will not require net_mutex and will be able
to init and exit in parallel with the others.

Then we'll add a dev sublist ahead of device, and will repeat
the cycle.

Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/net/net_namespace.h |    2 +
 net/core/net_namespace.c    |   75 ++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 10f99dafd5ac..2cde5f766ec6 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -324,6 +324,8 @@ struct pernet_operations {
  * device which caused kernel oops, and panics during network
  * namespace cleanup.   So please don't get this wrong.
  */
+int register_pernet_sys(struct pernet_operations *);
+void unregister_pernet_sys(struct pernet_operations *);
 int register_pernet_subsys(struct pernet_operations *);
 void unregister_pernet_subsys(struct pernet_operations *);
 int register_pernet_device(struct pernet_operations *);
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 1d9712973695..f4f4aaa5ce1f 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -24,10 +24,24 @@
 #include <net/netns/generic.h>
 
 /*
- *	Our network namespace constructor/destructor lists
+ * Our network namespace constructor/destructor lists
+ * one by one linked in pernet_list. They are (in order
+ * of linking): sys, subsys, device.
+ *
+ * The methods from sys for a network namespace may be
+ * called in parallel with any method from any list
+ * for another net namespace.
+ *
+ * The methods from subsys and device can't be called
+ * in parallel with a method from subsys or device.
+ *
+ * When all subsys pernet_operations are moved to sys
+ * sublist, we'll kill subsys sublist, and create dev
+ * ahead of device sublist, and repeat the cycle.
  */
 
 static LIST_HEAD(pernet_list);
+static struct list_head *first_subsys = &pernet_list;
 static struct list_head *first_device = &pernet_list;
 DEFINE_MUTEX(net_mutex);
 
@@ -987,6 +1001,57 @@ static void unregister_pernet_operations(struct pernet_operations *ops)
 		ida_remove(&net_generic_ids, *ops->id);
 }
 
+/**
+ *      register_pernet_sys - register a network namespace system
+ *	@ops:  pernet operations structure for the system
+ *
+ *	Register a subsystem which has init and exit functions
+ *	that are called when network namespaces are created and
+ *	destroyed respectively.
+ *
+ *	When registered all network namespace init functions are
+ *	called for every existing network namespace.  Allowing kernel
+ *	modules to have a race free view of the set of network namespaces.
+ *
+ *	When a new network namespace is created all of the init
+ *	methods are called in the order in which they were registered.
+ *
+ *	When a network namespace is destroyed all of the exit methods
+ *	are called in the reverse of the order with which they were
+ *	registered.
+ */
+int register_pernet_sys(struct pernet_operations *ops)
+{
+	int error;
+	down_write(&net_sem);
+	if (first_subsys != first_device) {
+		panic("Pernet %ps registered out of order.\n"
+		      "There is already %ps.\n", ops,
+		      list_entry(first_subsys, struct pernet_operations, list));
+	}
+	error =  register_pernet_operations(first_subsys, ops);
+	up_write(&net_sem);
+	return error;
+}
+EXPORT_SYMBOL_GPL(register_pernet_sys);
+
+/**
+ *      unregister_pernet_sys - unregister a network namespace system
+ *	@ops: pernet operations structure to manipulate
+ *
+ *	Remove the pernet operations structure from the list to be
+ *	used when network namespaces are created or destroyed.  In
+ *	addition run the exit method for all existing network
+ *	namespaces.
+ */
+void unregister_pernet_sys(struct pernet_operations *ops)
+{
+	down_write(&net_sem);
+	unregister_pernet_operations(ops);
+	up_write(&net_sem);
+}
+EXPORT_SYMBOL_GPL(unregister_pernet_sys);
+
 /**
  *      register_pernet_subsys - register a network namespace subsystem
  *	@ops:  pernet operations structure for the subsystem
@@ -1011,6 +1076,8 @@ int register_pernet_subsys(struct pernet_operations *ops)
 	int error;
 	down_write(&net_sem);
 	error =  register_pernet_operations(first_device, ops);
+	if (!error)
+		update_first_on_add(first_subsys, first_device, &ops->list);
 	up_write(&net_sem);
 	return error;
 }
@@ -1028,6 +1095,7 @@ EXPORT_SYMBOL_GPL(register_pernet_subsys);
 void unregister_pernet_subsys(struct pernet_operations *ops)
 {
 	down_write(&net_sem);
+	update_first_on_del(first_subsys, &ops->list);
 	unregister_pernet_operations(ops);
 	up_write(&net_sem);
 }
@@ -1057,8 +1125,10 @@ int register_pernet_device(struct pernet_operations *ops)
 	int error;
 	down_write(&net_sem);
 	error = register_pernet_operations(&pernet_list, ops);
-	if (!error)
+	if (!error) {
+		update_first_on_add(first_subsys, &pernet_list, &ops->list);
 		update_first_on_add(first_device, &pernet_list, &ops->list);
+	}
 	up_write(&net_sem);
 	return error;
 }
@@ -1076,6 +1146,7 @@ EXPORT_SYMBOL_GPL(register_pernet_device);
 void unregister_pernet_device(struct pernet_operations *ops)
 {
 	down_write(&net_sem);
+	update_first_on_del(first_subsys, &ops->list);
 	update_first_on_del(first_device, &ops->list);
 	unregister_pernet_operations(ops);
 	up_write(&net_sem);


* [PATCH RFC 07/25] net: Make sys sublist pernet_operations executed out of net_mutex
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (5 preceding siblings ...)
  2017-11-17 18:28 ` [PATCH RFC 06/25] net: Add pernet sys and registration functions Kirill Tkhai
@ 2017-11-17 18:28 ` Kirill Tkhai
  2017-11-17 18:28 ` [PATCH RFC 08/25] net: Move proc_net_ns_ops to pernet_sys list Kirill Tkhai
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:28 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Move the net_mutex locking into setup_net() and cleanup_net(), and
do not hold it while sys sublist methods are executed.
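
The teardown side can be sketched in userspace as follows. This is an
illustration under assumptions, not the kernel code: cleanup_net_sketch()
and trace_add() are hypothetical names, entries are plain strings in list
order, first_subsys is an array index (equal to n when only sys entries
are registered), and the mutex is represented by LOCK/UNLOCK tokens.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append a token to the trace buffer, separated by single spaces. */
static void trace_add(char *trace, size_t size, const char *token)
{
	size_t used = strlen(trace);

	assert(used < size);
	snprintf(trace + used, size - used, "%s%s", used ? " " : "", token);
}

/*
 * Run exit methods in reverse list order. The lock is taken up front
 * and dropped right after the entry at first_subsys has been handled,
 * so the remaining (sys) exit methods run without it. If the boundary
 * is never met, the lock is dropped after the loop.
 */
static void cleanup_net_sketch(const char **names, size_t n,
			       size_t first_subsys, char *trace, size_t size)
{
	int locked = 1;
	size_t i;

	assert(size > 0);
	trace[0] = '\0';
	trace_add(trace, size, "LOCK");
	for (i = n; i-- > 0; ) {
		trace_add(trace, size, names[i]);
		if (i == first_subsys) {
			trace_add(trace, size, "UNLOCK");
			locked = 0;
		}
	}
	if (locked)
		trace_add(trace, size, "UNLOCK");
}
```

The trace makes it visible that device and subsys exit methods stay under
the lock while sys exit methods run after it is released.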

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |   44 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index f4f4aaa5ce1f..7aec8c1afe50 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -84,11 +84,11 @@ static int net_assign_generic(struct net *net, unsigned int id, void *data)
 {
 	struct net_generic *ng, *old_ng;
 
-	BUG_ON(!mutex_is_locked(&net_mutex));
+	BUG_ON(!rwsem_is_locked(&net_sem));
 	BUG_ON(id < MIN_PERNET_OPS_ID);
 
 	old_ng = rcu_dereference_protected(net->gen,
-					   lockdep_is_held(&net_mutex));
+					   lockdep_is_held(&net_sem));
 	if (old_ng->s.len > id) {
 		old_ng->ptr[id] = data;
 		return 0;
@@ -300,6 +300,7 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 {
 	/* Must be called with net_mutex held */
 	const struct pernet_operations *ops, *saved_ops;
+	bool locked = false;
 	int error = 0;
 	LIST_HEAD(net_exit_list);
 
@@ -311,14 +312,34 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 	spin_lock_init(&net->nsid_lock);
 
 	list_for_each_entry(ops, &pernet_list, list) {
+		if (&ops->list == first_subsys) {
+			BUG_ON(locked);
+			error = mutex_lock_killable(&net_mutex);
+			if (error)
+				goto out_undo;
+			locked = true;
+		}
+
 		error = ops_init(ops, net);
 		if (error < 0)
 			goto out_undo;
 	}
+
+	if (!locked) {
+		/*
+		 * This may happen only on early boot, so we don't
+		 * care about possibility to interrupt the locking.
+		 */
+		mutex_lock(&net_mutex);
+		locked = true;
+	}
+
 	rtnl_lock();
 	list_add_tail_rcu(&net->list, &net_namespace_list);
 	rtnl_unlock();
 out:
+	if (locked)
+		mutex_unlock(&net_mutex);
 	return error;
 
 out_undo:
@@ -433,12 +454,7 @@ struct net *copy_net_ns(unsigned long flags,
 	rv = down_read_killable(&net_sem);
 	if (rv < 0)
 		goto put_userns;
-	rv = mutex_lock_killable(&net_mutex);
-	if (rv < 0)
-		goto up_read;
 	rv = setup_net(net, user_ns);
-	mutex_unlock(&net_mutex);
-up_read:
 	up_read(&net_sem);
 	if (rv < 0) {
 put_userns:
@@ -460,6 +476,7 @@ static void cleanup_net(struct work_struct *work)
 	struct net *net, *tmp;
 	struct list_head net_kill_list;
 	LIST_HEAD(net_exit_list);
+	bool locked;
 
 	/* Atomically snapshot the list of namespaces to cleanup */
 	spin_lock_irq(&cleanup_list_lock);
@@ -468,6 +485,7 @@ static void cleanup_net(struct work_struct *work)
 
 	down_read(&net_sem);
 	mutex_lock(&net_mutex);
+	locked = true;
 
 	/* Don't let anyone else find us. */
 	rtnl_lock();
@@ -500,10 +518,18 @@ static void cleanup_net(struct work_struct *work)
 	synchronize_rcu();
 
 	/* Run all of the network namespace exit methods */
-	list_for_each_entry_reverse(ops, &pernet_list, list)
+	list_for_each_entry_reverse(ops, &pernet_list, list) {
 		ops_exit_list(ops, &net_exit_list);
 
-	mutex_unlock(&net_mutex);
+		if (&ops->list == first_subsys) {
+			BUG_ON(!locked);
+			mutex_unlock(&net_mutex);
+			locked = false;
+		}
+	}
+
+	if (locked)
+		mutex_unlock(&net_mutex);
 
 	/* Free the net generic variables */
 	list_for_each_entry_reverse(ops, &pernet_list, list)
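
The locking pattern in the two hunks above can be modelled in
userspace. This is only a single-threaded sketch under stated
assumptions: the model_* names are hypothetical, net_mutex is
replaced by a plain flag, and first_subsys is an array index
instead of a list_head pointer. It shows the forward walk taking
the lock on reaching first_subsys, and the reverse walk dropping
it right after the first_subsys entry has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pernet_operations. */
struct model_ops {
	int  (*init)(void);
	void (*exit)(void);
};

/* Single-threaded stand-in for net_mutex: just a flag. */
static bool model_mutex_held;
static int  model_calls_locked;   /* callbacks run with the flag set */
static int  model_calls_unlocked; /* callbacks run with the flag clear */

static void model_lock(void)
{
	assert(!model_mutex_held);
	model_mutex_held = true;
}

static void model_unlock(void)
{
	assert(model_mutex_held);
	model_mutex_held = false;
}

static void model_count(void)
{
	if (model_mutex_held)
		model_calls_locked++;
	else
		model_calls_unlocked++;
}

static int  model_init(void) { model_count(); return 0; }
static void model_exit(void) { model_count(); }

/*
 * Forward walk, as in setup_net(): entries [0, first_subsys) are the
 * sys part and run unlocked; the lock is taken on reaching
 * first_subsys. first_subsys == n models a list that has no
 * subsys/device entries yet (early boot).
 */
static int model_setup(const struct model_ops *ops, size_t n,
		       size_t first_subsys)
{
	bool locked = false;
	int error = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i == first_subsys) {
			model_lock();
			locked = true;
		}
		error = ops[i].init();
		if (error < 0)
			break;
	}
	if (!error && !locked) {
		model_lock();
		locked = true;
	}
	/* The kernel code links the net into net_namespace_list here. */
	if (locked)
		model_unlock();
	return error;
}

/*
 * Reverse walk, as in cleanup_net(): starts locked and drops the lock
 * right after the exit method of the first_subsys entry has run.
 */
static void model_cleanup(const struct model_ops *ops, size_t n,
			  size_t first_subsys)
{
	bool locked = true;
	size_t i;

	model_lock();
	for (i = n; i-- > 0; ) {
		ops[i].exit();
		if (i == first_subsys) {
			model_unlock();
			locked = false;
		}
	}
	if (locked)
		model_unlock();
}
```

With three entries and first_subsys == 1, model_setup() runs one
init unlocked and two locked; model_cleanup() mirrors that in
reverse order.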

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 08/25] net: Move proc_net_ns_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (6 preceding siblings ...)
  2017-11-17 18:28 ` [PATCH RFC 07/25] net: Make sys sublist pernet_operations executed out of net_mutex Kirill Tkhai
@ 2017-11-17 18:28 ` Kirill Tkhai
  2017-11-17 18:28 ` [PATCH RFC 09/25] net: Move net_ns_ops " Kirill Tkhai
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:28 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys operations
that are registered before any initcall runs.

proc_net_ns_ops is a pernet_subsys registered via:

start_kernel()->proc_root_init()->proc_net_init(),

and no pernet_subsys is registered earlier, so we start
with it.

proc_net_ns_ops::proc_net_ns_init()/proc_net_ns_exit()
create and remove the per-net net->proc_net and
->proc_net_stat, and the constructors and destructors of
other pernet_operations are not interested in a foreign
net's proc_net and proc_net_stat. Proc filesystem
primitives are synchronized on proc_subdir_lock.

So, it's safe to move proc_net_ns_ops to the pernet_sys list
and execute its methods in parallel with other pernet
operations.
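
To make "moving to the pernet_sys list" concrete, here is a rough
userspace model of one list split into sys, subsys and device
parts. It is a sketch, not the kernel implementation: the real
register_pernet_sys() is not shown in this message, so the array
layout, the model_* names and the index bookkeeping are all
assumptions. The only property it is meant to show is that a sys
entry always lands ahead of first_subsys, whatever the
registration order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_OPS 16

/* Hypothetical kinds, mirroring the sys/subsys/device sublists. */
enum model_kind { MODEL_SYS, MODEL_SUBSYS, MODEL_DEVICE };

struct model_list {
	const char *names[MODEL_MAX_OPS];
	size_t first_subsys; /* index of the first subsys entry */
	size_t first_device; /* index of the first device entry */
	size_t len;
};

/* Append the entry to the tail of its own sublist. */
static int model_register(struct model_list *l, enum model_kind kind,
			  const char *name)
{
	size_t pos, i;

	if (l->len == MODEL_MAX_OPS)
		return -1;

	switch (kind) {
	case MODEL_SYS:
		pos = l->first_subsys;
		break;
	case MODEL_SUBSYS:
		pos = l->first_device;
		break;
	default:
		pos = l->len;
		break;
	}

	/* Shift later entries up by one slot to make room at pos. */
	for (i = l->len; i > pos; i--)
		l->names[i] = l->names[i - 1];
	l->names[pos] = name;
	l->len++;

	/* Sublist boundaries behind the insertion point move as well. */
	if (kind == MODEL_SYS) {
		l->first_subsys++;
		l->first_device++;
	} else if (kind == MODEL_SUBSYS) {
		l->first_device++;
	}
	return 0;
}

/* Position of a named entry in the combined list, or -1. */
static int model_index_of(const struct model_list *l, const char *name)
{
	size_t i;

	for (i = 0; i < l->len; i++)
		if (strcmp(l->names[i], name) == 0)
			return (int)i;
	return -1;
}
```

Entries at an index below first_subsys are the ones that a walk
over the list would run before taking net_mutex.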

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/proc/proc_net.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index a2bf369c923d..5eb52765eeab 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -243,5 +243,5 @@ int __init proc_net_init(void)
 {
 	proc_symlink("net", NULL, "self/net");
 
-	return register_pernet_subsys(&proc_net_ns_ops);
+	return register_pernet_sys(&proc_net_ns_ops);
 }

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 09/25] net: Move net_ns_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (7 preceding siblings ...)
  2017-11-17 18:28 ` [PATCH RFC 08/25] net: Move proc_net_ns_ops to pernet_sys list Kirill Tkhai
@ 2017-11-17 18:28 ` Kirill Tkhai
  2017-11-17 18:28 ` [PATCH RFC 10/25] net: Move sysctl_pernet_ops " Kirill Tkhai
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:28 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys operations
registered from pure initcalls.

Since net_ns_init() is the only pure initcall in the net subsystem,
and there are no early initcalls, the pernet subsys it registers
is the first in the pernet_operations list. So, we start with it.

The net_ns_ops methods, net_ns_net_init() and net_ns_net_exit(),
use only ida_simple_* functions, which need no additional
synchronization.

So it's safe to execute them in parallel with any other
pernet_operations, and thus we convert net_ns_ops to pernet_sys type.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 7aec8c1afe50..2e8295aa7003 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -899,7 +899,7 @@ static int __init net_ns_init(void)
 	init_net_initialized = true;
 	up_write(&net_sem);
 
-	register_pernet_subsys(&net_ns_ops);
+	register_pernet_sys(&net_ns_ops);
 
 	rtnl_register(PF_UNSPEC, RTM_NEWNSID, rtnl_net_newid, NULL,
 		      RTNL_FLAG_DOIT_UNLOCKED);

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 10/25] net: Move sysctl_pernet_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (8 preceding siblings ...)
  2017-11-17 18:28 ` [PATCH RFC 09/25] net: Move net_ns_ops " Kirill Tkhai
@ 2017-11-17 18:28 ` Kirill Tkhai
  2017-11-17 18:29 ` [PATCH RFC 11/25] net: Move netfilter_net_ops " Kirill Tkhai
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:28 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys operations
registered from core initcalls.

Since net/socket.o is the first linked file in net/Makefile,
its core initcalls execute first. sysctl_pernet_ops is
the first pernet_subsys registered from sock_init(), so
it goes ahead of the others registered via core_initcall().
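
The ordering argument used here and in the following patches
(initcall level first, then link order within a level) can be
sketched as a sort. This is an illustration only: the model_*
names are hypothetical, the link_pos values are made up for the
example, and the kernel does not sort anything at runtime; the
order falls out of the section layout of the linked image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Initcall levels named in this series, in execution order. */
enum model_level { MODEL_PURE, MODEL_CORE, MODEL_POSTCORE, MODEL_SUBSYS };

struct model_initcall {
	enum model_level level;
	int link_pos;         /* hypothetical position in link order */
	const char *ops_name; /* pernet_operations the initcall registers */
};

/* Order by level; within one level, by link position. */
static int model_cmp(const void *a, const void *b)
{
	const struct model_initcall *x = a, *y = b;

	if (x->level != y->level)
		return x->level < y->level ? -1 : 1;
	return (x->link_pos > y->link_pos) - (x->link_pos < y->link_pos);
}

/* Rearrange calls[] into the order in which they would execute. */
static void model_boot_order(struct model_initcall *calls, size_t n)
{
	qsort(calls, n, sizeof(*calls), model_cmp);
}

/* Position of the initcall registering ops_name, or -1. */
static int model_position(const struct model_initcall *calls, size_t n,
			  const char *ops_name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(calls[i].ops_name, ops_name) == 0)
			return (int)i;
	return -1;
}
```

The resulting order is the order in which the pernet_operations
end up in the list, which is why the series converts them in
that sequence.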

Methods sysctl_net_init() and sysctl_net_exit() initialize
net::sysctls of a namespace.

pernet_operations::init()/exit() methods from the rest
of the list do not touch net::sysctls of foreign namespaces,
so it's safe to execute sysctl_pernet_ops's methods
in parallel with any other pernet_operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/sysctl_net.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index 9aed6fe1bf1a..1b91db88e54a 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -103,7 +103,7 @@ __init int net_sysctl_init(void)
 	net_header = register_sysctl("net", empty);
 	if (!net_header)
 		goto out;
-	ret = register_pernet_subsys(&sysctl_pernet_ops);
+	ret = register_pernet_sys(&sysctl_pernet_ops);
 	if (ret)
 		goto out1;
 out:

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 11/25] net: Move netfilter_net_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (9 preceding siblings ...)
  2017-11-17 18:28 ` [PATCH RFC 10/25] net: Move sysctl_pernet_ops " Kirill Tkhai
@ 2017-11-17 18:29 ` Kirill Tkhai
  2017-11-17 18:29 ` [PATCH RFC 12/25] net: Move nf_log_net_ops " Kirill Tkhai
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:29 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Since net/socket.o is the first linked file in net/Makefile,
its core initcalls execute first. netfilter_net_ops
is executed right after sysctl_pernet_ops.

Methods netfilter_net_init() and netfilter_net_exit()
initialize net::nf::hooks and change the net-related proc
directory of the net. Other pernet_operations are not
interested in a foreign net::nf::hooks or proc entries,
so it's safe to move netfilter_net_ops to the pernet_sys list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/netfilter/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 52cd2901a097..2bed28281b67 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -606,7 +606,7 @@ int __init netfilter_init(void)
 {
 	int ret;
 
-	ret = register_pernet_subsys(&netfilter_net_ops);
+	ret = register_pernet_sys(&netfilter_net_ops);
 	if (ret < 0)
 		goto err;
 

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 12/25] net: Move nf_log_net_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (10 preceding siblings ...)
  2017-11-17 18:29 ` [PATCH RFC 11/25] net: Move netfilter_net_ops " Kirill Tkhai
@ 2017-11-17 18:29 ` Kirill Tkhai
  2017-11-17 18:29 ` [PATCH RFC 13/25] net: Move net_inuse_ops " Kirill Tkhai
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:29 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

nf_log_net_ops are registered from the same initcall
as netfilter_net_ops, so they have to be moved right
after netfilter_net_ops.

The ops would have had a problem with parallel execution
alongside others if it were possible for init_net to be
released. But it is not, and the rest is safe. There is
a memory allocation, which nobody else is interested in,
and sysctl registration. So, we move it to the pernet_sys
list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/netfilter/nf_log.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index 8bb152a7cca4..08868afad813 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -582,5 +582,5 @@ static struct pernet_operations nf_log_net_ops = {
 
 int __init netfilter_log_init(void)
 {
-	return register_pernet_subsys(&nf_log_net_ops);
+	return register_pernet_sys(&nf_log_net_ops);
 }

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 13/25] net: Move net_inuse_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (11 preceding siblings ...)
  2017-11-17 18:29 ` [PATCH RFC 12/25] net: Move nf_log_net_ops " Kirill Tkhai
@ 2017-11-17 18:29 ` Kirill Tkhai
  2017-11-17 18:29 ` [PATCH RFC 14/25] net: Move net_defaults_ops " Kirill Tkhai
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:29 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

net/core/sock.o is the first linked file in net/core/Makefile,
so its core initcall executes first in the directory.

net_inuse_ops methods expose statistics in /proc.
None of the remaining pernet_subsys or pernet_device
operations touch net::core::inuse. So, it's safe to move
net_inuse_ops to the pernet_sys list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/sock.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 13719af7b4e3..be050b044699 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3081,7 +3081,7 @@ static struct pernet_operations net_inuse_ops = {
 
 static __init int net_inuse_init(void)
 {
-	if (register_pernet_subsys(&net_inuse_ops))
+	if (register_pernet_sys(&net_inuse_ops))
 		panic("Cannot initialize net inuse counters");
 
 	return 0;

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 14/25] net: Move net_defaults_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (12 preceding siblings ...)
  2017-11-17 18:29 ` [PATCH RFC 13/25] net: Move net_inuse_ops " Kirill Tkhai
@ 2017-11-17 18:29 ` Kirill Tkhai
  2017-11-17 18:29 ` [PATCH RFC 15/25] net: Move netlink_net_ops " Kirill Tkhai
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:29 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

According to net/core/Makefile, net/core/net_namespace.o
core initcalls execute right after net/core/sock.o.

net_defaults_ops provides only the net_defaults_init_net method,
which acts on net::core::sysctl_somaxconn, and the rest of
the pernet_subsys and pernet_device lists are not interested
in it. So, move it to pernet_sys.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 2e8295aa7003..7fc9d44c1817 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -371,7 +371,7 @@ static struct pernet_operations net_defaults_ops = {
 
 static __init int net_defaults_init(void)
 {
-	if (register_pernet_subsys(&net_defaults_ops))
+	if (register_pernet_sys(&net_defaults_ops))
 		panic("Cannot initialize net default settings");
 
 	return 0;

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 15/25] net: Move netlink_net_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (13 preceding siblings ...)
  2017-11-17 18:29 ` [PATCH RFC 14/25] net: Move net_defaults_ops " Kirill Tkhai
@ 2017-11-17 18:29 ` Kirill Tkhai
  2017-11-17 18:29 ` [PATCH RFC 16/25] net: Move rtnetlink_net_ops " Kirill Tkhai
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:29 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

According to net/Makefile and net/core/Makefile, the core
initcalls of net/netlink/af_netlink.o execute right after
those of net/core/net_namespace.o.

The methods of netlink_net_ops create and destroy the "netlink"
file, which is of no interest to foreign pernet_operations.
So, netlink_net_ops may safely be moved to the pernet_sys list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/netlink/af_netlink.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index b9e0ee4e22f5..a4f1f5222b79 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2735,7 +2735,7 @@ static int __init netlink_proto_init(void)
 	netlink_add_usersock_entry();
 
 	sock_register(&netlink_family_ops);
-	register_pernet_subsys(&netlink_net_ops);
+	register_pernet_sys(&netlink_net_ops);
 	/* The netlink device handler may be needed early. */
 	rtnetlink_init();
 out:

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 16/25] net: Move rtnetlink_net_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (14 preceding siblings ...)
  2017-11-17 18:29 ` [PATCH RFC 15/25] net: Move netlink_net_ops " Kirill Tkhai
@ 2017-11-17 18:29 ` Kirill Tkhai
  2017-11-17 18:29 ` [PATCH RFC 17/25] net: Move audit_net_ops " Kirill Tkhai
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:29 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

rtnetlink_net_ops are registered from the same core initcall
as netlink_net_ops, so they have to be added right
after netlink_net_ops.

rtnetlink_net_init() and rtnetlink_net_exit()
create and destroy a netlink socket. It looks like
other pernet_operations are not interested in
a foreign net::rtnl, so rtnetlink_net_ops may be
safely moved to the pernet_sys list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/rtnetlink.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index cb06d43c4230..d9cf13554e4d 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4503,7 +4503,7 @@ void __init rtnetlink_init(void)
 	for (i = 0; i < ARRAY_SIZE(rtnl_msg_handlers_ref); i++)
 		refcount_set(&rtnl_msg_handlers_ref[i], 1);
 
-	if (register_pernet_subsys(&rtnetlink_net_ops))
+	if (register_pernet_sys(&rtnetlink_net_ops))
 		panic("rtnetlink_init: cannot initialize rtnetlink\n");
 
 	register_netdevice_notifier(&rtnetlink_dev_notifier);

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 17/25] net: Move audit_net_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (15 preceding siblings ...)
  2017-11-17 18:29 ` [PATCH RFC 16/25] net: Move rtnetlink_net_ops " Kirill Tkhai
@ 2017-11-17 18:29 ` Kirill Tkhai
  2017-11-17 18:30 ` [PATCH RFC 18/25] net: Move uevent_net_ops " Kirill Tkhai
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:29 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys operations
registered from postcore initcalls.

These pernet_operations are in the ./kernel directory, and
there is only one more postcore initcall, in ./lib. So,
audit_net_ops has to go first.

audit_net_init() creates a netlink socket, while audit_net_exit()
destroys it. The rest of the pernet_list is not interested
in the socket, so we move audit_net_ops to the pernet_sys list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 kernel/audit.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 227db99b0f19..bb4626d7e712 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1549,7 +1549,7 @@ static int __init audit_init(void)
 
 	pr_info("initializing netlink subsys (%s)\n",
 		audit_default ? "enabled" : "disabled");
-	register_pernet_subsys(&audit_net_ops);
+	register_pernet_sys(&audit_net_ops);
 
 	audit_initialized = AUDIT_INITIALIZED;
 

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 18/25] net: Move uevent_net_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (16 preceding siblings ...)
  2017-11-17 18:29 ` [PATCH RFC 17/25] net: Move audit_net_ops " Kirill Tkhai
@ 2017-11-17 18:30 ` Kirill Tkhai
  2017-11-17 18:30 ` [PATCH RFC 19/25] net: Move proto_net_ops " Kirill Tkhai
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:30 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations, registered via postcore_initcall()
from the ./lib directory, have to go right after
audit_net_ops.

uevent_net_init() and uevent_net_exit() create and
destroy a netlink socket, and these actions are serialized
in the netlink code.

Parallel execution with other pernet_operations
makes the socket disappear earlier from uevent_sock_list
on ->exit. As userspace can't be interested in broadcast
messages of a dying net and, as far as I can see, no one in
the kernel listens to them, we may safely move uevent_net_ops
to the pernet_sys list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 lib/kobject_uevent.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index c3e84edc47c9..84c9d85477cc 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -647,7 +647,7 @@ static struct pernet_operations uevent_net_ops = {
 
 static int __init kobject_uevent_init(void)
 {
-	return register_pernet_subsys(&uevent_net_ops);
+	return register_pernet_sys(&uevent_net_ops);
 }
 
 

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 19/25] net: Move proto_net_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (17 preceding siblings ...)
  2017-11-17 18:30 ` [PATCH RFC 18/25] net: Move uevent_net_ops " Kirill Tkhai
@ 2017-11-17 18:30 ` Kirill Tkhai
  2017-11-17 18:30 ` [PATCH RFC 20/25] net: Move pernet_subsys, registered via net_dev_init(), " Kirill Tkhai
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:30 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys operations
registered from subsys initcalls.

According to net/Makefile and net/core/Makefile, this
is the first executed subsys_initcall() that registers
a pernet_subsys.

It seems safe to execute it in parallel with others,
as it only creates/destroys a proc entry, which
nobody else is interested in.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/sock.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index be050b044699..ed12e115458b 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3349,7 +3349,7 @@ static __net_initdata struct pernet_operations proto_net_ops = {
 
 static int __init proto_init(void)
 {
-	return register_pernet_subsys(&proto_net_ops);
+	return register_pernet_sys(&proto_net_ops);
 }
 
 subsys_initcall(proto_init);

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 20/25] net: Move pernet_subsys, registered via net_dev_init(), to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (18 preceding siblings ...)
  2017-11-17 18:30 ` [PATCH RFC 19/25] net: Move proto_net_ops " Kirill Tkhai
@ 2017-11-17 18:30 ` Kirill Tkhai
  2017-11-17 18:30 ` [PATCH RFC 21/25] net: Move fib_* pernet_operations, registered via subsys_initcall(), " Kirill Tkhai
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:30 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

net/core/dev.o is linked after net/core/sock.o.

There are:
1)dev_proc_ops and dev_mc_net_ops, which create and destroy
pernet proc files and are of no interest to other net namespaces;
2)netdev_net_ops, which creates a pernet hash, which is not
touched by other pernet_operations.

So, move them to the pernet_sys list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/dev.c        |    2 +-
 net/core/net-procfs.c |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8ee29f4f5fa9..b90a503a9e1a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8787,7 +8787,7 @@ static int __init net_dev_init(void)
 
 	INIT_LIST_HEAD(&offload_base);
 
-	if (register_pernet_subsys(&netdev_net_ops))
+	if (register_pernet_sys(&netdev_net_ops))
 		goto out;
 
 	/*
diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
index 615ccab55f38..46096219d574 100644
--- a/net/core/net-procfs.c
+++ b/net/core/net-procfs.c
@@ -413,8 +413,8 @@ static struct pernet_operations __net_initdata dev_mc_net_ops = {
 
 int __init dev_proc_init(void)
 {
-	int ret = register_pernet_subsys(&dev_proc_ops);
+	int ret = register_pernet_sys(&dev_proc_ops);
 	if (!ret)
-		return register_pernet_subsys(&dev_mc_net_ops);
+		return register_pernet_sys(&dev_mc_net_ops);
 	return ret;
 }

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 21/25] net: Move fib_* pernet_operations, registered via subsys_initcall(), to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (19 preceding siblings ...)
  2017-11-17 18:30 ` [PATCH RFC 20/25] net: Move pernet_subsys, registered via net_dev_init(), " Kirill Tkhai
@ 2017-11-17 18:30 ` Kirill Tkhai
  2017-11-17 18:30 ` [PATCH RFC 22/25] net: Move subsys_initcall() registered pernet_operations from net/sched " Kirill Tkhai
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:30 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Both of them create and initialize lists, which are not touched
by foreign pernet_operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/fib_notifier.c |    2 +-
 net/core/fib_rules.c    |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/fib_notifier.c b/net/core/fib_notifier.c
index 0c048bdeb016..782a1475a32e 100644
--- a/net/core/fib_notifier.c
+++ b/net/core/fib_notifier.c
@@ -175,7 +175,7 @@ static struct pernet_operations fib_notifier_net_ops = {
 
 static int __init fib_notifier_init(void)
 {
-	return register_pernet_subsys(&fib_notifier_net_ops);
+	return register_pernet_sys(&fib_notifier_net_ops);
 }
 
 subsys_initcall(fib_notifier_init);
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 98e1066c3d55..b2706c18f0f3 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -1039,7 +1039,7 @@ static int __init fib_rules_init(void)
 	rtnl_register(PF_UNSPEC, RTM_DELRULE, fib_nl_delrule, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_GETRULE, NULL, fib_nl_dumprule, 0);
 
-	err = register_pernet_subsys(&fib_rules_net_ops);
+	err = register_pernet_sys(&fib_rules_net_ops);
 	if (err < 0)
 		goto fail;
 

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 22/25] net: Move subsys_initcall() registered pernet_operations from net/sched to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (20 preceding siblings ...)
  2017-11-17 18:30 ` [PATCH RFC 21/25] net: Move fib_* pernet_operations, registered via subsys_initcall(), " Kirill Tkhai
@ 2017-11-17 18:30 ` Kirill Tkhai
  2017-11-17 18:30 ` [PATCH RFC 23/25] net: Move genl_pernet_ops " Kirill Tkhai
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:30 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

psched_net_ops only creates and destroys a /proc entry,
and is safe to execute in parallel with any foreign
pernet_operations.

tcf_action_net_ops initializes and destructs tcf_action_net::egdev_ht,
which is not touched by foreign pernet_operations.

So, move them to the pernet_sys list.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/sched/act_api.c |    2 +-
 net/sched/sch_api.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 4d33a50a8a6d..f1de2146e6e0 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -1470,7 +1470,7 @@ static int __init tc_action_init(void)
 {
 	int err;
 
-	err = register_pernet_subsys(&tcf_action_net_ops);
+	err = register_pernet_sys(&tcf_action_net_ops);
 	if (err)
 		return err;
 
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index b6c4f536876b..68938ca4bbe1 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -2008,7 +2008,7 @@ static int __init pktsched_init(void)
 {
 	int err;
 
-	err = register_pernet_subsys(&psched_net_ops);
+	err = register_pernet_sys(&psched_net_ops);
 	if (err) {
 		pr_err("pktsched_init: "
 		       "cannot initialize per netns operations\n");

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC 23/25] net: Move genl_pernet_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (21 preceding siblings ...)
  2017-11-17 18:30 ` [PATCH RFC 22/25] net: Move subsys_initcall() registered pernet_operations from net/sched " Kirill Tkhai
@ 2017-11-17 18:30 ` Kirill Tkhai
  2017-11-17 18:31 ` [PATCH RFC 24/25] net: Move wext_pernet_ops " Kirill Tkhai
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:30 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations create and destroy net::genl_sock.
Foreign pernet_operations don't touch it.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/netlink/genetlink.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index d444daf1ac04..da7ab3dd5609 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -1045,7 +1045,7 @@ static int __init genl_init(void)
 	if (err < 0)
 		goto problem;
 
-	err = register_pernet_subsys(&genl_pernet_ops);
+	err = register_pernet_sys(&genl_pernet_ops);
 	if (err)
 		goto problem;
 


* [PATCH RFC 24/25] net: Move wext_pernet_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (22 preceding siblings ...)
  2017-11-17 18:30 ` [PATCH RFC 23/25] net: Move genl_pernet_ops " Kirill Tkhai
@ 2017-11-17 18:31 ` Kirill Tkhai
  2017-11-17 18:31 ` [PATCH RFC 25/25] net: Move sysctl_core_ops " Kirill Tkhai
  2017-11-19  1:52 ` [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Eric W. Biederman
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:31 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations initialize and purge the net::wext_nlevents
queue, which is not touched by foreign pernet_operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/wireless/wext-core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/wireless/wext-core.c b/net/wireless/wext-core.c
index 6cdb054484d6..2103c2a003ed 100644
--- a/net/wireless/wext-core.c
+++ b/net/wireless/wext-core.c
@@ -394,7 +394,7 @@ static struct pernet_operations wext_pernet_ops = {
 
 static int __init wireless_nlevent_init(void)
 {
-	int err = register_pernet_subsys(&wext_pernet_ops);
+	int err = register_pernet_sys(&wext_pernet_ops);
 
 	if (err)
 		return err;


* [PATCH RFC 25/25] net: Move sysctl_core_ops to pernet_sys list
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (23 preceding siblings ...)
  2017-11-17 18:31 ` [PATCH RFC 24/25] net: Move wext_pernet_ops " Kirill Tkhai
@ 2017-11-17 18:31 ` Kirill Tkhai
  2017-11-19  1:52 ` [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Eric W. Biederman
  25 siblings, 0 replies; 27+ messages in thread
From: Kirill Tkhai @ 2017-11-17 18:31 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations register and destroy the sysctl
directory, which is of no interest to foreign
pernet_operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/sysctl_net_core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index cbc3dde4cfcc..0dab679b33fa 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -525,7 +525,7 @@ static __net_initdata struct pernet_operations sysctl_core_ops = {
 static __init int sysctl_core_init(void)
 {
 	register_net_sysctl(&init_net, "net/core", net_core_table);
-	return register_pernet_subsys(&sysctl_core_ops);
+	return register_pernet_sys(&sysctl_core_ops);
 }
 
 fs_initcall(sysctl_core_init);


* Re: [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore
  2017-11-17 18:27 [PATCH RFC 00/25] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (24 preceding siblings ...)
  2017-11-17 18:31 ` [PATCH RFC 25/25] net: Move sysctl_core_ops " Kirill Tkhai
@ 2017-11-19  1:52 ` Eric W. Biederman
  25 siblings, 0 replies; 27+ messages in thread
From: Eric W. Biederman @ 2017-11-19  1:52 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, avagin, gorcunov, eric.dumazet,
	stephen

Kirill Tkhai <ktkhai@virtuozzo.com> writes:

> Hi,
>
> this is continuation of discussion from here:
>
> https://lkml.org/lkml/2017/11/14/298
>
> The plan has changed a little bit, so I'd be happy to hear
> people's comments, before I dived into all 400+ pernet subsys
> and devices.
>
> The patch set adds pernet sys list ahead of subsys and device,
> and it's used for pernet_operations, which may be executed
> in parallel with any other pernet_operations methods. Also,
> some high-priority ops converted (up to registered using
> postcore_initcall(), and some subsys_initcall()) in order
> of appearance. The sequence in setup_net() is following:
>
> 1)execute all the callbacks from pernet_sys list
> 2)lock net_mutex
> 3)execute all the callbacks from pernet_subsys list
> 4)execute all the callbacks from pernet_device list
> 5)unlock net_mutex
>
> There was not pernet_operations, requiring additional
> synchronization, yet, but I've bumped in another problem.
> The problem is that some drivers may be compiled as modules
> and as kernel-image part. They register pernet_operations
> from device_initcall() for example. This initcall executes
> in different time comparing to in-kernel built-in only
> drivers.
>
> Imagine, we have three state driverA, and boolean driverB.
> driverA registers pernet_subsys from subsys_initcall().
> driverB registers pernet_subsys from fs_initcall().
> So, here we have two cases:
>
> driverA is module              driverA is built-in
> --------------------           -------------------
> register driverB ops           register driverA ops
> register driverA ops           register driverB ops
>
> So, the order is different. When converting driver one-by-one,
> it's impossible to make the order true for all .config
> states, because of the above. So, the bisect won't work.
>
> And it seems, it's just the same as to convert pernet_operations
> from all the files in file alphabetical order. What do you
> think about this? (Note, the patches has no such a problem
> at the moment, as there are all in-kernel early core drivers).
>
> Maybe there are another comments on the code.

I think there is a solution in the middle.  Just keep a count
(protected by down_write of net_sem) of the number of modules that still
need net_mutex.

If the count is non-zero we take net_mutex in setup_net and cleanup_net.

That way limited network stacks can see the benefit and actively test
the parallelism, while other configurations stay safe by keeping
the same behavior.

Eric


> ---
>
> Kirill Tkhai (25):
>       net: Assign net to net_namespace_list in setup_net()
>       net: Cleanup copy_net_ns()
>       net: Introduce net_sem for protection of pernet_list
>       net: Move mutex_unlock() in cleanup_net() up
>       net: Add primitives to update heads of pernet_list sublists
>       net: Add pernet sys and registration functions
>       net: Make sys sublist pernet_operations executed out of net_mutex
>       net: Move proc_net_ns_ops to pernet_sys list
>       net: Move net_ns_ops to pernet_sys list
>       net: Move sysctl_pernet_ops to pernet_sys list
>       net: Move netfilter_net_ops to pernet_sys list
>       net: Move nf_log_net_ops to pernet_sys list
>       net: Move net_inuse_ops to pernet_sys list
>       net: Move net_defaults_ops to pernet_sys list
>       net: Move netlink_net_ops to pernet_sys list
>       net: Move rtnetlink_net_ops to pernet_sys list
>       net: Move audit_net_ops to pernet_sys list
>       net: Move uevent_net_ops to pernet_sys list
>       net: Move proto_net_ops to pernet_sys list
>       net: Move pernet_subsys, registered via net_dev_init(), to pernet_sys list
>       net: Move fib_* pernet_operations, registered via subsys_initcall(), to pernet_sys list
>       net: Move subsys_initcall() registered pernet_operations from net/sched to pernet_sys list
>       net: Move genl_pernet_ops to pernet_sys list
>       net: Move wext_pernet_ops to pernet_sys list
>       net: Move sysctl_core_ops to pernet_sys list
>
>
>  fs/proc/proc_net.c          |    2 
>  include/linux/rtnetlink.h   |    1 
>  include/net/net_namespace.h |    2 
>  kernel/audit.c              |    2 
>  lib/kobject_uevent.c        |    2 
>  net/core/dev.c              |    2 
>  net/core/fib_notifier.c     |    2 
>  net/core/fib_rules.c        |    2 
>  net/core/net-procfs.c       |    4 -
>  net/core/net_namespace.c    |  203 +++++++++++++++++++++++++++++++++----------
>  net/core/rtnetlink.c        |    6 +
>  net/core/sock.c             |    4 -
>  net/core/sysctl_net_core.c  |    2 
>  net/netfilter/core.c        |    2 
>  net/netfilter/nf_log.c      |    2 
>  net/netlink/af_netlink.c    |    2 
>  net/netlink/genetlink.c     |    2 
>  net/sched/act_api.c         |    2 
>  net/sched/sch_api.c         |    2 
>  net/sysctl_net.c            |    2 
>  net/wireless/wext-core.c    |    2 
>  21 files changed, 183 insertions(+), 67 deletions(-)
>
> --
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

