* [PATCH v2 00/31] Replacing net_mutex with rw_semaphore
@ 2017-11-20 18:32 Kirill Tkhai
  2017-11-20 18:32 ` [PATCH v2 01/31] net: Assign net to net_namespace_list in setup_net() Kirill Tkhai
                   ` (32 more replies)
  0 siblings, 33 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:32 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Hi,

this is the second version of the patchset introducing net_sem
in place of net_mutex. The patchset adds net_sem alongside
net_mutex and allows pernet_operations to be marked async. The
flag means that the pernet_operations methods are safe to
execute in parallel with any other pernet_operations that are
(un)initializing another net.

If there are only async pernet_operations in the system,
net_mutex is not used either for setup_net() or for cleanup_net().

The flag is a little simpler than (un)register_pernet_sys(),
as each conversion changes one line only. It also requires fewer
code changes. In the future, when all pernet_operations are async,
we'll just remove this struct field.

The pernet_operations converted in this patchset are enough
to build a minimal .config with working networking, and
the changes improve performance, as you can see
below:

    %for i in {1..10000}; do unshare -n bash -c exit; done
    
    *before*
    real 1m40,377s
    user 0m9,672s
    sys 0m19,928s
    
    *after*
    real 0m17,007s
    user 0m5,311s
    sys 0m11,779s
    
    (5.8 times faster)
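
As a quick sanity check of the figure above, the ratio of the two
"real" times can be recomputed. This is only a sketch; the helper
names to_seconds() and real_time_speedup() are made up for
illustration. The ratio works out to roughly 5.9, in the same
ballpark as the number quoted above.

```c
#include <assert.h>

/* Convert a time(1)-style "XmY,ZZZs" reading to seconds. */
static double to_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Ratio of the two wall-clock ("real") times from the table above. */
static double real_time_speedup(void)
{
	double before = to_seconds(1, 40.377);	/* real 1m40,377s */
	double after  = to_seconds(0, 17.007);	/* real 0m17,007s */

	assert(after > 0.0);
	return before / after;
}
```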
---

Kirill Tkhai (31):
      net: Assign net to net_namespace_list in setup_net()
      net: Cleanup copy_net_ns()
      net: Introduce net_sem for protection of pernet_list
      net: Move mutex_unlock() in cleanup_net() up
      net: Allow pernet_operations to be executed in parallel
      net: Convert proc_net_ns_ops
      net: Convert net_ns_ops methods
      net: Convert sysctl_pernet_ops
      net: Convert netfilter_net_ops
      net: Convert nf_log_net_ops
      net: Convert net_inuse_ops
      net: Convert net_defaults_ops
      net: Convert netlink_net_ops
      net: Convert rtnetlink_net_ops
      net: Convert audit_net_ops
      net: Convert uevent_net_ops
      net: Convert proto_net_ops
      net: Convert pernet_subsys ops, registered via net_dev_init()
      net: Convert fib_* pernet_operations, registered via subsys_initcall
      net: Convert subsys_initcall() registered pernet_operations from net/sched
      net: Convert genl_pernet_ops
      net: Convert wext_pernet_ops
      net: Convert sysctl_core_ops
      net: Convert pernet_subsys, registered from inet_init()
      net: Convert unix_net_ops
      net: Convert packet_net_ops
      net: Convert ipv4_sysctl_ops
      net: Convert addrconf_ops
      net: Convert loopback_net_ops
      net: Convert default_device_ops
      net: Convert diag_net_ops


 drivers/net/loopback.c      |    1 
 fs/proc/proc_net.c          |    1 
 include/linux/rtnetlink.h   |    1 
 include/net/net_namespace.h |    6 +++
 kernel/audit.c              |    1 
 lib/kobject_uevent.c        |    1 
 net/core/dev.c              |    2 +
 net/core/fib_notifier.c     |    1 
 net/core/fib_rules.c        |    1 
 net/core/net-procfs.c       |    2 +
 net/core/net_namespace.c    |   94 +++++++++++++++++++++++++------------------
 net/core/rtnetlink.c        |    5 +-
 net/core/sock.c             |    2 +
 net/core/sock_diag.c        |    1 
 net/core/sysctl_net_core.c  |    1 
 net/ipv4/af_inet.c          |    2 +
 net/ipv4/arp.c              |    1 
 net/ipv4/devinet.c          |    1 
 net/ipv4/fib_frontend.c     |    1 
 net/ipv4/icmp.c             |    1 
 net/ipv4/igmp.c             |    1 
 net/ipv4/ip_fragment.c      |    1 
 net/ipv4/ipmr.c             |    1 
 net/ipv4/ping.c             |    1 
 net/ipv4/proc.c             |    1 
 net/ipv4/raw.c              |    1 
 net/ipv4/route.c            |    4 ++
 net/ipv4/sysctl_net_ipv4.c  |    1 
 net/ipv4/tcp_ipv4.c         |    2 +
 net/ipv4/tcp_metrics.c      |    1 
 net/ipv4/udp.c              |    1 
 net/ipv4/udplite.c          |    1 
 net/ipv4/xfrm4_policy.c     |    1 
 net/ipv6/addrconf.c         |    1 
 net/netfilter/core.c        |    1 
 net/netfilter/nf_log.c      |    1 
 net/netlink/af_netlink.c    |    1 
 net/netlink/genetlink.c     |    1 
 net/packet/af_packet.c      |    1 
 net/sched/act_api.c         |    1 
 net/sched/sch_api.c         |    1 
 net/sysctl_net.c            |    1 
 net/unix/af_unix.c          |    1 
 net/wireless/wext-core.c    |    1 
 net/xfrm/xfrm_policy.c      |    1 
 45 files changed, 114 insertions(+), 41 deletions(-)

--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>


* [PATCH v2 01/31] net: Assign net to net_namespace_list in setup_net()
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
@ 2017-11-20 18:32 ` Kirill Tkhai
  2017-11-20 18:32 ` [PATCH v2 02/31] net: Cleanup copy_net_ns() Kirill Tkhai
                   ` (31 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:32 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch merges two duplicated pieces of code into one,
which now lives in setup_net().

It is a pure cleanup, even though the init_net_initialized
assignment is now reordered with respect to the linking of net.
This variable is needed by proc_net_init(), called from:

start_kernel()->proc_root_init()->proc_net_init(),

which can't race with net_ns_init(), called from an
initcall.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |   13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index b797832565d3..7ecf71050ffa 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -296,6 +296,9 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 		if (error < 0)
 			goto out_undo;
 	}
+	rtnl_lock();
+	list_add_tail_rcu(&net->list, &net_namespace_list);
+	rtnl_unlock();
 out:
 	return error;
 
@@ -417,11 +420,6 @@ struct net *copy_net_ns(unsigned long flags,
 
 	net->ucounts = ucounts;
 	rv = setup_net(net, user_ns);
-	if (rv == 0) {
-		rtnl_lock();
-		list_add_tail_rcu(&net->list, &net_namespace_list);
-		rtnl_unlock();
-	}
 	mutex_unlock(&net_mutex);
 	if (rv < 0) {
 		dec_net_namespaces(ucounts);
@@ -847,11 +845,6 @@ static int __init net_ns_init(void)
 		panic("Could not setup the initial network namespace");
 
 	init_net_initialized = true;
-
-	rtnl_lock();
-	list_add_tail_rcu(&init_net.list, &net_namespace_list);
-	rtnl_unlock();
-
 	mutex_unlock(&net_mutex);
 
 	register_pernet_subsys(&net_ns_ops);


* [PATCH v2 02/31] net: Cleanup copy_net_ns()
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
  2017-11-20 18:32 ` [PATCH v2 01/31] net: Assign net to net_namespace_list in setup_net() Kirill Tkhai
@ 2017-11-20 18:32 ` Kirill Tkhai
  2017-11-20 18:32 ` [PATCH v2 03/31] net: Introduce net_sem for protection of pernet_list Kirill Tkhai
                   ` (30 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:32 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Line up the destructor actions in the reverse order of the
constructors. The next patches will add more actions, and
this ordering will make them easier to add.
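
The reverse-order unwinding can be sketched in plain userspace C.
This is not the kernel code: toy_create(), record() and the
one-letter trace are hypothetical names used only to show that each
label undoes exactly the steps that completed before the failure,
last one first.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace of the teardown steps, one letter per step. */
static char trace[8];

static void record(char step)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace))
		trace[n] = step;
}

/*
 * Toy constructor with three setup steps (A, B, C). fail_at selects
 * the step that fails (0 = none). The labels are laid out so that a
 * jump falls through every undo action for the completed steps, in
 * reverse order of construction.
 */
static int toy_create(int fail_at)
{
	int rv = -1;

	assert(fail_at >= 0 && fail_at <= 3);
	memset(trace, 0, sizeof(trace));

	if (fail_at == 1)
		goto out;		/* A failed: nothing to undo */
	if (fail_at == 2)
		goto undo_a;		/* B failed: undo A */
	if (fail_at == 3)
		goto undo_b;		/* C failed: undo B, then A */
	return 0;

undo_b:
	record('b');
undo_a:
	record('a');
out:
	return rv;
}
```

Adding a fourth step to such a function means one more label at the
top of the unwind chain, without touching the existing error paths.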

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |   20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 7ecf71050ffa..2e512965bf42 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -404,27 +404,25 @@ struct net *copy_net_ns(unsigned long flags,
 
 	net = net_alloc();
 	if (!net) {
-		dec_net_namespaces(ucounts);
-		return ERR_PTR(-ENOMEM);
+		rv = -ENOMEM;
+		goto dec_ucounts;
 	}
-
+	refcount_set(&net->passive, 1);
+	net->ucounts = ucounts;
 	get_user_ns(user_ns);
 
 	rv = mutex_lock_killable(&net_mutex);
-	if (rv < 0) {
-		net_free(net);
-		dec_net_namespaces(ucounts);
-		put_user_ns(user_ns);
-		return ERR_PTR(rv);
-	}
+	if (rv < 0)
+		goto put_userns;
 
-	net->ucounts = ucounts;
 	rv = setup_net(net, user_ns);
 	mutex_unlock(&net_mutex);
 	if (rv < 0) {
-		dec_net_namespaces(ucounts);
+put_userns:
 		put_user_ns(user_ns);
 		net_drop_ns(net);
+dec_ucounts:
+		dec_net_namespaces(ucounts);
 		return ERR_PTR(rv);
 	}
 	return net;


* [PATCH v2 03/31] net: Introduce net_sem for protection of pernet_list
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
  2017-11-20 18:32 ` [PATCH v2 01/31] net: Assign net to net_namespace_list in setup_net() Kirill Tkhai
  2017-11-20 18:32 ` [PATCH v2 02/31] net: Cleanup copy_net_ns() Kirill Tkhai
@ 2017-11-20 18:32 ` Kirill Tkhai
  2018-01-17 20:04   ` Andrei Vagin
  2017-11-20 18:32 ` [PATCH v2 04/31] net: Move mutex_unlock() in cleanup_net() up Kirill Tkhai
                   ` (29 subsequent siblings)
  32 siblings, 1 reply; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:32 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Currently a mutex is used to protect the pernet operations list. It makes
cleanup_net() execute the ->exit methods of the same set of operations
that was used at the time of ->init, even after the net namespace is
unlinked from net_namespace_list.

But the problem is that synchronize_rcu() is needed after net is removed
from net_namespace_list:

Destroy net_ns:
cleanup_net()
  mutex_lock(&net_mutex)
  list_del_rcu(&net->list)
  synchronize_rcu()                                  <--- Sleep there for ages
  list_for_each_entry_reverse(ops, &pernet_list, list)
    ops_exit_list(ops, &net_exit_list)
  list_for_each_entry_reverse(ops, &pernet_list, list)
    ops_free_list(ops, &net_exit_list)
  mutex_unlock(&net_mutex)

This primitive is not fast, especially on systems with many processors
and/or when preemptible RCU is enabled in the config. So, the whole time
cleanup_net() is waiting for the RCU grace period, creation of new net
namespaces is not possible; the tasks doing it sleep on the same mutex:

Create net_ns:
copy_net_ns()
  mutex_lock_killable(&net_mutex)                    <--- Sleep there for ages

I observed 20-30 second hangs of "unshare -n" on an ordinary 8-CPU laptop
with preemptible RCU enabled.

The solution is to convert net_mutex to an rw_semaphore and add
fine-grained locks to the very small number of pernet_operations that
really need them. Then the pernet_operations::init/::exit methods, which
modify net-related data, will require only down_read() locking, while
down_write() will be used for changing pernet_list.

This gives a significant performance increase once the whole patch set
is applied, as you can see here:

%for i in {1..10000}; do unshare -n bash -c exit; done

*before*
real 1m40,377s
user 0m9,672s
sys 0m19,928s

*after*
real 0m17,007s
user 0m5,311s
sys 0m11,779s

(5.8 times faster)

This patch starts replacing net_mutex with net_sem. It adds the
rw_semaphore, describes the variables it protects, and uses it where
appropriate. net_mutex is still present, and the next patches will kick
it out step by step.
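
The intended split between shared and exclusive mode can be
illustrated with a toy, single-threaded model. This is not a real
lock and not the kernel's rw_semaphore; struct toy_rwsem and the
toy_*() helpers are hypothetical names. It only shows which
combinations of holders are allowed: several namespace setup/cleanup
paths in read mode at once, and a pernet_list change in write mode
only when nobody else is inside. Note that at this point in the
series net_mutex is still taken inside the read section, so the
paths remain serialized until later patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of reader/writer exclusion; not a real lock. */
struct toy_rwsem {
	int readers;	/* holders in shared (read) mode */
	bool writer;	/* true while held in exclusive (write) mode */
};

/* Shared mode: succeeds unless a writer holds the semaphore. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* Exclusive mode: succeeds only when nobody holds the semaphore. */
static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *sem)
{
	assert(sem->writer);
	sem->writer = false;
}
```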

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/linux/rtnetlink.h |    1 +
 net/core/net_namespace.c  |   39 ++++++++++++++++++++++++++-------------
 net/core/rtnetlink.c      |    4 ++--
 3 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 2032ce2eb20b..f640fc87fe1d 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -35,6 +35,7 @@ extern int rtnl_is_locked(void);
 
 extern wait_queue_head_t netdev_unregistering_wq;
 extern struct mutex net_mutex;
+extern struct rw_semaphore net_sem;
 
 #ifdef CONFIG_PROVE_LOCKING
 extern bool lockdep_rtnl_is_held(void);
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 2e512965bf42..859dce31e37e 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -41,6 +41,11 @@ struct net init_net = {
 EXPORT_SYMBOL(init_net);
 
 static bool init_net_initialized;
+/*
+ * net_sem: protects: pernet_list, net_generic_ids,
+ * init_net_initialized and first_device pointer.
+ */
+DECLARE_RWSEM(net_sem);
 
 #define MIN_PERNET_OPS_ID	\
 	((sizeof(struct net_generic) + sizeof(void *) - 1) / sizeof(void *))
@@ -279,7 +284,7 @@ struct net *get_net_ns_by_id(struct net *net, int id)
  */
 static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 {
-	/* Must be called with net_mutex held */
+	/* Must be called with net_sem held */
 	const struct pernet_operations *ops, *saved_ops;
 	int error = 0;
 	LIST_HEAD(net_exit_list);
@@ -411,12 +416,16 @@ struct net *copy_net_ns(unsigned long flags,
 	net->ucounts = ucounts;
 	get_user_ns(user_ns);
 
-	rv = mutex_lock_killable(&net_mutex);
+	rv = down_read_killable(&net_sem);
 	if (rv < 0)
 		goto put_userns;
-
+	rv = mutex_lock_killable(&net_mutex);
+	if (rv < 0)
+		goto up_read;
 	rv = setup_net(net, user_ns);
 	mutex_unlock(&net_mutex);
+up_read:
+	up_read(&net_sem);
 	if (rv < 0) {
 put_userns:
 		put_user_ns(user_ns);
@@ -443,6 +452,7 @@ static void cleanup_net(struct work_struct *work)
 	list_replace_init(&cleanup_list, &net_kill_list);
 	spin_unlock_irq(&cleanup_list_lock);
 
+	down_read(&net_sem);
 	mutex_lock(&net_mutex);
 
 	/* Don't let anyone else find us. */
@@ -484,6 +494,7 @@ static void cleanup_net(struct work_struct *work)
 		ops_free_list(ops, &net_exit_list);
 
 	mutex_unlock(&net_mutex);
+	up_read(&net_sem);
 
 	/* Ensure there are no outstanding rcu callbacks using this
 	 * network namespace.
@@ -510,8 +521,10 @@ static void cleanup_net(struct work_struct *work)
  */
 void net_ns_barrier(void)
 {
+	down_write(&net_sem);
 	mutex_lock(&net_mutex);
 	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 }
 EXPORT_SYMBOL(net_ns_barrier);
 
@@ -838,12 +851,12 @@ static int __init net_ns_init(void)
 
 	rcu_assign_pointer(init_net.gen, ng);
 
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	if (setup_net(&init_net, &init_user_ns))
 		panic("Could not setup the initial network namespace");
 
 	init_net_initialized = true;
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 
 	register_pernet_subsys(&net_ns_ops);
 
@@ -983,9 +996,9 @@ static void unregister_pernet_operations(struct pernet_operations *ops)
 int register_pernet_subsys(struct pernet_operations *ops)
 {
 	int error;
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	error =  register_pernet_operations(first_device, ops);
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 	return error;
 }
 EXPORT_SYMBOL_GPL(register_pernet_subsys);
@@ -1001,9 +1014,9 @@ EXPORT_SYMBOL_GPL(register_pernet_subsys);
  */
 void unregister_pernet_subsys(struct pernet_operations *ops)
 {
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	unregister_pernet_operations(ops);
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 }
 EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
 
@@ -1029,11 +1042,11 @@ EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
 int register_pernet_device(struct pernet_operations *ops)
 {
 	int error;
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	error = register_pernet_operations(&pernet_list, ops);
 	if (!error && (first_device == &pernet_list))
 		first_device = &ops->list;
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 	return error;
 }
 EXPORT_SYMBOL_GPL(register_pernet_device);
@@ -1049,11 +1062,11 @@ EXPORT_SYMBOL_GPL(register_pernet_device);
  */
 void unregister_pernet_device(struct pernet_operations *ops)
 {
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	if (&ops->list == first_device)
 		first_device = first_device->next;
 	unregister_pernet_operations(ops);
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 }
 EXPORT_SYMBOL_GPL(unregister_pernet_device);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index dabba2a91fc8..cb06d43c4230 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -390,11 +390,11 @@ static void rtnl_lock_unregistering_all(void)
 void rtnl_link_unregister(struct rtnl_link_ops *ops)
 {
 	/* Close the race with cleanup_net() */
-	mutex_lock(&net_mutex);
+	down_write(&net_sem);
 	rtnl_lock_unregistering_all();
 	__rtnl_link_unregister(ops);
 	rtnl_unlock();
-	mutex_unlock(&net_mutex);
+	up_write(&net_sem);
 }
 EXPORT_SYMBOL_GPL(rtnl_link_unregister);
 


* [PATCH v2 04/31] net: Move mutex_unlock() in cleanup_net() up
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (2 preceding siblings ...)
  2017-11-20 18:32 ` [PATCH v2 03/31] net: Introduce net_sem for protection of pernet_list Kirill Tkhai
@ 2017-11-20 18:32 ` Kirill Tkhai
  2017-11-20 18:32 ` [PATCH v2 05/31] net: Allow pernet_operations to be executed in parallel Kirill Tkhai
                   ` (28 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:32 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

net_sem protects pernet_list from changing, while
ops_free_list() does a simple kfree() and can't
race with other pernet_operations callbacks.

So we may release net_mutex earlier than before.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 859dce31e37e..c4f7452906bb 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -489,11 +489,12 @@ static void cleanup_net(struct work_struct *work)
 	list_for_each_entry_reverse(ops, &pernet_list, list)
 		ops_exit_list(ops, &net_exit_list);
 
+	mutex_unlock(&net_mutex);
+
 	/* Free the net generic variables */
 	list_for_each_entry_reverse(ops, &pernet_list, list)
 		ops_free_list(ops, &net_exit_list);
 
-	mutex_unlock(&net_mutex);
 	up_read(&net_sem);
 
 	/* Ensure there are no outstanding rcu callbacks using this


* [PATCH v2 05/31] net: Allow pernet_operations to be executed in parallel
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (3 preceding siblings ...)
  2017-11-20 18:32 ` [PATCH v2 04/31] net: Move mutex_unlock() in cleanup_net() up Kirill Tkhai
@ 2017-11-20 18:32 ` Kirill Tkhai
  2018-01-17 18:34   ` Andrei Vagin
  2017-11-20 18:33 ` [PATCH v2 06/31] net: Convert proc_net_ns_ops Kirill Tkhai
                   ` (27 subsequent siblings)
  32 siblings, 1 reply; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:32 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This adds a new pernet_operations::async flag to indicate operations
whose ->init(), ->exit() and ->exit_batch() methods are allowed
to be executed in parallel with the methods of any other pernet_operations.

When there are only asynchronous pernet_operations in the system,
net_mutex won't be taken for net construction and destruction.

Also, remove BUG_ON(!mutex_is_locked(&net_mutex)) from
net_assign_generic() without replacing it with the equivalent net_sem
check, as there is one more lockdep assert below.
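
The bookkeeping behind "only asynchronous pernet_operations" can be
sketched in userspace C. This is a simplified model, not the patch
itself: struct toy_pernet_ops, toy_nr_sync_ops and the toy_*()
helpers are hypothetical stand-ins for struct pernet_operations,
nr_sync_pernet_ops and the (un)register paths, and no locking is
modelled here (in the patch the counter is changed under
down_write(&net_sem) and read under down_read(&net_sem)).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pernet_operations: only the flag. */
struct toy_pernet_ops {
	bool async;
};

/* Mirrors nr_sync_pernet_ops: number of registered non-async ops. */
static unsigned int toy_nr_sync_ops;

static void toy_register(const struct toy_pernet_ops *ops)
{
	if (!ops->async)
		toy_nr_sync_ops++;
}

static void toy_unregister(const struct toy_pernet_ops *ops)
{
	if (!ops->async) {
		assert(toy_nr_sync_ops > 0);	/* analogue of the BUG_ON() */
		toy_nr_sync_ops--;
	}
}

/* True when net setup/cleanup would still have to take net_mutex. */
static bool toy_need_net_mutex(void)
{
	return toy_nr_sync_ops != 0;
}
```

A single non-async registration is enough to bring the mutex back
for every net construction and destruction, which is why the
remaining patches convert the operations one by one.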

Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/net/net_namespace.h |    6 ++++++
 net/core/net_namespace.c    |   29 +++++++++++++++++++----------
 2 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 10f99dafd5ac..db978c4755f7 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -303,6 +303,12 @@ struct pernet_operations {
 	void (*exit_batch)(struct list_head *net_exit_list);
 	unsigned int *id;
 	size_t size;
+	/*
+	 * Indicates above methods are allowed to be executed in parallel
+	 * with methods of any other pernet_operations, i.e. they do not
+	 * need synchronization via net_mutex.
+	 */
+	bool async;
 };
 
 /*
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index c4f7452906bb..550c766f73aa 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -41,8 +41,9 @@ struct net init_net = {
 EXPORT_SYMBOL(init_net);
 
 static bool init_net_initialized;
+static unsigned nr_sync_pernet_ops;
 /*
- * net_sem: protects: pernet_list, net_generic_ids,
+ * net_sem: protects: pernet_list, net_generic_ids, nr_sync_pernet_ops,
  * init_net_initialized and first_device pointer.
  */
 DECLARE_RWSEM(net_sem);
@@ -70,11 +71,10 @@ static int net_assign_generic(struct net *net, unsigned int id, void *data)
 {
 	struct net_generic *ng, *old_ng;
 
-	BUG_ON(!mutex_is_locked(&net_mutex));
 	BUG_ON(id < MIN_PERNET_OPS_ID);
 
 	old_ng = rcu_dereference_protected(net->gen,
-					   lockdep_is_held(&net_mutex));
+					   lockdep_is_held(&net_sem));
 	if (old_ng->s.len > id) {
 		old_ng->ptr[id] = data;
 		return 0;
@@ -419,11 +419,14 @@ struct net *copy_net_ns(unsigned long flags,
 	rv = down_read_killable(&net_sem);
 	if (rv < 0)
 		goto put_userns;
-	rv = mutex_lock_killable(&net_mutex);
-	if (rv < 0)
-		goto up_read;
+	if (nr_sync_pernet_ops) {
+		rv = mutex_lock_killable(&net_mutex);
+		if (rv < 0)
+			goto up_read;
+	}
 	rv = setup_net(net, user_ns);
-	mutex_unlock(&net_mutex);
+	if (nr_sync_pernet_ops)
+		mutex_unlock(&net_mutex);
 up_read:
 	up_read(&net_sem);
 	if (rv < 0) {
@@ -453,7 +456,8 @@ static void cleanup_net(struct work_struct *work)
 	spin_unlock_irq(&cleanup_list_lock);
 
 	down_read(&net_sem);
-	mutex_lock(&net_mutex);
+	if (nr_sync_pernet_ops)
+		mutex_lock(&net_mutex);
 
 	/* Don't let anyone else find us. */
 	rtnl_lock();
@@ -489,7 +493,8 @@ static void cleanup_net(struct work_struct *work)
 	list_for_each_entry_reverse(ops, &pernet_list, list)
 		ops_exit_list(ops, &net_exit_list);
 
-	mutex_unlock(&net_mutex);
+	if (nr_sync_pernet_ops)
+		mutex_unlock(&net_mutex);
 
 	/* Free the net generic variables */
 	list_for_each_entry_reverse(ops, &pernet_list, list)
@@ -961,6 +966,9 @@ static int register_pernet_operations(struct list_head *list,
 		rcu_barrier();
 		if (ops->id)
 			ida_remove(&net_generic_ids, *ops->id);
+	} else if (!ops->async) {
+		pr_info_once("Pernet operations %ps are sync.\n", ops);
+		nr_sync_pernet_ops++;
 	}
 
 	return error;
@@ -968,7 +976,8 @@ static int register_pernet_operations(struct list_head *list,
 
 static void unregister_pernet_operations(struct pernet_operations *ops)
 {
-	
+	if (!ops->async)
+		BUG_ON(nr_sync_pernet_ops-- == 0);
 	__unregister_pernet_operations(ops);
 	rcu_barrier();
 	if (ops->id)


* [PATCH v2 06/31] net: Convert proc_net_ns_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (4 preceding siblings ...)
  2017-11-20 18:32 ` [PATCH v2 05/31] net: Allow pernet_operations to be executed in parallel Kirill Tkhai
@ 2017-11-20 18:33 ` Kirill Tkhai
  2017-11-20 18:33 ` [PATCH v2 07/31] net: Convert net_ns_ops methods Kirill Tkhai
                   ` (26 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:33 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys registered
before initcalls.

proc_net_ns_ops::proc_net_ns_init()/proc_net_ns_exit()
register (and unregister) the pernet net->proc_net and
->proc_net_stat.

Constructors and destructors of other pernet_operations
are not interested in a foreign net's proc_net and proc_net_stat.
Proc filesystem primitives are synchronized on proc_subdir_lock.

So, proc_net_ns_ops methods can be executed
in parallel with methods of other pernet operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/proc/proc_net.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index a2bf369c923d..2bf6170204b1 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -237,6 +237,7 @@ static __net_exit void proc_net_ns_exit(struct net *net)
 static struct pernet_operations __net_initdata proc_net_ns_ops = {
 	.init = proc_net_ns_init,
 	.exit = proc_net_ns_exit,
+	.async = true,
 };
 
 int __init proc_net_init(void)


* [PATCH v2 07/31] net: Convert net_ns_ops methods
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (5 preceding siblings ...)
  2017-11-20 18:33 ` [PATCH v2 06/31] net: Convert proc_net_ns_ops Kirill Tkhai
@ 2017-11-20 18:33 ` Kirill Tkhai
  2017-11-20 18:33 ` [PATCH v2 08/31] net: Convert sysctl_pernet_ops Kirill Tkhai
                   ` (25 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:33 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys registered
from pure initcalls.

The net_ns_ops::net_ns_net_init/net_ns_net_exit methods use only
ida_simple_* functions, which do not need synchronization.

So, net_ns_ops methods can be executed
in parallel with methods of other pernet operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 550c766f73aa..757765d62daf 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -615,6 +615,7 @@ static __net_exit void net_ns_net_exit(struct net *net)
 static struct pernet_operations __net_initdata net_ns_ops = {
 	.init = net_ns_net_init,
 	.exit = net_ns_net_exit,
+	.async = true,
 };
 
 static const struct nla_policy rtnl_net_policy[NETNSA_MAX + 1] = {


* [PATCH v2 08/31] net: Convert sysctl_pernet_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (6 preceding siblings ...)
  2017-11-20 18:33 ` [PATCH v2 07/31] net: Convert net_ns_ops methods Kirill Tkhai
@ 2017-11-20 18:33 ` Kirill Tkhai
  2017-11-20 18:33 ` [PATCH v2 09/31] net: Convert netfilter_net_ops Kirill Tkhai
                   ` (24 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:33 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys registered
from core initcalls.

Methods sysctl_net_init() and sysctl_net_exit() initialize
the net::sysctls table of a namespace.

pernet_operations::init()/exit() methods from the rest
of the list do not touch the net::sysctls of other namespaces,
so it's safe to execute sysctl_pernet_ops's methods
in parallel with any other pernet_operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/sysctl_net.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index 9aed6fe1bf1a..f424539829b7 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -89,6 +89,7 @@ static void __net_exit sysctl_net_exit(struct net *net)
 static struct pernet_operations sysctl_pernet_ops = {
 	.init = sysctl_net_init,
 	.exit = sysctl_net_exit,
+	.async = true,
 };
 
 static struct ctl_table_header *net_header;


* [PATCH v2 09/31] net: Convert netfilter_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (7 preceding siblings ...)
  2017-11-20 18:33 ` [PATCH v2 08/31] net: Convert sysctl_pernet_ops Kirill Tkhai
@ 2017-11-20 18:33 ` Kirill Tkhai
  2017-11-20 18:33 ` [PATCH v2 10/31] net: Convert nf_log_net_ops Kirill Tkhai
                   ` (23 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:33 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

Methods netfilter_net_init() and netfilter_net_exit()
initialize net::nf::hooks and change the net-related proc
directory of net. Other pernet_operations are not
interested in a foreign net::nf::hooks or proc entries,
so it's safe to execute them in parallel with
methods of other pernet operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/netfilter/core.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 52cd2901a097..bfe2e44244ee 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -600,6 +600,7 @@ static void __net_exit netfilter_net_exit(struct net *net)
 static struct pernet_operations netfilter_net_ops = {
 	.init = netfilter_net_init,
 	.exit = netfilter_net_exit,
+	.async = true,
 };
 
 int __init netfilter_init(void)


* [PATCH v2 10/31] net: Convert nf_log_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (8 preceding siblings ...)
  2017-11-20 18:33 ` [PATCH v2 09/31] net: Convert netfilter_net_ops Kirill Tkhai
@ 2017-11-20 18:33 ` Kirill Tkhai
  2017-11-20 18:33 ` [PATCH v2 11/31] net: Convert net_inuse_ops Kirill Tkhai
                   ` (22 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:33 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations would have a problem with parallel
execution if init_net could be released. But it can't be,
and the rest is safe. There is a memory allocation, which
nobody else is interested in, and sysctl registration.
So, we make it async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/netfilter/nf_log.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index 8bb152a7cca4..6137fb1bce66 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -578,6 +578,7 @@ static void __net_exit nf_log_net_exit(struct net *net)
 static struct pernet_operations nf_log_net_ops = {
 	.init = nf_log_net_init,
 	.exit = nf_log_net_exit,
+	.async = true,
 };
 
 int __init netfilter_log_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 11/31] net: Convert net_inuse_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (9 preceding siblings ...)
  2017-11-20 18:33 ` [PATCH v2 10/31] net: Convert nf_log_net_ops Kirill Tkhai
@ 2017-11-20 18:33 ` Kirill Tkhai
  2017-11-20 18:34 ` [PATCH v2 12/31] net: Convert net_defaults_ops Kirill Tkhai
                   ` (21 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:33 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

net_inuse_ops methods expose statistics in /proc.
None of the other entries in the pernet_subsys or
pernet_device lists touches net::core::inuse.

So, it's safe to make net_inuse_ops async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/sock.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/sock.c b/net/core/sock.c
index c0b5b2f17412..f04f5ec87d04 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3075,6 +3075,7 @@ static void __net_exit sock_inuse_exit_net(struct net *net)
 static struct pernet_operations net_inuse_ops = {
 	.init = sock_inuse_init_net,
 	.exit = sock_inuse_exit_net,
+	.async = true,
 };
 
 static __init int net_inuse_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 12/31] net: Convert net_defaults_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (10 preceding siblings ...)
  2017-11-20 18:33 ` [PATCH v2 11/31] net: Convert net_inuse_ops Kirill Tkhai
@ 2017-11-20 18:34 ` Kirill Tkhai
  2017-11-20 18:34 ` [PATCH v2 13/31] net: Convert netlink_net_ops Kirill Tkhai
                   ` (20 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:34 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

net_defaults_ops provides only the net_defaults_init_net()
method, which acts on net::core::sysctl_somaxconn. The rest
of the pernet_subsys and pernet_device lists are not
interested in that field. So, make it async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 757765d62daf..c91b10731498 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -332,6 +332,7 @@ static int __net_init net_defaults_init_net(struct net *net)
 
 static struct pernet_operations net_defaults_ops = {
 	.init = net_defaults_init_net,
+	.async = true,
 };
 
 static __init int net_defaults_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 13/31] net: Convert netlink_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (11 preceding siblings ...)
  2017-11-20 18:34 ` [PATCH v2 12/31] net: Convert net_defaults_ops Kirill Tkhai
@ 2017-11-20 18:34 ` Kirill Tkhai
  2017-11-20 18:34 ` [PATCH v2 14/31] net: Convert rtnetlink_net_ops Kirill Tkhai
                   ` (19 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:34 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

The methods of netlink_net_ops create and destroy the "netlink"
file, which is of no interest to foreign pernet_operations.
So, netlink_net_ops may safely be made async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/netlink/af_netlink.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index b9e0ee4e22f5..1bb967bce57c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2687,6 +2687,7 @@ static void __init netlink_add_usersock_entry(void)
 static struct pernet_operations __net_initdata netlink_net_ops = {
 	.init = netlink_net_init,
 	.exit = netlink_net_exit,
+	.async = true,
 };
 
 static inline u32 netlink_hash(const void *data, u32 len, u32 seed)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 14/31] net: Convert rtnetlink_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (12 preceding siblings ...)
  2017-11-20 18:34 ` [PATCH v2 13/31] net: Convert netlink_net_ops Kirill Tkhai
@ 2017-11-20 18:34 ` Kirill Tkhai
  2017-11-20 18:34 ` [PATCH v2 15/31] net: Convert audit_net_ops Kirill Tkhai
                   ` (18 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:34 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

rtnetlink_net_init() and rtnetlink_net_exit()
create and destroy a netlink socket. It looks like
other pernet_operations are not interested in
a foreign net::rtnl, so rtnetlink_net_ops may be
safely made async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/rtnetlink.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index cb06d43c4230..fb3f58cf9351 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4494,6 +4494,7 @@ static void __net_exit rtnetlink_net_exit(struct net *net)
 static struct pernet_operations rtnetlink_net_ops = {
 	.init = rtnetlink_net_init,
 	.exit = rtnetlink_net_exit,
+	.async = true,
 };
 
 void __init rtnetlink_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 15/31] net: Convert audit_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (13 preceding siblings ...)
  2017-11-20 18:34 ` [PATCH v2 14/31] net: Convert rtnetlink_net_ops Kirill Tkhai
@ 2017-11-20 18:34 ` Kirill Tkhai
  2017-11-20 18:34 ` [PATCH v2 16/31] net: Convert uevent_net_ops Kirill Tkhai
                   ` (17 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:34 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys entries
registered from postcore initcalls.

audit_net_init() creates a netlink socket, while audit_net_exit()
destroys it. The rest of the pernet_list is not interested
in the socket, so we make audit_net_ops async.
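
For illustration only: audit_net_ops carries .id and .size, which
give each net its own private data area. A minimal userspace
sketch of why such per-net storage does not clash between
namespaces follows. It assumes the usual pattern of one zeroed
allocation of .size bytes per net, stored in a slot selected by
*id. All names here (struct net_model, struct pernet_priv_model,
model_ops_init(), model_ops_exit(), MODEL_MAX_IDS) are
hypothetical and are not kernel API.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_IDS 8	/* hypothetical fixed slot count */

/* Hypothetical stand-in for struct net: one private slot per id. */
struct net_model {
	void *gen[MODEL_MAX_IDS];
};

/* Hypothetical stand-in for the .id/.size pair of pernet_operations. */
struct pernet_priv_model {
	unsigned int *id;
	size_t size;
};

/* Allocate zeroed private data for this net only; 0 on success. */
static int model_ops_init(const struct pernet_priv_model *ops,
			  struct net_model *net)
{
	void *data;

	assert(*ops->id < MODEL_MAX_IDS);

	data = calloc(1, ops->size);
	if (!data)
		return -1;
	net->gen[*ops->id] = data;
	return 0;
}

/* Release the private data of this net; other nets are untouched. */
static void model_ops_exit(const struct pernet_priv_model *ops,
			   struct net_model *net)
{
	free(net->gen[*ops->id]);
	net->gen[*ops->id] = NULL;
}
```

In this model, init and exit for one net only ever read and write
that net's own slot, which is the property the commit message
relies on for the audit socket.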

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 kernel/audit.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/audit.c b/kernel/audit.c
index 227db99b0f19..5e49b614d0e6 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1526,6 +1526,7 @@ static struct pernet_operations audit_net_ops __net_initdata = {
 	.exit = audit_net_exit,
 	.id = &audit_net_id,
 	.size = sizeof(struct audit_net),
+	.async = true,
 };
 
 /* Initialize audit support at boot time. */

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 16/31] net: Convert uevent_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (14 preceding siblings ...)
  2017-11-20 18:34 ` [PATCH v2 15/31] net: Convert audit_net_ops Kirill Tkhai
@ 2017-11-20 18:34 ` Kirill Tkhai
  2017-11-20 18:34 ` [PATCH v2 17/31] net: Convert proto_net_ops Kirill Tkhai
                   ` (16 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:34 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

uevent_net_init() and uevent_net_exit() create and
destroy a netlink socket, and these actions are serialized
in the netlink code.

Parallel execution with other pernet_operations
makes the socket disappear from uevent_sock_list earlier
on ->exit. As userspace can't be interested in broadcast
messages of a dying net, and, as far as I can see, no one
in the kernel listens to them, we may safely make
uevent_net_ops async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 lib/kobject_uevent.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index c3e84edc47c9..4a2c39ae1e65 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -643,6 +643,7 @@ static void uevent_net_exit(struct net *net)
 static struct pernet_operations uevent_net_ops = {
 	.init	= uevent_net_init,
 	.exit	= uevent_net_exit,
+	.async  = true,
 };
 
 static int __init kobject_uevent_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 17/31] net: Convert proto_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (15 preceding siblings ...)
  2017-11-20 18:34 ` [PATCH v2 16/31] net: Convert uevent_net_ops Kirill Tkhai
@ 2017-11-20 18:34 ` Kirill Tkhai
  2017-11-20 18:35 ` [PATCH v2 18/31] net: Convert pernet_subsys ops, registered via net_dev_init() Kirill Tkhai
                   ` (15 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:34 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

This patch starts converting the pernet_subsys entries
registered from subsys initcalls.

proto_net_ops seems safe to execute in parallel with others,
as it only creates/destroys a proc entry,
which nobody else is interested in.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/sock.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/sock.c b/net/core/sock.c
index f04f5ec87d04..d9c3de4239e6 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3344,6 +3344,7 @@ static __net_exit void proto_exit_net(struct net *net)
 static __net_initdata struct pernet_operations proto_net_ops = {
 	.init = proto_init_net,
 	.exit = proto_exit_net,
+	.async = true,
 };
 
 static int __init proto_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 18/31] net: Convert pernet_subsys ops, registered via net_dev_init()
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (16 preceding siblings ...)
  2017-11-20 18:34 ` [PATCH v2 17/31] net: Convert proto_net_ops Kirill Tkhai
@ 2017-11-20 18:35 ` Kirill Tkhai
  2017-11-20 18:35 ` [PATCH v2 19/31] net: Convert fib_* pernet_operations, registered via subsys_initcall Kirill Tkhai
                   ` (14 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:35 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

There are:
1) dev_proc_ops and dev_mc_net_ops, which create and destroy
a pernet proc file and are not interested in other net namespaces;
2) netdev_net_ops, which creates a pernet hash that is not
touched by other pernet_operations.

So, make them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/dev.c        |    1 +
 net/core/net-procfs.c |    2 ++
 2 files changed, 3 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8ee29f4f5fa9..41a576a17430 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8656,6 +8656,7 @@ static void __net_exit netdev_exit(struct net *net)
 static struct pernet_operations __net_initdata netdev_net_ops = {
 	.init = netdev_init,
 	.exit = netdev_exit,
+	.async = true,
 };
 
 static void __net_exit default_device_exit(struct net *net)
diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
index 615ccab55f38..16b250dd50ed 100644
--- a/net/core/net-procfs.c
+++ b/net/core/net-procfs.c
@@ -352,6 +352,7 @@ static void __net_exit dev_proc_net_exit(struct net *net)
 static struct pernet_operations __net_initdata dev_proc_ops = {
 	.init = dev_proc_net_init,
 	.exit = dev_proc_net_exit,
+	.async = true,
 };
 
 static int dev_mc_seq_show(struct seq_file *seq, void *v)
@@ -409,6 +410,7 @@ static void __net_exit dev_mc_net_exit(struct net *net)
 static struct pernet_operations __net_initdata dev_mc_net_ops = {
 	.init = dev_mc_net_init,
 	.exit = dev_mc_net_exit,
+	.async = true,
 };
 
 int __init dev_proc_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 19/31] net: Convert fib_* pernet_operations, registered via subsys_initcall
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (17 preceding siblings ...)
  2017-11-20 18:35 ` [PATCH v2 18/31] net: Convert pernet_subsys ops, registered via net_dev_init() Kirill Tkhai
@ 2017-11-20 18:35 ` Kirill Tkhai
  2017-11-20 18:35 ` [PATCH v2 20/31] net: Convert subsys_initcall() registered pernet_operations from net/sched Kirill Tkhai
                   ` (13 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:35 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

fib_notifier_net_ops and fib_rules_net_ops both create and
initialize lists, which are not touched by foreign
pernet_operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/fib_notifier.c |    1 +
 net/core/fib_rules.c    |    1 +
 2 files changed, 2 insertions(+)

diff --git a/net/core/fib_notifier.c b/net/core/fib_notifier.c
index 0c048bdeb016..5ace0705a3f9 100644
--- a/net/core/fib_notifier.c
+++ b/net/core/fib_notifier.c
@@ -171,6 +171,7 @@ static void __net_exit fib_notifier_net_exit(struct net *net)
 static struct pernet_operations fib_notifier_net_ops = {
 	.init = fib_notifier_net_init,
 	.exit = fib_notifier_net_exit,
+	.async = true,
 };
 
 static int __init fib_notifier_init(void)
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 98e1066c3d55..cb071b8e8d17 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -1030,6 +1030,7 @@ static void __net_exit fib_rules_net_exit(struct net *net)
 static struct pernet_operations fib_rules_net_ops = {
 	.init = fib_rules_net_init,
 	.exit = fib_rules_net_exit,
+	.async = true,
 };
 
 static int __init fib_rules_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 20/31] net: Convert subsys_initcall() registered pernet_operations from net/sched
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (18 preceding siblings ...)
  2017-11-20 18:35 ` [PATCH v2 19/31] net: Convert fib_* pernet_operations, registered via subsys_initcall Kirill Tkhai
@ 2017-11-20 18:35 ` Kirill Tkhai
  2017-11-20 18:35 ` [PATCH v2 21/31] net: Convert genl_pernet_ops Kirill Tkhai
                   ` (12 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:35 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

psched_net_ops only creates and destroys a /proc entry,
and is safe to execute in parallel with any foreign
pernet_operations.

tcf_action_net_ops initializes and destructs tcf_action_net::egdev_ht,
which is not touched by foreign pernet_operations.

So, make them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/sched/act_api.c |    1 +
 net/sched/sch_api.c |    1 +
 2 files changed, 2 insertions(+)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 4d33a50a8a6d..41a26f551dbb 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -1464,6 +1464,7 @@ static struct pernet_operations tcf_action_net_ops = {
 	.exit = tcf_action_net_exit,
 	.id = &tcf_action_net_id,
 	.size = sizeof(struct tcf_action_net),
+	.async = true,
 };
 
 static int __init tc_action_init(void)
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index b6c4f536876b..09d63c83542a 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -2002,6 +2002,7 @@ static void __net_exit psched_net_exit(struct net *net)
 static struct pernet_operations psched_net_ops = {
 	.init = psched_net_init,
 	.exit = psched_net_exit,
+	.async = true,
 };
 
 static int __init pktsched_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 21/31] net: Convert genl_pernet_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (19 preceding siblings ...)
  2017-11-20 18:35 ` [PATCH v2 20/31] net: Convert subsys_initcall() registered pernet_operations from net/sched Kirill Tkhai
@ 2017-11-20 18:35 ` Kirill Tkhai
  2017-11-20 18:35 ` [PATCH v2 22/31] net: Convert wext_pernet_ops Kirill Tkhai
                   ` (11 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:35 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations create and destroy net::genl_sock.
Foreign pernet_operations don't touch it.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/netlink/genetlink.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index d444daf1ac04..a66fad4c5ffa 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -1035,6 +1035,7 @@ static void __net_exit genl_pernet_exit(struct net *net)
 static struct pernet_operations genl_pernet_ops = {
 	.init = genl_pernet_init,
 	.exit = genl_pernet_exit,
+	.async = true,
 };
 
 static int __init genl_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 22/31] net: Convert wext_pernet_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (20 preceding siblings ...)
  2017-11-20 18:35 ` [PATCH v2 21/31] net: Convert genl_pernet_ops Kirill Tkhai
@ 2017-11-20 18:35 ` Kirill Tkhai
  2017-11-20 18:35 ` [PATCH v2 23/31] net: Convert sysctl_core_ops Kirill Tkhai
                   ` (10 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:35 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations initialize and purge the
net::wext_nlevents queue, which is not touched by foreign
pernet_operations.

Mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/wireless/wext-core.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/wireless/wext-core.c b/net/wireless/wext-core.c
index 6cdb054484d6..32c9f1c303f9 100644
--- a/net/wireless/wext-core.c
+++ b/net/wireless/wext-core.c
@@ -390,6 +390,7 @@ static void __net_exit wext_pernet_exit(struct net *net)
 static struct pernet_operations wext_pernet_ops = {
 	.init = wext_pernet_init,
 	.exit = wext_pernet_exit,
+	.async = true,
 };
 
 static int __init wireless_nlevent_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 23/31] net: Convert sysctl_core_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (21 preceding siblings ...)
  2017-11-20 18:35 ` [PATCH v2 22/31] net: Convert wext_pernet_ops Kirill Tkhai
@ 2017-11-20 18:35 ` Kirill Tkhai
  2017-11-20 18:35 ` [PATCH v2 24/31] net: Convert pernet_subsys, registered from inet_init() Kirill Tkhai
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:35 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations register and destroy a sysctl
directory, which is of no interest to foreign
pernet_operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/sysctl_net_core.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index cbc3dde4cfcc..1f8c94d726da 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -520,6 +520,7 @@ static __net_exit void sysctl_core_net_exit(struct net *net)
 static __net_initdata struct pernet_operations sysctl_core_ops = {
 	.init = sysctl_core_net_init,
 	.exit = sysctl_core_net_exit,
+	.async = true,
 };
 
 static __init int sysctl_core_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 24/31] net: Convert pernet_subsys, registered from inet_init()
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (22 preceding siblings ...)
  2017-11-20 18:35 ` [PATCH v2 23/31] net: Convert sysctl_core_ops Kirill Tkhai
@ 2017-11-20 18:35 ` Kirill Tkhai
  2017-11-20 18:36 ` [PATCH v2 25/31] net: Convert unix_net_ops Kirill Tkhai
                   ` (8 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:35 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

arp_net_ops just adds/removes a /proc entry.

devinet_ops allocates and frees duplicate of init_net tables
and (un)registers sysctl entries.

fib_net_ops allocates and frees pernet tables, creates/destroys
netlink socket and (un)initializes /proc entries. Foreign
pernet_operations do not touch them.

ip_rt_proc_ops only modifies pernet /proc entries.

xfrm_net_ops creates/destroys /proc entries, allocates/frees
pernet statistics, hashes and tables, and (un)initializes
sysctl files. These are not touched by foreign pernet_operations.

xfrm4_net_ops allocates/frees private pernet memory, and
configures sysctls.

sysctl_route_ops creates/destroys sysctls.

rt_genid_ops only initializes fields of just allocated net.

ipv4_inetpeer_ops allocates/frees net private memory.

igmp_net_ops just creates/destroys /proc files and a socket
that no one else is interested in.

tcp_sk_ops seems to be safe, because tcp_sk_init() does not
depend on any other pernet_operations modifications. Iteration
over the hash table in inet_twsk_purge() is done under the RCU
lock, and it's safe to iterate the table this way. Removal from
the table happens in inet_twsk_deschedule_put(), but this
function is safe without any external locks, as it synchronizes
internally. There are many examples of it being used in different
contexts. So, it's safe to leave tcp_sk_exit_batch() unlocked.

tcp_net_metrics_ops is synchronized on tcp_metrics_lock and safe.

udplite4_net_ops only creates/destroys pernet /proc file.

icmp_sk_ops creates percpu sockets, not touched by foreign
pernet_operations.

ipmr_net_ops creates/destroys pernet fib tables, (un)registers
fib rules and /proc files. This seems to be safe to execute
in parallel with foreign pernet_operations.

af_inet_ops just sets up default parameters of newly created net.

ipv4_mib_ops creates and destroys pernet percpu statistics.

raw_net_ops, tcp4_net_ops, udp4_net_ops, ping_v4_net_ops
and ip_proc_ops only create/destroy pernet /proc files.

ip4_frags_ops creates and destroys sysctl file.

So, it's safe to make the pernet_operations async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/ipv4/af_inet.c      |    2 ++
 net/ipv4/arp.c          |    1 +
 net/ipv4/devinet.c      |    1 +
 net/ipv4/fib_frontend.c |    1 +
 net/ipv4/icmp.c         |    1 +
 net/ipv4/igmp.c         |    1 +
 net/ipv4/ip_fragment.c  |    1 +
 net/ipv4/ipmr.c         |    1 +
 net/ipv4/ping.c         |    1 +
 net/ipv4/proc.c         |    1 +
 net/ipv4/raw.c          |    1 +
 net/ipv4/route.c        |    4 ++++
 net/ipv4/tcp_ipv4.c     |    2 ++
 net/ipv4/tcp_metrics.c  |    1 +
 net/ipv4/udp.c          |    1 +
 net/ipv4/udplite.c      |    1 +
 net/ipv4/xfrm4_policy.c |    1 +
 net/xfrm/xfrm_policy.c  |    1 +
 18 files changed, 23 insertions(+)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index ce4aa827be05..d1a2e9afbb50 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1697,6 +1697,7 @@ static __net_exit void ipv4_mib_exit_net(struct net *net)
 static __net_initdata struct pernet_operations ipv4_mib_ops = {
 	.init = ipv4_mib_init_net,
 	.exit = ipv4_mib_exit_net,
+	.async = true,
 };
 
 static int __init init_ipv4_mibs(void)
@@ -1750,6 +1751,7 @@ static __net_exit void inet_exit_net(struct net *net)
 static __net_initdata struct pernet_operations af_inet_ops = {
 	.init = inet_init_net,
 	.exit = inet_exit_net,
+	.async = true,
 };
 
 static int __init init_inet_pernet_ops(void)
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index a8d7c5a9fb05..19bcd10a928b 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -1443,6 +1443,7 @@ static void __net_exit arp_net_exit(struct net *net)
 static struct pernet_operations arp_net_ops = {
 	.init = arp_net_init,
 	.exit = arp_net_exit,
+	.async = true,
 };
 
 static int __init arp_proc_init(void)
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index a4573bccd6da..c359bda18ff5 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2474,6 +2474,7 @@ static __net_exit void devinet_exit_net(struct net *net)
 static __net_initdata struct pernet_operations devinet_ops = {
 	.init = devinet_init_net,
 	.exit = devinet_exit_net,
+	.async = true,
 };
 
 static struct rtnl_af_ops inet_af_ops __read_mostly = {
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index f52d27a422c3..6eb4aa5ee66f 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1361,6 +1361,7 @@ static void __net_exit fib_net_exit(struct net *net)
 static struct pernet_operations fib_net_ops = {
 	.init = fib_net_init,
 	.exit = fib_net_exit,
+	.async = true,
 };
 
 void __init ip_fib_init(void)
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 1617604c9284..cc56efa64d5c 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -1257,6 +1257,7 @@ static int __net_init icmp_sk_init(struct net *net)
 static struct pernet_operations __net_initdata icmp_sk_ops = {
        .init = icmp_sk_init,
        .exit = icmp_sk_exit,
+       .async = true,
 };
 
 int __init icmp_init(void)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index ab183af0b5b6..ee63dae1f48e 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -3004,6 +3004,7 @@ static void __net_exit igmp_net_exit(struct net *net)
 static struct pernet_operations igmp_net_ops = {
 	.init = igmp_net_init,
 	.exit = igmp_net_exit,
+	.async = true,
 };
 #endif
 
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index bbf1b94942c0..5e843ae5e468 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -885,6 +885,7 @@ static void __net_exit ipv4_frags_exit_net(struct net *net)
 static struct pernet_operations ip4_frags_ops = {
 	.init = ipv4_frags_init_net,
 	.exit = ipv4_frags_exit_net,
+	.async = true,
 };
 
 void __init ipfrag_init(void)
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 40a43ad294cb..64299e2dd6a3 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -3330,6 +3330,7 @@ static void __net_exit ipmr_net_exit(struct net *net)
 static struct pernet_operations ipmr_net_ops = {
 	.init = ipmr_net_init,
 	.exit = ipmr_net_exit,
+	.async = true,
 };
 
 int __init ip_mr_init(void)
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index b8f0db54b197..0164def9c808 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -1204,6 +1204,7 @@ static void __net_exit ping_v4_proc_exit_net(struct net *net)
 static struct pernet_operations ping_v4_net_ops = {
 	.init = ping_v4_proc_init_net,
 	.exit = ping_v4_proc_exit_net,
+	.async = true,
 };
 
 int __init ping_proc_init(void)
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 9f37c4727861..4fa547d896ec 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -552,6 +552,7 @@ static __net_exit void ip_proc_exit_net(struct net *net)
 static __net_initdata struct pernet_operations ip_proc_ops = {
 	.init = ip_proc_init_net,
 	.exit = ip_proc_exit_net,
+	.async = true,
 };
 
 int __init ip_misc_proc_init(void)
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 33b70bfd1122..b02e58125033 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -1135,6 +1135,7 @@ static __net_exit void raw_exit_net(struct net *net)
 static __net_initdata struct pernet_operations raw_net_ops = {
 	.init = raw_init_net,
 	.exit = raw_exit_net,
+	.async = true,
 };
 
 int __init raw_proc_init(void)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 43b69af242e1..b5a173aae851 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -420,6 +420,7 @@ static void __net_exit ip_rt_do_proc_exit(struct net *net)
 static struct pernet_operations ip_rt_proc_ops __net_initdata =  {
 	.init = ip_rt_do_proc_init,
 	.exit = ip_rt_do_proc_exit,
+	.async = true,
 };
 
 static int __init ip_rt_proc_init(void)
@@ -2996,6 +2997,7 @@ static __net_exit void sysctl_route_net_exit(struct net *net)
 static __net_initdata struct pernet_operations sysctl_route_ops = {
 	.init = sysctl_route_net_init,
 	.exit = sysctl_route_net_exit,
+	.async = true,
 };
 #endif
 
@@ -3009,6 +3011,7 @@ static __net_init int rt_genid_init(struct net *net)
 
 static __net_initdata struct pernet_operations rt_genid_ops = {
 	.init = rt_genid_init,
+	.async = true,
 };
 
 static int __net_init ipv4_inetpeer_init(struct net *net)
@@ -3034,6 +3037,7 @@ static void __net_exit ipv4_inetpeer_exit(struct net *net)
 static __net_initdata struct pernet_operations ipv4_inetpeer_ops = {
 	.init	=	ipv4_inetpeer_init,
 	.exit	=	ipv4_inetpeer_exit,
+	.async	=	true,
 };
 
 #ifdef CONFIG_IP_ROUTE_CLASSID
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index c6bc0c4d19c6..36f5434365f8 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2366,6 +2366,7 @@ static void __net_exit tcp4_proc_exit_net(struct net *net)
 static struct pernet_operations tcp4_net_ops = {
 	.init = tcp4_proc_init_net,
 	.exit = tcp4_proc_exit_net,
+	.async = true,
 };
 
 int __init tcp4_proc_init(void)
@@ -2552,6 +2553,7 @@ static struct pernet_operations __net_initdata tcp_sk_ops = {
        .init	   = tcp_sk_init,
        .exit	   = tcp_sk_exit,
        .exit_batch = tcp_sk_exit_batch,
+       .async	   = true,
 };
 
 void __init tcp_v4_init(void)
diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
index 7097f92d16e5..01a6926313ff 100644
--- a/net/ipv4/tcp_metrics.c
+++ b/net/ipv4/tcp_metrics.c
@@ -1027,6 +1027,7 @@ static void __net_exit tcp_net_metrics_exit_batch(struct list_head *net_exit_lis
 static __net_initdata struct pernet_operations tcp_net_metrics_ops = {
 	.init		=	tcp_net_metrics_init,
 	.exit_batch	=	tcp_net_metrics_exit_batch,
+	.async		=	true,
 };
 
 void __init tcp_metrics_init(void)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e4ff25c947c5..8e8d9ffb7e8d 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2767,6 +2767,7 @@ static void __net_exit udp4_proc_exit_net(struct net *net)
 static struct pernet_operations udp4_net_ops = {
 	.init = udp4_proc_init_net,
 	.exit = udp4_proc_exit_net,
+	.async = true,
 };
 
 int __init udp4_proc_init(void)
diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
index 59f10fe9782e..407aaab97383 100644
--- a/net/ipv4/udplite.c
+++ b/net/ipv4/udplite.c
@@ -105,6 +105,7 @@ static void __net_exit udplite4_proc_exit_net(struct net *net)
 static struct pernet_operations udplite4_net_ops = {
 	.init = udplite4_proc_init_net,
 	.exit = udplite4_proc_exit_net,
+	.async = true,
 };
 
 static __init int udplite4_proc_init(void)
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index 05017e2c849c..753f526cf9db 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -365,6 +365,7 @@ static void __net_exit xfrm4_net_exit(struct net *net)
 static struct pernet_operations __net_initdata xfrm4_net_ops = {
 	.init	= xfrm4_net_init,
 	.exit	= xfrm4_net_exit,
+	.async	= true,
 };
 
 static void __init xfrm4_policy_init(void)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 9542975eb2f9..f5185538a3e9 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2961,6 +2961,7 @@ static void __net_exit xfrm_net_exit(struct net *net)
 static struct pernet_operations __net_initdata xfrm_net_ops = {
 	.init = xfrm_net_init,
 	.exit = xfrm_net_exit,
+	.async = true,
 };
 
 void __init xfrm_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 25/31] net: Convert unix_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (23 preceding siblings ...)
  2017-11-20 18:35 ` [PATCH v2 24/31] net: Convert pernet_subsys, registered from inet_init() Kirill Tkhai
@ 2017-11-20 18:36 ` Kirill Tkhai
  2017-11-20 18:36 ` [PATCH v2 26/31] net: Convert packet_net_ops Kirill Tkhai
                   ` (7 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:36 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations just create and destroy
/proc and sysctl entries, which are not touched by
foreign pernet_operations.

So, we are able to make them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/unix/af_unix.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index a9ee634f3c42..1ddf77260849 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2913,6 +2913,7 @@ static void __net_exit unix_net_exit(struct net *net)
 static struct pernet_operations unix_net_ops = {
 	.init = unix_net_init,
 	.exit = unix_net_exit,
+	.async = true,
 };
 
 static int __init af_unix_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 26/31] net: Convert packet_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (24 preceding siblings ...)
  2017-11-20 18:36 ` [PATCH v2 25/31] net: Convert unix_net_ops Kirill Tkhai
@ 2017-11-20 18:36 ` Kirill Tkhai
  2017-11-20 18:36 ` [PATCH v2 27/31] net: Convert ipv4_sysctl_ops Kirill Tkhai
                   ` (6 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:36 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations just create and destroy a /proc entry,
and other operations do not touch it.

Also, nobody else is interested in a foreign net::packet::sklist.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/packet/af_packet.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 737092ca9b4e..700cdf36767b 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4566,6 +4566,7 @@ static void __net_exit packet_net_exit(struct net *net)
 static struct pernet_operations packet_net_ops = {
 	.init = packet_net_init,
 	.exit = packet_net_exit,
+	.async = true,
 };
 
 

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 27/31] net: Convert ipv4_sysctl_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (25 preceding siblings ...)
  2017-11-20 18:36 ` [PATCH v2 26/31] net: Convert packet_net_ops Kirill Tkhai
@ 2017-11-20 18:36 ` Kirill Tkhai
  2017-11-20 18:36 ` [PATCH v2 28/31] net: Convert addrconf_ops Kirill Tkhai
                   ` (5 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:36 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations create and destroy sysctl entries,
which are not touched by anybody else.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/ipv4/sysctl_net_ipv4.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 93e172118a94..89683d868b37 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1219,6 +1219,7 @@ static __net_exit void ipv4_sysctl_exit_net(struct net *net)
 static __net_initdata struct pernet_operations ipv4_sysctl_ops = {
 	.init = ipv4_sysctl_init_net,
 	.exit = ipv4_sysctl_exit_net,
+	.async = true,
 };
 
 static __init int sysctl_ipv4_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 28/31] net: Convert addrconf_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (26 preceding siblings ...)
  2017-11-20 18:36 ` [PATCH v2 27/31] net: Convert ipv4_sysctl_ops Kirill Tkhai
@ 2017-11-20 18:36 ` Kirill Tkhai
  2017-11-20 18:36 ` [PATCH v2 29/31] net: Convert loopback_net_ops Kirill Tkhai
                   ` (4 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:36 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations (un)register sysctl entries, which
are not touched by anybody else.

So, it's safe to make them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/ipv6/addrconf.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index a0ae1c9d37df..fb7cf120daa7 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6523,6 +6523,7 @@ static void __net_exit addrconf_exit_net(struct net *net)
 static struct pernet_operations addrconf_ops = {
 	.init = addrconf_init_net,
 	.exit = addrconf_exit_net,
+	.async = true,
 };
 
 static struct rtnl_af_ops inet6_ops __read_mostly = {

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 29/31] net: Convert loopback_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (27 preceding siblings ...)
  2017-11-20 18:36 ` [PATCH v2 28/31] net: Convert addrconf_ops Kirill Tkhai
@ 2017-11-20 18:36 ` Kirill Tkhai
  2017-11-20 18:36 ` [PATCH v2 30/31] net: Convert default_device_ops Kirill Tkhai
                   ` (3 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:36 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations have only an init() method. It allocates
memory for a net_device, calls register_netdev() and assigns
net::loopback_dev.

register_netdev() may be used without additional locks,
as it is synchronized on rtnl_lock(). There are many examples
of this function being called directly from ioctl().

The only difference, compared to ioctl(), is that the net is not
completely alive at this moment. But it looks like there is
no way for parallel pernet_operations to dereference
the net_device, as most of the struct net_device lists
it is linked to are related to the net, and the net is not linked yet.

The exceptions are net_device::unreg_list, close_list, todo_list,
used for unregistration, and ::link_watch_list, where the net_device
may be linked to global lists.

Unregistration of loopback_dev obviously can't happen while
loopback_net_init() is executing, as the net is alive. It occurs
in default_device_ops, which currently requires net_mutex,
and it behaves as a barrier at the moment. It will be considered
in the next patch.

Speaking about link_watch_list, it seems there is no way
for loopback_dev to be linked in lweventlist at registration time
and become available to other pernet_operations.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 drivers/net/loopback.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 30612497643c..b97a907ea5aa 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -230,4 +230,5 @@ static __net_init int loopback_net_init(struct net *net)
 /* Registered in net/core/dev.c */
 struct pernet_operations __net_initdata loopback_net_ops = {
 	.init = loopback_net_init,
+	.async = true,
 };

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 30/31] net: Convert default_device_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (28 preceding siblings ...)
  2017-11-20 18:36 ` [PATCH v2 29/31] net: Convert loopback_net_ops Kirill Tkhai
@ 2017-11-20 18:36 ` Kirill Tkhai
  2017-11-20 18:37 ` [PATCH v2 31/31] net: Convert diag_net_ops Kirill Tkhai
                   ` (2 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:36 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations consist of exit() and exit_batch() methods.

default_device_exit() moves non-local and virtual devices to init_net.
There is nothing special about this, because it may happen at any time
on a working system, and rtnl_lock() and synchronize_net() protect
us from all cases of external dereference.

The same goes for default_device_exit_batch(). Similar unregistration
may happen at any time on a system. There are several lists (like todo_list)
involved, which are accessed under rtnl_lock(). After rtnl_unlock() and
netdev_run_todo() all the devices are flushed.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/dev.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 41a576a17430..914fdb260aae 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8757,6 +8757,7 @@ static void __net_exit default_device_exit_batch(struct list_head *net_list)
 static struct pernet_operations __net_initdata default_device_ops = {
 	.exit = default_device_exit,
 	.exit_batch = default_device_exit_batch,
+	.async = true,
 };
 
 /*

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 31/31] net: Convert diag_net_ops
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (29 preceding siblings ...)
  2017-11-20 18:36 ` [PATCH v2 30/31] net: Convert default_device_ops Kirill Tkhai
@ 2017-11-20 18:37 ` Kirill Tkhai
  2017-12-04 15:54 ` [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
  2018-01-18 17:43 ` Andrei Vagin
  32 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-11-20 18:37 UTC (permalink / raw)
  To: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ktkhai, ebiederm, avagin,
	gorcunov, eric.dumazet, stephen, ktkhai

These pernet_operations just create and destroy a netlink
socket. The socket is pernet, and other operations don't
touch it.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/sock_diag.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/sock_diag.c b/net/core/sock_diag.c
index 217f4e3b82f6..220130aee51d 100644
--- a/net/core/sock_diag.c
+++ b/net/core/sock_diag.c
@@ -328,6 +328,7 @@ static void __net_exit diag_net_exit(struct net *net)
 static struct pernet_operations diag_net_ops = {
 	.init = diag_net_init,
 	.exit = diag_net_exit,
+	.async = true,
 };
 
 static int __init sock_diag_init(void)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/31] Replacing net_mutex with rw_semaphore
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (30 preceding siblings ...)
  2017-11-20 18:37 ` [PATCH v2 31/31] net: Convert diag_net_ops Kirill Tkhai
@ 2017-12-04 15:54 ` Kirill Tkhai
  2017-12-04 16:10   ` David Miller
  2018-01-18 17:43 ` Andrei Vagin
  32 siblings, 1 reply; 41+ messages in thread
From: Kirill Tkhai @ 2017-12-04 15:54 UTC (permalink / raw)
  To: davem, ebiederm
  Cc: vyasevic, kstewart, pombredanne, vyasevich, mark.rutland, gregkh,
	adobriyan, fw, nicolas.dichtel, xiyou.wangcong, roman.kapl, paul,
	dsahern, daniel, lucien.xin, mschiffer, rshearma, linux-kernel,
	netdev, avagin, gorcunov, eric.dumazet, stephen

Still no comments :(

Ping, ping, ping.

Here is the second version of the big patch set, with Eric's comments taken into account.

What will we do next?

On 20.11.2017 21:32, Kirill Tkhai wrote:
> Hi,
> 
> this is the second version of the patchset introducing net_sem
> instead of net_mutex. The patchset adds net_sem in addition
> to net_mutex and allows pernet_operations to be async. This
> flag means that the pernet_operations methods are safe to be
> executed in parallel with any other pernet_operations (un)initializing
> another net.
> 
> If there are only async pernet_operations in the system,
> net_mutex is not used either for setup_net() or for cleanup_net().
> 
> The flag is a little easier than (un)register_pernet_sys(),
> as it changes one line only. Also, it requires fewer changes
> in code. In future, when all pernet_operations are async,
> we'll just remove this struct field.
> 
> The pernet_operations converted in this patchset allow
> to create minimal .config to have network working, and
> the changes improve the performance like you may see
> below:
> 
>     %for i in {1..10000}; do unshare -n bash -c exit; done
>     
>     *before*
>     real 1m40,377s
>     user 0m9,672s
>     sys 0m19,928s
>     
>     *after*
>     real 0m17,007s
>     user 0m5,311s
>     sys 0m11,779s
>     
>     (5.8 times faster)
> ---
> 
> Kirill Tkhai (31):
>       net: Assign net to net_namespace_list in setup_net()
>       net: Cleanup copy_net_ns()
>       net: Introduce net_sem for protection of pernet_list
>       net: Move mutex_unlock() in cleanup_net() up
>       net: Allow pernet_operations to be executed in parallel
>       net: Convert proc_net_ns_ops
>       net: Convert net_ns_ops methods
>       net: Convert sysctl_pernet_ops
>       net: Convert netfilter_net_ops
>       net: Convert nf_log_net_ops
>       net: Convert net_inuse_ops
>       net: Convert net_defaults_ops
>       net: Convert netlink_net_ops
>       net: Convert rtnetlink_net_ops
>       net: Convert audit_net_ops
>       net: Convert uevent_net_ops
>       net: Convert proto_net_ops
>       net: Convert pernet_subsys ops, registered via net_dev_init()
>       net: Convert fib_* pernet_operations, registered via subsys_initcall
>       net: Convert subsys_initcall() registered pernet_operations from net/sched
>       net: Convert genl_pernet_ops
>       net: Convert wext_pernet_ops
>       net: Convert sysctl_core_ops
>       net: Convert pernet_subsys, registered from inet_init()
>       net: Convert unix_net_ops
>       net: Convert packet_net_ops
>       net: Convert ipv4_sysctl_ops
>       net: Convert addrconf_ops
>       net: Convert loopback_net_ops
>       net: Convert default_device_ops
>       net: Convert diag_net_ops
> 
> 
>  drivers/net/loopback.c      |    1 
>  fs/proc/proc_net.c          |    1 
>  include/linux/rtnetlink.h   |    1 
>  include/net/net_namespace.h |    6 +++
>  kernel/audit.c              |    1 
>  lib/kobject_uevent.c        |    1 
>  net/core/dev.c              |    2 +
>  net/core/fib_notifier.c     |    1 
>  net/core/fib_rules.c        |    1 
>  net/core/net-procfs.c       |    2 +
>  net/core/net_namespace.c    |   94 +++++++++++++++++++++++++------------------
>  net/core/rtnetlink.c        |    5 +-
>  net/core/sock.c             |    2 +
>  net/core/sock_diag.c        |    1 
>  net/core/sysctl_net_core.c  |    1 
>  net/ipv4/af_inet.c          |    2 +
>  net/ipv4/arp.c              |    1 
>  net/ipv4/devinet.c          |    1 
>  net/ipv4/fib_frontend.c     |    1 
>  net/ipv4/icmp.c             |    1 
>  net/ipv4/igmp.c             |    1 
>  net/ipv4/ip_fragment.c      |    1 
>  net/ipv4/ipmr.c             |    1 
>  net/ipv4/ping.c             |    1 
>  net/ipv4/proc.c             |    1 
>  net/ipv4/raw.c              |    1 
>  net/ipv4/route.c            |    4 ++
>  net/ipv4/sysctl_net_ipv4.c  |    1 
>  net/ipv4/tcp_ipv4.c         |    2 +
>  net/ipv4/tcp_metrics.c      |    1 
>  net/ipv4/udp.c              |    1 
>  net/ipv4/udplite.c          |    1 
>  net/ipv4/xfrm4_policy.c     |    1 
>  net/ipv6/addrconf.c         |    1 
>  net/netfilter/core.c        |    1 
>  net/netfilter/nf_log.c      |    1 
>  net/netlink/af_netlink.c    |    1 
>  net/netlink/genetlink.c     |    1 
>  net/packet/af_packet.c      |    1 
>  net/sched/act_api.c         |    1 
>  net/sched/sch_api.c         |    1 
>  net/sysctl_net.c            |    1 
>  net/unix/af_unix.c          |    1 
>  net/wireless/wext-core.c    |    1 
>  net/xfrm/xfrm_policy.c      |    1 
>  45 files changed, 114 insertions(+), 41 deletions(-)
> 
> --
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/31] Replacing net_mutex with rw_semaphore
  2017-12-04 15:54 ` [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
@ 2017-12-04 16:10   ` David Miller
  2017-12-04 16:11     ` Kirill Tkhai
  0 siblings, 1 reply; 41+ messages in thread
From: David Miller @ 2017-12-04 16:10 UTC (permalink / raw)
  To: ktkhai
  Cc: ebiederm, vyasevic, kstewart, pombredanne, vyasevich,
	mark.rutland, gregkh, adobriyan, fw, nicolas.dichtel,
	xiyou.wangcong, roman.kapl, paul, dsahern, daniel, lucien.xin,
	mschiffer, rshearma, linux-kernel, netdev, avagin, gorcunov,
	eric.dumazet, stephen

From: Kirill Tkhai <ktkhai@virtuozzo.com>
Date: Mon, 4 Dec 2017 18:54:51 +0300

> Still no comments :(
> 
> Ping, ping, ping.

You cannot force people to prioritize reviewing your patch submission.

Screaming "ping ping ping" doesn't help, in fact it hinders.

> What will we do next?

Be patient.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/31] Replacing net_mutex with rw_semaphore
  2017-12-04 16:10   ` David Miller
@ 2017-12-04 16:11     ` Kirill Tkhai
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2017-12-04 16:11 UTC (permalink / raw)
  To: David Miller
  Cc: ebiederm, vyasevic, kstewart, pombredanne, vyasevich,
	mark.rutland, gregkh, adobriyan, fw, nicolas.dichtel,
	xiyou.wangcong, roman.kapl, paul, dsahern, daniel, lucien.xin,
	mschiffer, rshearma, linux-kernel, netdev, avagin, gorcunov,
	eric.dumazet, stephen

On 04.12.2017 19:10, David Miller wrote:
> From: Kirill Tkhai <ktkhai@virtuozzo.com>
> Date: Mon, 4 Dec 2017 18:54:51 +0300
> 
>> Still no comments :(
>>
>> Ping, ping, ping.
> 
> You cannot force people to prioritize reviewing your patch submission.
> 
> Screaming "ping ping ping" doesn't help, in fact it hinders.
> 
>> What will we do next?
> 
> Be patient.

Ok, thanks

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 05/31] net: Allow pernet_operations to be executed in parallel
  2017-11-20 18:32 ` [PATCH v2 05/31] net: Allow pernet_operations to be executed in parallel Kirill Tkhai
@ 2018-01-17 18:34   ` Andrei Vagin
  2018-01-18 10:16     ` Kirill Tkhai
  0 siblings, 1 reply; 41+ messages in thread
From: Andrei Vagin @ 2018-01-17 18:34 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ebiederm, gorcunov, eric.dumazet,
	stephen

On Mon, Nov 20, 2017 at 09:32:55PM +0300, Kirill Tkhai wrote:
> This adds a new pernet_operations::async flag to mark operations
> whose ->init(), ->exit() and ->exit_batch() methods are allowed
> to be executed in parallel with the methods of any other pernet_operations.
> 
> When there are only asynchronous pernet_operations in the system,
> net_mutex won't be taken for a net construction and destruction.
> 
> Also, remove BUG_ON(mutex_is_locked()) from net_assign_generic()
> without replacing with the equivalent net_sem check, as there is
> one more lockdep assert below.
> 
> Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  include/net/net_namespace.h |    6 ++++++
>  net/core/net_namespace.c    |   29 +++++++++++++++++++----------
>  2 files changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index 10f99dafd5ac..db978c4755f7 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -303,6 +303,12 @@ struct pernet_operations {
>  	void (*exit_batch)(struct list_head *net_exit_list);
>  	unsigned int *id;
>  	size_t size;
> +	/*
> +	 * Indicates the above methods are allowed to be executed in parallel
> +	 * with methods of any other pernet_operations, i.e. they do not
> +	 * need synchronization via net_mutex.
> +	 */
> +	bool async;
>  };
>  
>  /*
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index c4f7452906bb..550c766f73aa 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -41,8 +41,9 @@ struct net init_net = {
>  EXPORT_SYMBOL(init_net);
>  
>  static bool init_net_initialized;
> +static unsigned nr_sync_pernet_ops;
>  /*
> - * net_sem: protects: pernet_list, net_generic_ids,
> + * net_sem: protects: pernet_list, net_generic_ids, nr_sync_pernet_ops,
>   * init_net_initialized and first_device pointer.
>   */
>  DECLARE_RWSEM(net_sem);
> @@ -70,11 +71,10 @@ static int net_assign_generic(struct net *net, unsigned int id, void *data)
>  {
>  	struct net_generic *ng, *old_ng;
>  
> -	BUG_ON(!mutex_is_locked(&net_mutex));
>  	BUG_ON(id < MIN_PERNET_OPS_ID);
>  
>  	old_ng = rcu_dereference_protected(net->gen,
> -					   lockdep_is_held(&net_mutex));
> +					   lockdep_is_held(&net_sem));
>  	if (old_ng->s.len > id) {
>  		old_ng->ptr[id] = data;
>  		return 0;
> @@ -419,11 +419,14 @@ struct net *copy_net_ns(unsigned long flags,
>  	rv = down_read_killable(&net_sem);
>  	if (rv < 0)
>  		goto put_userns;
> -	rv = mutex_lock_killable(&net_mutex);
> -	if (rv < 0)
> -		goto up_read;
> +	if (nr_sync_pernet_ops) {
> +		rv = mutex_lock_killable(&net_mutex);
> +		if (rv < 0)
> +			goto up_read;
> +	}
>  	rv = setup_net(net, user_ns);
> -	mutex_unlock(&net_mutex);
> +	if (nr_sync_pernet_ops)
> +		mutex_unlock(&net_mutex);
>  up_read:
>  	up_read(&net_sem);
>  	if (rv < 0) {
> @@ -453,7 +456,8 @@ static void cleanup_net(struct work_struct *work)
>  	spin_unlock_irq(&cleanup_list_lock);
>  
>  	down_read(&net_sem);
> -	mutex_lock(&net_mutex);
> +	if (nr_sync_pernet_ops)
> +		mutex_lock(&net_mutex);
>  
>  	/* Don't let anyone else find us. */
>  	rtnl_lock();
> @@ -489,7 +493,8 @@ static void cleanup_net(struct work_struct *work)
>  	list_for_each_entry_reverse(ops, &pernet_list, list)
>  		ops_exit_list(ops, &net_exit_list);
>  
> -	mutex_unlock(&net_mutex);
> +	if (nr_sync_pernet_ops)
> +		mutex_unlock(&net_mutex);
>  
>  	/* Free the net generic variables */
>  	list_for_each_entry_reverse(ops, &pernet_list, list)
> @@ -961,6 +966,9 @@ static int register_pernet_operations(struct list_head *list,
>  		rcu_barrier();
>  		if (ops->id)
>  			ida_remove(&net_generic_ids, *ops->id);
> +	} else if (!ops->async) {
> +		pr_info_once("Pernet operations %ps are sync.\n", ops);

As far as I understand, we have this sync mode for backward
compatibility with non-upstream modules, don't we? If the answer is yes,
it may be better to add WARN_ONCE here?

> +		nr_sync_pernet_ops++;
>  	}
>  
>  	return error;
> @@ -968,7 +976,8 @@ static int register_pernet_operations(struct list_head *list,
>  
>  static void unregister_pernet_operations(struct pernet_operations *ops)
>  {
> -	
> +	if (!ops->async)
> +		BUG_ON(nr_sync_pernet_ops-- == 0);
>  	__unregister_pernet_operations(ops);
>  	rcu_barrier();
>  	if (ops->id)
> 
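
For readers following the thread, the bookkeeping in the patch quoted above can be modelled in userspace roughly as follows. This is only a sketch: the model_* names and struct pernet_ops_model are hypothetical stand-ins, assert() takes the place of BUG_ON(), and no real locking is performed. It shows only how the count of non-async operations decides whether net_mutex would be taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pernet_operations; only the flag matters. */
struct pernet_ops_model {
	const char *name;
	bool async;
};

/* Models nr_sync_pernet_ops: registered operations that still need net_mutex. */
static unsigned int model_nr_sync_ops;

/* Models the accounting added to register_pernet_operations(). */
static void model_register(const struct pernet_ops_model *ops)
{
	if (!ops->async)
		model_nr_sync_ops++;
}

/* Models the accounting added to unregister_pernet_operations(). */
static void model_unregister(const struct pernet_ops_model *ops)
{
	if (!ops->async) {
		/* Stand-in for BUG_ON(nr_sync_pernet_ops-- == 0). */
		assert(model_nr_sync_ops > 0);
		model_nr_sync_ops--;
	}
}

/* copy_net_ns() and cleanup_net() take net_mutex only when this is true. */
static bool model_needs_net_mutex(void)
{
	return model_nr_sync_ops != 0;
}
```

In this model, registering only async operations leaves model_needs_net_mutex() false, and a single sync registration turns it on until that operation is unregistered again.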

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 03/31] net: Introduce net_sem for protection of pernet_list
  2017-11-20 18:32 ` [PATCH v2 03/31] net: Introduce net_sem for protection of pernet_list Kirill Tkhai
@ 2018-01-17 20:04   ` Andrei Vagin
  2018-01-18 10:14     ` Kirill Tkhai
  0 siblings, 1 reply; 41+ messages in thread
From: Andrei Vagin @ 2018-01-17 20:04 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ebiederm, gorcunov, eric.dumazet,
	stephen

On Mon, Nov 20, 2017 at 09:32:34PM +0300, Kirill Tkhai wrote:
> Currently a mutex is used to protect the pernet operations list. It makes
> cleanup_net() execute the ->exit methods of the same operations set
> that was used at the time of ->init, even after the net namespace is
> unlinked from net_namespace_list.
> 
> But the problem is that synchronize_rcu() is needed after the net is removed
> from net_namespace_list:
> 
> Destroy net_ns:
> cleanup_net()
>   mutex_lock(&net_mutex)
>   list_del_rcu(&net->list)
>   synchronize_rcu()                                  <--- Sleep there for ages
>   list_for_each_entry_reverse(ops, &pernet_list, list)
>     ops_exit_list(ops, &net_exit_list)
>   list_for_each_entry_reverse(ops, &pernet_list, list)
>     ops_free_list(ops, &net_exit_list)
>   mutex_unlock(&net_mutex)
> 
> This primitive is not fast, especially on systems with many processors
> and/or when preemptible RCU is enabled in the config. So, all the time
> cleanup_net() is waiting for an RCU grace period, creation of new net namespaces
> is not possible, and the tasks doing it are sleeping on the same mutex:
> 
> Create net_ns:
> copy_net_ns()
>   mutex_lock_killable(&net_mutex)                    <--- Sleep there for ages
> 
> I observed 20-30 second hangs of "unshare -n" on an ordinary 8-cpu laptop
> with preemptible RCU enabled.
> 
> The solution is to convert net_mutex to an rw_semaphore and add small locks
> to the really small number of pernet_operations that really need them. Then,
> pernet_operations::init/::exit methods, modifying the net-related data,
> will require down_read() locking only, while down_write() will be used
> for changing pernet_list.
> 
> This gives a significant performance increase after the whole patch set is applied,
> as you may see here:
> 
> %for i in {1..10000}; do unshare -n bash -c exit; done
> 
> *before*
> real 1m40,377s
> user 0m9,672s
> sys 0m19,928s
> 
> *after*
> real 0m17,007s
> user 0m5,311s
> sys 0m11,779s
> 
> (5.8 times faster)
> 
> This patch starts replacing net_mutex with net_sem. It adds the rw_semaphore,
> describes the variables it protects, and uses it where appropriate.
> net_mutex is still present, and the next patches will kick it out step by step.
> 
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  include/linux/rtnetlink.h |    1 +
>  net/core/net_namespace.c  |   39 ++++++++++++++++++++++++++-------------
>  net/core/rtnetlink.c      |    4 ++--
>  3 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
> index 2032ce2eb20b..f640fc87fe1d 100644
> --- a/include/linux/rtnetlink.h
> +++ b/include/linux/rtnetlink.h
> @@ -35,6 +35,7 @@ extern int rtnl_is_locked(void);
>  
>  extern wait_queue_head_t netdev_unregistering_wq;
>  extern struct mutex net_mutex;
> +extern struct rw_semaphore net_sem;
>  
>  #ifdef CONFIG_PROVE_LOCKING
>  extern bool lockdep_rtnl_is_held(void);
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index 2e512965bf42..859dce31e37e 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -41,6 +41,11 @@ struct net init_net = {

> static LIST_HEAD(pernet_list);
> static struct list_head *first_device = &pernet_list;
> DEFINE_MUTEX(net_mutex);

Even with all patches applied, we still have net_mutex. I think we need to
add a comment that explains why it is still needed. Are "sync" pernet
operations deprecated after this series, or is it OK to have them?


>  EXPORT_SYMBOL(init_net);
>  
>  static bool init_net_initialized;
> +/*
> + * net_sem: protects: pernet_list, net_generic_ids,
> + * init_net_initialized and first_device pointer.
> + */
> +DECLARE_RWSEM(net_sem);
>  
>  #define MIN_PERNET_OPS_ID	\
>  	((sizeof(struct net_generic) + sizeof(void *) - 1) / sizeof(void *))
> @@ -279,7 +284,7 @@ struct net *get_net_ns_by_id(struct net *net, int id)
>   */
>  static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
>  {
> -	/* Must be called with net_mutex held */
> +	/* Must be called with net_sem held */
>  	const struct pernet_operations *ops, *saved_ops;
>  	int error = 0;
>  	LIST_HEAD(net_exit_list);
> @@ -411,12 +416,16 @@ struct net *copy_net_ns(unsigned long flags,
>  	net->ucounts = ucounts;
>  	get_user_ns(user_ns);
>  
> -	rv = mutex_lock_killable(&net_mutex);
> +	rv = down_read_killable(&net_sem);
>  	if (rv < 0)
>  		goto put_userns;
> -
> +	rv = mutex_lock_killable(&net_mutex);
> +	if (rv < 0)
> +		goto up_read;
>  	rv = setup_net(net, user_ns);
>  	mutex_unlock(&net_mutex);
> +up_read:
> +	up_read(&net_sem);
>  	if (rv < 0) {
>  put_userns:
>  		put_user_ns(user_ns);
> @@ -443,6 +452,7 @@ static void cleanup_net(struct work_struct *work)
>  	list_replace_init(&cleanup_list, &net_kill_list);
>  	spin_unlock_irq(&cleanup_list_lock);
>  
> +	down_read(&net_sem);
>  	mutex_lock(&net_mutex);
>  
>  	/* Don't let anyone else find us. */
> @@ -484,6 +494,7 @@ static void cleanup_net(struct work_struct *work)
>  		ops_free_list(ops, &net_exit_list);
>  
>  	mutex_unlock(&net_mutex);
> +	up_read(&net_sem);
>  
>  	/* Ensure there are no outstanding rcu callbacks using this
>  	 * network namespace.
> @@ -510,8 +521,10 @@ static void cleanup_net(struct work_struct *work)
>   */
>  void net_ns_barrier(void)
>  {
> +	down_write(&net_sem);
>  	mutex_lock(&net_mutex);
>  	mutex_unlock(&net_mutex);
> +	up_write(&net_sem);
>  }
>  EXPORT_SYMBOL(net_ns_barrier);
>  
> @@ -838,12 +851,12 @@ static int __init net_ns_init(void)
>  
>  	rcu_assign_pointer(init_net.gen, ng);
>  
> -	mutex_lock(&net_mutex);
> +	down_write(&net_sem);
>  	if (setup_net(&init_net, &init_user_ns))
>  		panic("Could not setup the initial network namespace");
>  
>  	init_net_initialized = true;
> -	mutex_unlock(&net_mutex);
> +	up_write(&net_sem);
>  
>  	register_pernet_subsys(&net_ns_ops);
>  
> @@ -983,9 +996,9 @@ static void unregister_pernet_operations(struct pernet_operations *ops)
>  int register_pernet_subsys(struct pernet_operations *ops)
>  {
>  	int error;
> -	mutex_lock(&net_mutex);
> +	down_write(&net_sem);
>  	error =  register_pernet_operations(first_device, ops);
> -	mutex_unlock(&net_mutex);
> +	up_write(&net_sem);
>  	return error;
>  }
>  EXPORT_SYMBOL_GPL(register_pernet_subsys);
> @@ -1001,9 +1014,9 @@ EXPORT_SYMBOL_GPL(register_pernet_subsys);
>   */
>  void unregister_pernet_subsys(struct pernet_operations *ops)
>  {
> -	mutex_lock(&net_mutex);
> +	down_write(&net_sem);
>  	unregister_pernet_operations(ops);
> -	mutex_unlock(&net_mutex);
> +	up_write(&net_sem);
>  }
>  EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
>  
> @@ -1029,11 +1042,11 @@ EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
>  int register_pernet_device(struct pernet_operations *ops)
>  {
>  	int error;
> -	mutex_lock(&net_mutex);
> +	down_write(&net_sem);
>  	error = register_pernet_operations(&pernet_list, ops);
>  	if (!error && (first_device == &pernet_list))
>  		first_device = &ops->list;
> -	mutex_unlock(&net_mutex);
> +	up_write(&net_sem);
>  	return error;
>  }
>  EXPORT_SYMBOL_GPL(register_pernet_device);
> @@ -1049,11 +1062,11 @@ EXPORT_SYMBOL_GPL(register_pernet_device);
>   */
>  void unregister_pernet_device(struct pernet_operations *ops)
>  {
> -	mutex_lock(&net_mutex);
> +	down_write(&net_sem);
>  	if (&ops->list == first_device)
>  		first_device = first_device->next;
>  	unregister_pernet_operations(ops);
> -	mutex_unlock(&net_mutex);
> +	up_write(&net_sem);
>  }
>  EXPORT_SYMBOL_GPL(unregister_pernet_device);
>  
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index dabba2a91fc8..cb06d43c4230 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -390,11 +390,11 @@ static void rtnl_lock_unregistering_all(void)
>  void rtnl_link_unregister(struct rtnl_link_ops *ops)
>  {
>  	/* Close the race with cleanup_net() */
> -	mutex_lock(&net_mutex);
> +	down_write(&net_sem);
>  	rtnl_lock_unregistering_all();
>  	__rtnl_link_unregister(ops);
>  	rtnl_unlock();
> -	mutex_unlock(&net_mutex);
> +	up_write(&net_sem);
>  }
>  EXPORT_SYMBOL_GPL(rtnl_link_unregister);
>  
> 

* Re: [PATCH v2 03/31] net: Introduce net_sem for protection of pernet_list
  2018-01-17 20:04   ` Andrei Vagin
@ 2018-01-18 10:14     ` Kirill Tkhai
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2018-01-18 10:14 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ebiederm, gorcunov, eric.dumazet,
	stephen

On 17.01.2018 23:04, Andrei Vagin wrote:
> On Mon, Nov 20, 2017 at 09:32:34PM +0300, Kirill Tkhai wrote:
>> Currently a mutex is used to protect the pernet operations list. It makes
>> cleanup_net() execute the ->exit methods of the same operations set
>> which was used at the time of ->init, even after the net namespace is
>> unlinked from net_namespace_list.
>>
>> But the problem is that synchronize_rcu() is needed after net is removed
>> from net_namespace_list:
>>
>> Destroy net_ns:
>> cleanup_net()
>>   mutex_lock(&net_mutex)
>>   list_del_rcu(&net->list)
>>   synchronize_rcu()                                  <--- Sleep there for ages
>>   list_for_each_entry_reverse(ops, &pernet_list, list)
>>     ops_exit_list(ops, &net_exit_list)
>>   list_for_each_entry_reverse(ops, &pernet_list, list)
>>     ops_free_list(ops, &net_exit_list)
>>   mutex_unlock(&net_mutex)
>>
>> This primitive is not fast, especially on systems with many processors
>> and/or when preemptible RCU is enabled in the config. So, for the whole time
>> cleanup_net() is waiting for the RCU grace period, no new net namespaces
>> can be created; the tasks that try to do so sleep on the same mutex:
>>
>> Create net_ns:
>> copy_net_ns()
>>   mutex_lock_killable(&net_mutex)                    <--- Sleep there for ages
>>
>> I observed 20-30 second hangs of "unshare -n" on an ordinary 8-CPU laptop
>> with preemptible RCU enabled.
>>
>> The solution is to convert net_mutex to an rw_semaphore and add small locks
>> to the small number of pernet_operations that really need them. Then
>> pernet_operations::init/::exit methods, which modify the net-related data,
>> will require only down_read() locking, while down_write() will be used
>> for changing pernet_list.
>>
>> This gives a significant performance increase once the whole patch set is
>> applied, as you can see here:
>>
>> %for i in {1..10000}; do unshare -n bash -c exit; done
>>
>> *before*
>> real 1m40,377s
>> user 0m9,672s
>> sys 0m19,928s
>>
>> *after*
>> real 0m17,007s
>> user 0m5,311s
>> sys 0m11,779s
>>
>> (5.8 times faster)
>>
>> This patch starts replacing net_mutex with net_sem. It adds the rw_semaphore,
>> describes the variables it protects, and uses it where appropriate.
>> net_mutex is still present, and the next patches will remove it step by step.
>>
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> ---
>>  include/linux/rtnetlink.h |    1 +
>>  net/core/net_namespace.c  |   39 ++++++++++++++++++++++++++-------------
>>  net/core/rtnetlink.c      |    4 ++--
>>  3 files changed, 29 insertions(+), 15 deletions(-)
>>
>> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
>> index 2032ce2eb20b..f640fc87fe1d 100644
>> --- a/include/linux/rtnetlink.h
>> +++ b/include/linux/rtnetlink.h
>> @@ -35,6 +35,7 @@ extern int rtnl_is_locked(void);
>>  
>>  extern wait_queue_head_t netdev_unregistering_wq;
>>  extern struct mutex net_mutex;
>> +extern struct rw_semaphore net_sem;
>>  
>>  #ifdef CONFIG_PROVE_LOCKING
>>  extern bool lockdep_rtnl_is_held(void);
>> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
>> index 2e512965bf42..859dce31e37e 100644
>> --- a/net/core/net_namespace.c
>> +++ b/net/core/net_namespace.c
>> @@ -41,6 +41,11 @@ struct net init_net = {
> 
>> static LIST_HEAD(pernet_list);
>> static struct list_head *first_device = &pernet_list;
>> DEFINE_MUTEX(net_mutex);
> 
> Even with all patches applied, we still have net_mutex. I think we need to
> add a comment that explains why it is still needed. Are "sync" pernet
> operations deprecated after this series, or is it OK to have them?
 
net_mutex will stay until all pernet_operations are converted.
But people who don't use unconverted operations will already get
the performance benefit. Adding a comment is not a problem :)
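
The transitional locking described above can be sketched in user space. This
is only an illustration under stated assumptions, not kernel code: POSIX
pthread_rwlock_t and pthread_mutex_t stand in for net_sem and net_mutex, and
setup_one_net(), register_ops(), nr_sync_ops and count_init() are hypothetical
names chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for the kernel locks; not the kernel API. */
static pthread_rwlock_t net_sem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t net_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical count of registered operations that are not yet converted.
 * It is changed only with the write side of net_sem held. */
static unsigned int nr_sync_ops;

/* Hypothetical helper: run init() the way a namespace setup path would.
 * Returns true if the legacy mutex had to be taken. */
static bool setup_one_net(void (*init)(void))
{
	bool need_mutex;

	assert(init != NULL);
	pthread_rwlock_rdlock(&net_sem);	/* many setups may hold this at once */
	need_mutex = nr_sync_ops != 0;		/* stable while net_sem is held */
	if (need_mutex)
		pthread_mutex_lock(&net_mutex);	/* serialize unconverted ops */
	init();
	if (need_mutex)
		pthread_mutex_unlock(&net_mutex);
	pthread_rwlock_unlock(&net_sem);
	return need_mutex;
}

/* Hypothetical helper: registration changes shared state, so it takes
 * the write side and excludes every reader. */
static void register_ops(bool async)
{
	pthread_rwlock_wrlock(&net_sem);
	if (!async)
		nr_sync_ops++;
	pthread_rwlock_unlock(&net_sem);
}

/* Trivial init method used to observe that setup actually ran. */
static int init_calls;
static void count_init(void) { init_calls++; }
```

Calling setup_one_net(count_init) before any unconverted operations are
registered returns false, so only the shared lock is taken; after
register_ops(false) it returns true and the mutex serializes the callers again.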

Thanks, Kirill
 
>>  EXPORT_SYMBOL(init_net);
>>  
>>  static bool init_net_initialized;
>> +/*
>> + * net_sem: protects: pernet_list, net_generic_ids,
>> + * init_net_initialized and first_device pointer.
>> + */
>> +DECLARE_RWSEM(net_sem);
>>  
>>  #define MIN_PERNET_OPS_ID	\
>>  	((sizeof(struct net_generic) + sizeof(void *) - 1) / sizeof(void *))
>> @@ -279,7 +284,7 @@ struct net *get_net_ns_by_id(struct net *net, int id)
>>   */
>>  static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
>>  {
>> -	/* Must be called with net_mutex held */
>> +	/* Must be called with net_sem held */
>>  	const struct pernet_operations *ops, *saved_ops;
>>  	int error = 0;
>>  	LIST_HEAD(net_exit_list);
>> @@ -411,12 +416,16 @@ struct net *copy_net_ns(unsigned long flags,
>>  	net->ucounts = ucounts;
>>  	get_user_ns(user_ns);
>>  
>> -	rv = mutex_lock_killable(&net_mutex);
>> +	rv = down_read_killable(&net_sem);
>>  	if (rv < 0)
>>  		goto put_userns;
>> -
>> +	rv = mutex_lock_killable(&net_mutex);
>> +	if (rv < 0)
>> +		goto up_read;
>>  	rv = setup_net(net, user_ns);
>>  	mutex_unlock(&net_mutex);
>> +up_read:
>> +	up_read(&net_sem);
>>  	if (rv < 0) {
>>  put_userns:
>>  		put_user_ns(user_ns);
>> @@ -443,6 +452,7 @@ static void cleanup_net(struct work_struct *work)
>>  	list_replace_init(&cleanup_list, &net_kill_list);
>>  	spin_unlock_irq(&cleanup_list_lock);
>>  
>> +	down_read(&net_sem);
>>  	mutex_lock(&net_mutex);
>>  
>>  	/* Don't let anyone else find us. */
>> @@ -484,6 +494,7 @@ static void cleanup_net(struct work_struct *work)
>>  		ops_free_list(ops, &net_exit_list);
>>  
>>  	mutex_unlock(&net_mutex);
>> +	up_read(&net_sem);
>>  
>>  	/* Ensure there are no outstanding rcu callbacks using this
>>  	 * network namespace.
>> @@ -510,8 +521,10 @@ static void cleanup_net(struct work_struct *work)
>>   */
>>  void net_ns_barrier(void)
>>  {
>> +	down_write(&net_sem);
>>  	mutex_lock(&net_mutex);
>>  	mutex_unlock(&net_mutex);
>> +	up_write(&net_sem);
>>  }
>>  EXPORT_SYMBOL(net_ns_barrier);
>>  
>> @@ -838,12 +851,12 @@ static int __init net_ns_init(void)
>>  
>>  	rcu_assign_pointer(init_net.gen, ng);
>>  
>> -	mutex_lock(&net_mutex);
>> +	down_write(&net_sem);
>>  	if (setup_net(&init_net, &init_user_ns))
>>  		panic("Could not setup the initial network namespace");
>>  
>>  	init_net_initialized = true;
>> -	mutex_unlock(&net_mutex);
>> +	up_write(&net_sem);
>>  
>>  	register_pernet_subsys(&net_ns_ops);
>>  
>> @@ -983,9 +996,9 @@ static void unregister_pernet_operations(struct pernet_operations *ops)
>>  int register_pernet_subsys(struct pernet_operations *ops)
>>  {
>>  	int error;
>> -	mutex_lock(&net_mutex);
>> +	down_write(&net_sem);
>>  	error =  register_pernet_operations(first_device, ops);
>> -	mutex_unlock(&net_mutex);
>> +	up_write(&net_sem);
>>  	return error;
>>  }
>>  EXPORT_SYMBOL_GPL(register_pernet_subsys);
>> @@ -1001,9 +1014,9 @@ EXPORT_SYMBOL_GPL(register_pernet_subsys);
>>   */
>>  void unregister_pernet_subsys(struct pernet_operations *ops)
>>  {
>> -	mutex_lock(&net_mutex);
>> +	down_write(&net_sem);
>>  	unregister_pernet_operations(ops);
>> -	mutex_unlock(&net_mutex);
>> +	up_write(&net_sem);
>>  }
>>  EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
>>  
>> @@ -1029,11 +1042,11 @@ EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
>>  int register_pernet_device(struct pernet_operations *ops)
>>  {
>>  	int error;
>> -	mutex_lock(&net_mutex);
>> +	down_write(&net_sem);
>>  	error = register_pernet_operations(&pernet_list, ops);
>>  	if (!error && (first_device == &pernet_list))
>>  		first_device = &ops->list;
>> -	mutex_unlock(&net_mutex);
>> +	up_write(&net_sem);
>>  	return error;
>>  }
>>  EXPORT_SYMBOL_GPL(register_pernet_device);
>> @@ -1049,11 +1062,11 @@ EXPORT_SYMBOL_GPL(register_pernet_device);
>>   */
>>  void unregister_pernet_device(struct pernet_operations *ops)
>>  {
>> -	mutex_lock(&net_mutex);
>> +	down_write(&net_sem);
>>  	if (&ops->list == first_device)
>>  		first_device = first_device->next;
>>  	unregister_pernet_operations(ops);
>> -	mutex_unlock(&net_mutex);
>> +	up_write(&net_sem);
>>  }
>>  EXPORT_SYMBOL_GPL(unregister_pernet_device);
>>  
>> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
>> index dabba2a91fc8..cb06d43c4230 100644
>> --- a/net/core/rtnetlink.c
>> +++ b/net/core/rtnetlink.c
>> @@ -390,11 +390,11 @@ static void rtnl_lock_unregistering_all(void)
>>  void rtnl_link_unregister(struct rtnl_link_ops *ops)
>>  {
>>  	/* Close the race with cleanup_net() */
>> -	mutex_lock(&net_mutex);
>> +	down_write(&net_sem);
>>  	rtnl_lock_unregistering_all();
>>  	__rtnl_link_unregister(ops);
>>  	rtnl_unlock();
>> -	mutex_unlock(&net_mutex);
>> +	up_write(&net_sem);
>>  }
>>  EXPORT_SYMBOL_GPL(rtnl_link_unregister);
>>  
>>

* Re: [PATCH v2 05/31] net: Allow pernet_operations to be executed in parallel
  2018-01-17 18:34   ` Andrei Vagin
@ 2018-01-18 10:16     ` Kirill Tkhai
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2018-01-18 10:16 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ebiederm, gorcunov, eric.dumazet,
	stephen

On 17.01.2018 21:34, Andrei Vagin wrote:
> On Mon, Nov 20, 2017 at 09:32:55PM +0300, Kirill Tkhai wrote:
>> This adds a new pernet_operations::async flag to mark operations
>> whose ->init(), ->exit() and ->exit_batch() methods are allowed
>> to be executed in parallel with the methods of any other pernet_operations.
>>
>> When there are only asynchronous pernet_operations in the system,
>> net_mutex won't be taken for net construction and destruction.
>>
>> Also, remove BUG_ON(mutex_is_locked()) from net_assign_generic()
>> without replacing it with the equivalent net_sem check, as there is
>> one more lockdep assert below.
>>
>> Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> ---
>>  include/net/net_namespace.h |    6 ++++++
>>  net/core/net_namespace.c    |   29 +++++++++++++++++++----------
>>  2 files changed, 25 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
>> index 10f99dafd5ac..db978c4755f7 100644
>> --- a/include/net/net_namespace.h
>> +++ b/include/net/net_namespace.h
>> @@ -303,6 +303,12 @@ struct pernet_operations {
>>  	void (*exit_batch)(struct list_head *net_exit_list);
>>  	unsigned int *id;
>>  	size_t size;
>> +	/*
>> +	 * Indicates above methods are allowe to be executed in parallel
>> +	 * with methods of any other pernet_operations, i.e. they are not
>> +	 * need synchronization via net_mutex.
>> +	 */
>> +	bool async;
>>  };
>>  
>>  /*
>> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
>> index c4f7452906bb..550c766f73aa 100644
>> --- a/net/core/net_namespace.c
>> +++ b/net/core/net_namespace.c
>> @@ -41,8 +41,9 @@ struct net init_net = {
>>  EXPORT_SYMBOL(init_net);
>>  
>>  static bool init_net_initialized;
>> +static unsigned nr_sync_pernet_ops;
>>  /*
>> - * net_sem: protects: pernet_list, net_generic_ids,
>> + * net_sem: protects: pernet_list, net_generic_ids, nr_sync_pernet_ops,
>>   * init_net_initialized and first_device pointer.
>>   */
>>  DECLARE_RWSEM(net_sem);
>> @@ -70,11 +71,10 @@ static int net_assign_generic(struct net *net, unsigned int id, void *data)
>>  {
>>  	struct net_generic *ng, *old_ng;
>>  
>> -	BUG_ON(!mutex_is_locked(&net_mutex));
>>  	BUG_ON(id < MIN_PERNET_OPS_ID);
>>  
>>  	old_ng = rcu_dereference_protected(net->gen,
>> -					   lockdep_is_held(&net_mutex));
>> +					   lockdep_is_held(&net_sem));
>>  	if (old_ng->s.len > id) {
>>  		old_ng->ptr[id] = data;
>>  		return 0;
>> @@ -419,11 +419,14 @@ struct net *copy_net_ns(unsigned long flags,
>>  	rv = down_read_killable(&net_sem);
>>  	if (rv < 0)
>>  		goto put_userns;
>> -	rv = mutex_lock_killable(&net_mutex);
>> -	if (rv < 0)
>> -		goto up_read;
>> +	if (nr_sync_pernet_ops) {
>> +		rv = mutex_lock_killable(&net_mutex);
>> +		if (rv < 0)
>> +			goto up_read;
>> +	}
>>  	rv = setup_net(net, user_ns);
>> -	mutex_unlock(&net_mutex);
>> +	if (nr_sync_pernet_ops)
>> +		mutex_unlock(&net_mutex);
>>  up_read:
>>  	up_read(&net_sem);
>>  	if (rv < 0) {
>> @@ -453,7 +456,8 @@ static void cleanup_net(struct work_struct *work)
>>  	spin_unlock_irq(&cleanup_list_lock);
>>  
>>  	down_read(&net_sem);
>> -	mutex_lock(&net_mutex);
>> +	if (nr_sync_pernet_ops)
>> +		mutex_lock(&net_mutex);
>>  
>>  	/* Don't let anyone else find us. */
>>  	rtnl_lock();
>> @@ -489,7 +493,8 @@ static void cleanup_net(struct work_struct *work)
>>  	list_for_each_entry_reverse(ops, &pernet_list, list)
>>  		ops_exit_list(ops, &net_exit_list);
>>  
>> -	mutex_unlock(&net_mutex);
>> +	if (nr_sync_pernet_ops)
>> +		mutex_unlock(&net_mutex);
>>  
>>  	/* Free the net generic variables */
>>  	list_for_each_entry_reverse(ops, &pernet_list, list)
>> @@ -961,6 +966,9 @@ static int register_pernet_operations(struct list_head *list,
>>  		rcu_barrier();
>>  		if (ops->id)
>>  			ida_remove(&net_generic_ids, *ops->id);
>> +	} else if (!ops->async) {
>> +		pr_info_once("Pernet operations %ps are sync.\n", ops);
> 
> As far as I understand, we have this sync mode for backward
> compatibility with non-upstream modules, don't we? If the answer is yes,
> it may be better to add WARN_ONCE here?

There are 200+ more pernet_operations that still need to be reviewed and made
async. This pr_info_once() is there to help people find the unconverted
pernet_operations they use and start working on converting them.
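
The bookkeeping behind that message can be sketched in user space as follows.
This is a simplified illustration, not the patch itself: struct ops,
register_ops(), unregister_ops(), nr_sync_ops and notice_printed are
hypothetical names, printf() with a flag only approximates pr_info_once(),
assert() stands in for BUG_ON(), and the locking that protects the counter
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-in for struct pernet_operations. */
struct ops {
	const char *name;
	bool async;	/* true once the methods were reviewed as parallel-safe */
};

static unsigned int nr_sync_ops;	/* unconverted operations now registered */
static bool notice_printed;		/* makes the notice fire only once */

static void register_ops(const struct ops *ops)
{
	if (ops->async)
		return;			/* converted: nothing to count */
	if (!notice_printed) {		/* rough analogue of pr_info_once() */
		printf("Operations %s are sync.\n", ops->name);
		notice_printed = true;
	}
	nr_sync_ops++;
}

static void unregister_ops(const struct ops *ops)
{
	if (ops->async)
		return;
	assert(nr_sync_ops > 0);	/* analogue of the BUG_ON() underflow check */
	nr_sync_ops--;
}
```

Only the first unconverted registration prints the notice, so the log stays
quiet while the counter still records how many sync operations remain.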

Thanks,
Kirill
 
>> +		nr_sync_pernet_ops++;
>>  	}
>>  
>>  	return error;
>> @@ -968,7 +976,8 @@ static int register_pernet_operations(struct list_head *list,
>>  
>>  static void unregister_pernet_operations(struct pernet_operations *ops)
>>  {
>> -	
>> +	if (!ops->async)
>> +		BUG_ON(nr_sync_pernet_ops-- == 0);
>>  	__unregister_pernet_operations(ops);
>>  	rcu_barrier();
>>  	if (ops->id)
>>

* Re: [PATCH v2 00/31] Replacing net_mutex with rw_semaphore
  2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
                   ` (31 preceding siblings ...)
  2017-12-04 15:54 ` [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
@ 2018-01-18 17:43 ` Andrei Vagin
  2018-01-19  8:25   ` Kirill Tkhai
  32 siblings, 1 reply; 41+ messages in thread
From: Andrei Vagin @ 2018-01-18 17:43 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ebiederm, gorcunov, eric.dumazet,
	stephen

On Mon, Nov 20, 2017 at 09:32:08PM +0300, Kirill Tkhai wrote:
> Hi,
> 
> this is the second version of the patchset introducing net_sem
> instead of net_mutex. The patchset adds net_sem in addition
> to net_mutex and allows pernet_operations to be async. This
> flag means that the pernet_operations methods are safe to be
> executed in parallel with any other pernet_operations
> (un)initializing another net.
> 
> If there are only async pernet_operations in the system,
> net_mutex is not used either for setup_net() or for cleanup_net().
> 
> The flag is a little easier than (un)register_pernet_sys(),
> as it changes only one line. Also, it requires fewer changes
> in the code. In the future, when all pernet_operations are async,
> we'll just remove this struct field.
> 
> The pernet_operations converted in this patchset are enough
> to build a minimal .config with working networking, and
> the changes improve performance, as you can see
> below:
> 
>     %for i in {1..10000}; do unshare -n bash -c exit; done
>     
>     *before*
>     real 1m40,377s
>     user 0m9,672s
>     sys 0m19,928s
>     
>     *after*
>     real 0m17,007s
>     user 0m5,311s
>     sys 0m11,779s
>     
>     (5.8 times faster)

Good job!

Acked-by: Andrei Vagin <avagin@virtuozzo.com>

> ---
> 
> Kirill Tkhai (31):
>       net: Assign net to net_namespace_list in setup_net()
>       net: Cleanup copy_net_ns()
>       net: Introduce net_sem for protection of pernet_list
>       net: Move mutex_unlock() in cleanup_net() up
>       net: Allow pernet_operations to be executed in parallel
>       net: Convert proc_net_ns_ops
>       net: Convert net_ns_ops methods
>       net: Convert sysctl_pernet_ops
>       net: Convert netfilter_net_ops
>       net: Convert nf_log_net_ops
>       net: Convert net_inuse_ops
>       net: Convert net_defaults_ops
>       net: Convert netlink_net_ops
>       net: Convert rtnetlink_net_ops
>       net: Convert audit_net_ops
>       net: Convert uevent_net_ops
>       net: Convert proto_net_ops
>       net: Convert pernet_subsys ops, registered via net_dev_init()
>       net: Convert fib_* pernet_operations, registered via subsys_initcall
>       net: Convert subsys_initcall() registered pernet_operations from net/sched
>       net: Convert genl_pernet_ops
>       net: Convert wext_pernet_ops
>       net: Convert sysctl_core_ops
>       net: Convert pernet_subsys, registered from inet_init()
>       net: Convert unix_net_ops
>       net: Convert packet_net_ops
>       net: Convert ipv4_sysctl_ops
>       net: Convert addrconf_ops
>       net: Convert loopback_net_ops
>       net: Convert default_device_ops
>       net: Convert diag_net_ops
> 
> 
>  drivers/net/loopback.c      |    1 
>  fs/proc/proc_net.c          |    1 
>  include/linux/rtnetlink.h   |    1 
>  include/net/net_namespace.h |    6 +++
>  kernel/audit.c              |    1 
>  lib/kobject_uevent.c        |    1 
>  net/core/dev.c              |    2 +
>  net/core/fib_notifier.c     |    1 
>  net/core/fib_rules.c        |    1 
>  net/core/net-procfs.c       |    2 +
>  net/core/net_namespace.c    |   94 +++++++++++++++++++++++++------------------
>  net/core/rtnetlink.c        |    5 +-
>  net/core/sock.c             |    2 +
>  net/core/sock_diag.c        |    1 
>  net/core/sysctl_net_core.c  |    1 
>  net/ipv4/af_inet.c          |    2 +
>  net/ipv4/arp.c              |    1 
>  net/ipv4/devinet.c          |    1 
>  net/ipv4/fib_frontend.c     |    1 
>  net/ipv4/icmp.c             |    1 
>  net/ipv4/igmp.c             |    1 
>  net/ipv4/ip_fragment.c      |    1 
>  net/ipv4/ipmr.c             |    1 
>  net/ipv4/ping.c             |    1 
>  net/ipv4/proc.c             |    1 
>  net/ipv4/raw.c              |    1 
>  net/ipv4/route.c            |    4 ++
>  net/ipv4/sysctl_net_ipv4.c  |    1 
>  net/ipv4/tcp_ipv4.c         |    2 +
>  net/ipv4/tcp_metrics.c      |    1 
>  net/ipv4/udp.c              |    1 
>  net/ipv4/udplite.c          |    1 
>  net/ipv4/xfrm4_policy.c     |    1 
>  net/ipv6/addrconf.c         |    1 
>  net/netfilter/core.c        |    1 
>  net/netfilter/nf_log.c      |    1 
>  net/netlink/af_netlink.c    |    1 
>  net/netlink/genetlink.c     |    1 
>  net/packet/af_packet.c      |    1 
>  net/sched/act_api.c         |    1 
>  net/sched/sch_api.c         |    1 
>  net/sysctl_net.c            |    1 
>  net/unix/af_unix.c          |    1 
>  net/wireless/wext-core.c    |    1 
>  net/xfrm/xfrm_policy.c      |    1 
>  45 files changed, 114 insertions(+), 41 deletions(-)
> 
> --
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

* Re: [PATCH v2 00/31] Replacing net_mutex with rw_semaphore
  2018-01-18 17:43 ` Andrei Vagin
@ 2018-01-19  8:25   ` Kirill Tkhai
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Tkhai @ 2018-01-19  8:25 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: davem, vyasevic, kstewart, pombredanne, vyasevich, mark.rutland,
	gregkh, adobriyan, fw, nicolas.dichtel, xiyou.wangcong,
	roman.kapl, paul, dsahern, daniel, lucien.xin, mschiffer,
	rshearma, linux-kernel, netdev, ebiederm, gorcunov, eric.dumazet,
	stephen

On 18.01.2018 20:43, Andrei Vagin wrote:
> On Mon, Nov 20, 2017 at 09:32:08PM +0300, Kirill Tkhai wrote:
>> Hi,
>>
>> this is the second version of the patchset introducing net_sem
>> instead of net_mutex. The patchset adds net_sem in addition
>> to net_mutex and allows pernet_operations to be async. This
>> flag means that the pernet_operations methods are safe to be
>> executed in parallel with any other pernet_operations
>> (un)initializing another net.
>>
>> If there are only async pernet_operations in the system,
>> net_mutex is not used either for setup_net() or for cleanup_net().
>>
>> The flag is a little easier than (un)register_pernet_sys(),
>> as it changes only one line. Also, it requires fewer changes
>> in the code. In the future, when all pernet_operations are async,
>> we'll just remove this struct field.
>>
>> The pernet_operations converted in this patchset are enough
>> to build a minimal .config with working networking, and
>> the changes improve performance, as you can see
>> below:
>>
>>     %for i in {1..10000}; do unshare -n bash -c exit; done
>>     
>>     *before*
>>     real 1m40,377s
>>     user 0m9,672s
>>     sys 0m19,928s
>>     
>>     *after*
>>     real 0m17,007s
>>     user 0m5,311s
>>     sys 0m11,779s
>>     
>>     (5.8 times faster)
> 
> Good job!
> 
> Acked-by: Andrei Vagin <avagin@virtuozzo.com>

Thanks, Andrei!

>> ---
>>
>> Kirill Tkhai (31):
>>       net: Assign net to net_namespace_list in setup_net()
>>       net: Cleanup copy_net_ns()
>>       net: Introduce net_sem for protection of pernet_list
>>       net: Move mutex_unlock() in cleanup_net() up
>>       net: Allow pernet_operations to be executed in parallel
>>       net: Convert proc_net_ns_ops
>>       net: Convert net_ns_ops methods
>>       net: Convert sysctl_pernet_ops
>>       net: Convert netfilter_net_ops
>>       net: Convert nf_log_net_ops
>>       net: Convert net_inuse_ops
>>       net: Convert net_defaults_ops
>>       net: Convert netlink_net_ops
>>       net: Convert rtnetlink_net_ops
>>       net: Convert audit_net_ops
>>       net: Convert uevent_net_ops
>>       net: Convert proto_net_ops
>>       net: Convert pernet_subsys ops, registered via net_dev_init()
>>       net: Convert fib_* pernet_operations, registered via subsys_initcall
>>       net: Convert subsys_initcall() registered pernet_operations from net/sched
>>       net: Convert genl_pernet_ops
>>       net: Convert wext_pernet_ops
>>       net: Convert sysctl_core_ops
>>       net: Convert pernet_subsys, registered from inet_init()
>>       net: Convert unix_net_ops
>>       net: Convert packet_net_ops
>>       net: Convert ipv4_sysctl_ops
>>       net: Convert addrconf_ops
>>       net: Convert loopback_net_ops
>>       net: Convert default_device_ops
>>       net: Convert diag_net_ops
>>
>>
>>  drivers/net/loopback.c      |    1 
>>  fs/proc/proc_net.c          |    1 
>>  include/linux/rtnetlink.h   |    1 
>>  include/net/net_namespace.h |    6 +++
>>  kernel/audit.c              |    1 
>>  lib/kobject_uevent.c        |    1 
>>  net/core/dev.c              |    2 +
>>  net/core/fib_notifier.c     |    1 
>>  net/core/fib_rules.c        |    1 
>>  net/core/net-procfs.c       |    2 +
>>  net/core/net_namespace.c    |   94 +++++++++++++++++++++++++------------------
>>  net/core/rtnetlink.c        |    5 +-
>>  net/core/sock.c             |    2 +
>>  net/core/sock_diag.c        |    1 
>>  net/core/sysctl_net_core.c  |    1 
>>  net/ipv4/af_inet.c          |    2 +
>>  net/ipv4/arp.c              |    1 
>>  net/ipv4/devinet.c          |    1 
>>  net/ipv4/fib_frontend.c     |    1 
>>  net/ipv4/icmp.c             |    1 
>>  net/ipv4/igmp.c             |    1 
>>  net/ipv4/ip_fragment.c      |    1 
>>  net/ipv4/ipmr.c             |    1 
>>  net/ipv4/ping.c             |    1 
>>  net/ipv4/proc.c             |    1 
>>  net/ipv4/raw.c              |    1 
>>  net/ipv4/route.c            |    4 ++
>>  net/ipv4/sysctl_net_ipv4.c  |    1 
>>  net/ipv4/tcp_ipv4.c         |    2 +
>>  net/ipv4/tcp_metrics.c      |    1 
>>  net/ipv4/udp.c              |    1 
>>  net/ipv4/udplite.c          |    1 
>>  net/ipv4/xfrm4_policy.c     |    1 
>>  net/ipv6/addrconf.c         |    1 
>>  net/netfilter/core.c        |    1 
>>  net/netfilter/nf_log.c      |    1 
>>  net/netlink/af_netlink.c    |    1 
>>  net/netlink/genetlink.c     |    1 
>>  net/packet/af_packet.c      |    1 
>>  net/sched/act_api.c         |    1 
>>  net/sched/sch_api.c         |    1 
>>  net/sysctl_net.c            |    1 
>>  net/unix/af_unix.c          |    1 
>>  net/wireless/wext-core.c    |    1 
>>  net/xfrm/xfrm_policy.c      |    1 
>>  45 files changed, 114 insertions(+), 41 deletions(-)
>>
>> --
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

Thread overview: 41+ messages
2017-11-20 18:32 [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
2017-11-20 18:32 ` [PATCH v2 01/31] net: Assign net to net_namespace_list in setup_net() Kirill Tkhai
2017-11-20 18:32 ` [PATCH v2 02/31] net: Cleanup copy_net_ns() Kirill Tkhai
2017-11-20 18:32 ` [PATCH v2 03/31] net: Introduce net_sem for protection of pernet_list Kirill Tkhai
2018-01-17 20:04   ` Andrei Vagin
2018-01-18 10:14     ` Kirill Tkhai
2017-11-20 18:32 ` [PATCH v2 04/31] net: Move mutex_unlock() in cleanup_net() up Kirill Tkhai
2017-11-20 18:32 ` [PATCH v2 05/31] net: Allow pernet_operations to be executed in parallel Kirill Tkhai
2018-01-17 18:34   ` Andrei Vagin
2018-01-18 10:16     ` Kirill Tkhai
2017-11-20 18:33 ` [PATCH v2 06/31] net: Convert proc_net_ns_ops Kirill Tkhai
2017-11-20 18:33 ` [PATCH v2 07/31] net: Convert net_ns_ops methods Kirill Tkhai
2017-11-20 18:33 ` [PATCH v2 08/31] net: Convert sysctl_pernet_ops Kirill Tkhai
2017-11-20 18:33 ` [PATCH v2 09/31] net: Convert netfilter_net_ops Kirill Tkhai
2017-11-20 18:33 ` [PATCH v2 10/31] net: Convert nf_log_net_ops Kirill Tkhai
2017-11-20 18:33 ` [PATCH v2 11/31] net: Convert net_inuse_ops Kirill Tkhai
2017-11-20 18:34 ` [PATCH v2 12/31] net: Convert net_defaults_ops Kirill Tkhai
2017-11-20 18:34 ` [PATCH v2 13/31] net: Convert netlink_net_ops Kirill Tkhai
2017-11-20 18:34 ` [PATCH v2 14/31] net: Convert rtnetlink_net_ops Kirill Tkhai
2017-11-20 18:34 ` [PATCH v2 15/31] net: Convert audit_net_ops Kirill Tkhai
2017-11-20 18:34 ` [PATCH v2 16/31] net: Convert uevent_net_ops Kirill Tkhai
2017-11-20 18:34 ` [PATCH v2 17/31] net: Convert proto_net_ops Kirill Tkhai
2017-11-20 18:35 ` [PATCH v2 18/31] net: Convert pernet_subsys ops, registered via net_dev_init() Kirill Tkhai
2017-11-20 18:35 ` [PATCH v2 19/31] net: Convert fib_* pernet_operations, registered via subsys_initcall Kirill Tkhai
2017-11-20 18:35 ` [PATCH v2 20/31] net: Convert subsys_initcall() registered pernet_operations from net/sched Kirill Tkhai
2017-11-20 18:35 ` [PATCH v2 21/31] net: Convert genl_pernet_ops Kirill Tkhai
2017-11-20 18:35 ` [PATCH v2 22/31] net: Convert wext_pernet_ops Kirill Tkhai
2017-11-20 18:35 ` [PATCH v2 23/31] net: Convert sysctl_core_ops Kirill Tkhai
2017-11-20 18:35 ` [PATCH v2 24/31] net: Convert pernet_subsys, registered from inet_init() Kirill Tkhai
2017-11-20 18:36 ` [PATCH v2 25/31] net: Convert unix_net_ops Kirill Tkhai
2017-11-20 18:36 ` [PATCH v2 26/31] net: Convert packet_net_ops Kirill Tkhai
2017-11-20 18:36 ` [PATCH v2 27/31] net: Convert ipv4_sysctl_ops Kirill Tkhai
2017-11-20 18:36 ` [PATCH v2 28/31] net: Convert addrconf_ops Kirill Tkhai
2017-11-20 18:36 ` [PATCH v2 29/31] net: Convert loopback_net_ops Kirill Tkhai
2017-11-20 18:36 ` [PATCH v2 30/31] net: Convert default_device_ops Kirill Tkhai
2017-11-20 18:37 ` [PATCH v2 31/31] net: Convert diag_net_ops Kirill Tkhai
2017-12-04 15:54 ` [PATCH v2 00/31] Replacing net_mutex with rw_semaphore Kirill Tkhai
2017-12-04 16:10   ` David Miller
2017-12-04 16:11     ` Kirill Tkhai
2018-01-18 17:43 ` Andrei Vagin
2018-01-19  8:25   ` Kirill Tkhai
