* [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths
@ 2024-02-05 12:47 Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 01/15] net: add exit_batch_rtnl() method Eric Dumazet
` (15 more replies)
0 siblings, 16 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
This series is inspired by recent syzbot reports hinting at RTNL and
workqueue abuses.
rtnl_lock() is unfair to (single threaded) cleanup_net(), because
many threads can cause contention on it.
This series adds a new (struct pernet_operations) method,
so that cleanup_net() can hold RTNL longer once it finally
acquires it.
It also factorizes unregister_netdevice_many(), to further
reduce stalls in cleanup_net().
v3: Dropped "net: convert default_device_exit_batch() to exit_batch_rtnl method".
Jakub (and KASAN) reported issues with bridge, but the root cause was in that patch:
default_device_exit_batch() is the catch-all method and includes the "lo" device dismantle.
v2: Addressed Antoine Tenart's feedback in
https://lore.kernel.org/netdev/170688415193.5216.10499830272732622816@kwain/
- Added bond_net_pre_exit() method to make sure bond_destroy_sysfs()
is called before we unregister the devices in bond_net_exit_batch_rtnl()
Eric Dumazet (15):
net: add exit_batch_rtnl() method
nexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method
bareudp: use exit_batch_rtnl() method
bonding: use exit_batch_rtnl() method
geneve: use exit_batch_rtnl() method
gtp: use exit_batch_rtnl() method
ipv4: add __unregister_nexthop_notifier()
vxlan: use exit_batch_rtnl() method
ip6_gre: use exit_batch_rtnl() method
ip6_tunnel: use exit_batch_rtnl() method
ip6_vti: use exit_batch_rtnl() method
sit: use exit_batch_rtnl() method
ip_tunnel: use exit_batch_rtnl() method
bridge: use exit_batch_rtnl() method
xfrm: interface: use exit_batch_rtnl() method
drivers/net/bareudp.c | 13 ++++-------
drivers/net/bonding/bond_main.c | 37 ++++++++++++++++++++++----------
drivers/net/geneve.c | 13 ++++-------
drivers/net/gtp.c | 20 ++++++++---------
drivers/net/vxlan/vxlan_core.c | 21 ++++++++++--------
include/net/ip_tunnels.h | 3 ++-
include/net/net_namespace.h | 3 +++
include/net/nexthop.h | 1 +
net/bridge/br.c | 15 +++++--------
net/core/net_namespace.c | 31 ++++++++++++++++++++++++++-
net/ipv4/ip_gre.c | 24 +++++++++++++--------
net/ipv4/ip_tunnel.c | 10 ++++-----
net/ipv4/ip_vti.c | 8 ++++---
net/ipv4/ipip.c | 8 ++++---
net/ipv4/nexthop.c | 38 ++++++++++++++++++++++-----------
net/ipv6/ip6_gre.c | 12 +++++------
net/ipv6/ip6_tunnel.c | 12 +++++------
net/ipv6/ip6_vti.c | 12 +++++------
net/ipv6/sit.c | 13 +++++------
net/xfrm/xfrm_interface_core.c | 14 ++++++------
20 files changed, 177 insertions(+), 131 deletions(-)
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 01/15] net: add exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 02/15] nexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method Eric Dumazet
` (14 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
Many (struct pernet_operations)->exit_batch() methods have
to acquire rtnl.
Under rtnl mutex pressure, this makes cleanup_net()
very slow.
This patch adds a new exit_batch_rtnl() method to reduce
the number of rtnl acquisitions from cleanup_net().
exit_batch_rtnl() handlers are called while rtnl is locked,
and devices to be killed can be queued in a list provided
as their second argument.
A single unregister_netdevice_many() is called right
before rtnl is released.
exit_batch_rtnl() handlers are called before ->exit() and
->exit_batch() handlers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/net/net_namespace.h | 3 +++
net/core/net_namespace.c | 31 ++++++++++++++++++++++++++++++-
2 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index cd0c2eedbb5e9ddcbd5e0a37e2eb7e0cf57495d5..20c34bd7a07783a9a13696fd74b41eff1ff860a8 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -448,6 +448,9 @@ struct pernet_operations {
void (*pre_exit)(struct net *net);
void (*exit)(struct net *net);
void (*exit_batch)(struct list_head *net_exit_list);
+ /* Following method is called with RTNL held. */
+ void (*exit_batch_rtnl)(struct list_head *net_exit_list,
+ struct list_head *dev_kill_list);
unsigned int *id;
size_t size;
};
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 72799533426b6162256d7c4eef355af96c66e844..233ec0cdd0111d5ca21c6f8a66f4c1f3fbc4657b 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -318,8 +318,9 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
{
/* Must be called with pernet_ops_rwsem held */
const struct pernet_operations *ops, *saved_ops;
- int error = 0;
LIST_HEAD(net_exit_list);
+ LIST_HEAD(dev_kill_list);
+ int error = 0;
refcount_set(&net->ns.count, 1);
ref_tracker_dir_init(&net->refcnt_tracker, 128, "net refcnt");
@@ -357,6 +358,15 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
synchronize_rcu();
+ ops = saved_ops;
+ rtnl_lock();
+ list_for_each_entry_continue_reverse(ops, &pernet_list, list) {
+ if (ops->exit_batch_rtnl)
+ ops->exit_batch_rtnl(&net_exit_list, &dev_kill_list);
+ }
+ unregister_netdevice_many(&dev_kill_list);
+ rtnl_unlock();
+
ops = saved_ops;
list_for_each_entry_continue_reverse(ops, &pernet_list, list)
ops_exit_list(ops, &net_exit_list);
@@ -573,6 +583,7 @@ static void cleanup_net(struct work_struct *work)
struct net *net, *tmp, *last;
struct llist_node *net_kill_list;
LIST_HEAD(net_exit_list);
+ LIST_HEAD(dev_kill_list);
/* Atomically snapshot the list of namespaces to cleanup */
net_kill_list = llist_del_all(&cleanup_list);
@@ -613,6 +624,14 @@ static void cleanup_net(struct work_struct *work)
*/
synchronize_rcu();
+ rtnl_lock();
+ list_for_each_entry_reverse(ops, &pernet_list, list) {
+ if (ops->exit_batch_rtnl)
+ ops->exit_batch_rtnl(&net_exit_list, &dev_kill_list);
+ }
+ unregister_netdevice_many(&dev_kill_list);
+ rtnl_unlock();
+
/* Run all of the network namespace exit methods */
list_for_each_entry_reverse(ops, &pernet_list, list)
ops_exit_list(ops, &net_exit_list);
@@ -1193,7 +1212,17 @@ static void free_exit_list(struct pernet_operations *ops, struct list_head *net_
{
ops_pre_exit_list(ops, net_exit_list);
synchronize_rcu();
+
+ if (ops->exit_batch_rtnl) {
+ LIST_HEAD(dev_kill_list);
+
+ rtnl_lock();
+ ops->exit_batch_rtnl(net_exit_list, &dev_kill_list);
+ unregister_netdevice_many(&dev_kill_list);
+ rtnl_unlock();
+ }
ops_exit_list(ops, net_exit_list);
+
ops_free_list(ops, net_exit_list);
}
--
2.43.0.594.gd9cf4e227d-goog
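The flow described in patch 01 can be sketched in user space. This is only a model under stated assumptions, not kernel code: the lock is a plain flag, the kill list is a counter, and every name (model_cleanup_net(), tunnel_a_exit_batch_rtnl(), and so on) is hypothetical. It shows the shape of the new loop: take the lock once, let every handler queue devices, then issue a single unregister call before unlocking.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model, not kernel code: every name below is a stand-in. */
static int rtnl_held;             /* is the model RTNL currently held? */
static int rtnl_acquisitions;     /* how many times the model lock was taken */
static int unregister_many_calls; /* calls to the model unregister helper */
static int devices_unregistered;

struct dev_kill_list { int queued; };

struct pernet_ops_model {
	/* Called with the model RTNL held; may queue devices on kill. */
	void (*exit_batch_rtnl)(int nr_netns, struct dev_kill_list *kill);
};

static void model_rtnl_lock(void)
{
	assert(!rtnl_held);
	rtnl_held = 1;
	rtnl_acquisitions++;
}

static void model_rtnl_unlock(void)
{
	assert(rtnl_held);
	rtnl_held = 0;
}

static void model_unregister_many(struct dev_kill_list *kill)
{
	assert(rtnl_held);
	unregister_many_calls++;
	devices_unregistered += kill->queued;
	kill->queued = 0;
}

/* Two hypothetical subsystems, each owning one device per namespace. */
static void tunnel_a_exit_batch_rtnl(int nr_netns, struct dev_kill_list *kill)
{
	assert(rtnl_held);
	kill->queued += nr_netns;
}

static void tunnel_b_exit_batch_rtnl(int nr_netns, struct dev_kill_list *kill)
{
	assert(rtnl_held);
	kill->queued += nr_netns;
}

static const struct pernet_ops_model ops_list[] = {
	{ .exit_batch_rtnl = tunnel_a_exit_batch_rtnl },
	{ .exit_batch_rtnl = NULL },	/* subsystem without the new method */
	{ .exit_batch_rtnl = tunnel_b_exit_batch_rtnl },
};

static void model_cleanup_net(int nr_netns)
{
	struct dev_kill_list kill = { 0 };
	size_t i = sizeof(ops_list) / sizeof(ops_list[0]);

	model_rtnl_lock();
	while (i-- > 0) {	/* walk in reverse registration order */
		if (ops_list[i].exit_batch_rtnl)
			ops_list[i].exit_batch_rtnl(nr_netns, &kill);
	}
	model_unregister_many(&kill);	/* one call for all handlers */
	model_rtnl_unlock();
}
```

Calling model_cleanup_net(3) leaves rtnl_acquisitions and unregister_many_calls at 1 each, regardless of how many subsystems queued devices.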
* [PATCH v3 net-next 02/15] nexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 01/15] net: add exit_batch_rtnl() method Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 03/15] bareudp: use exit_batch_rtnl() method Eric Dumazet
` (13 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held.
This saves one rtnl_lock()/rtnl_unlock() pair.
We also need to create nexthop_net_exit()
to make sure net->nexthop.devhash is not freed too soon;
otherwise we would not be able to unregister netdevs
from exit_batch_rtnl() methods.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/nexthop.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index bbff68b5b5d4a1d835c9785fbe84f4cab32a1db0..7270a8631406c508eebf85c42eb29a5268d7d7cf 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -3737,16 +3737,20 @@ void nexthop_res_grp_activity_update(struct net *net, u32 id, u16 num_buckets,
}
EXPORT_SYMBOL(nexthop_res_grp_activity_update);
-static void __net_exit nexthop_net_exit_batch(struct list_head *net_list)
+static void __net_exit nexthop_net_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_to_kill)
{
struct net *net;
- rtnl_lock();
- list_for_each_entry(net, net_list, exit_list) {
+ ASSERT_RTNL();
+ list_for_each_entry(net, net_list, exit_list)
flush_all_nexthops(net);
- kfree(net->nexthop.devhash);
- }
- rtnl_unlock();
+}
+
+static void __net_exit nexthop_net_exit(struct net *net)
+{
+ kfree(net->nexthop.devhash);
+ net->nexthop.devhash = NULL;
}
static int __net_init nexthop_net_init(struct net *net)
@@ -3764,7 +3768,8 @@ static int __net_init nexthop_net_init(struct net *net)
static struct pernet_operations nexthop_net_ops = {
.init = nexthop_net_init,
- .exit_batch = nexthop_net_exit_batch,
+ .exit = nexthop_net_exit,
+ .exit_batch_rtnl = nexthop_net_exit_batch_rtnl,
};
static int __init nexthop_init(void)
--
2.43.0.594.gd9cf4e227d-goog
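The split between nexthop_net_exit_batch_rtnl() and nexthop_net_exit() is a two-phase teardown: objects are flushed first, and the lookup table is freed only after device unregistration is done. A minimal user-space sketch of that ordering follows. It assumes that the unregister path still indexes the table, which is how the commit message reads. All names (struct netns_model, model_unregister_dev(), MODEL_HASH_SIZE) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_HASH_SIZE 8

/* User-space stand-in for the per-netns nexthop state. */
struct netns_model {
	int *devhash;	/* models net->nexthop.devhash: entries per bucket */
};

static int model_netns_init(struct netns_model *net)
{
	net->devhash = calloc(MODEL_HASH_SIZE, sizeof(*net->devhash));
	return net->devhash ? 0 : -1;
}

static void model_add_nexthop(struct netns_model *net, unsigned int ifindex)
{
	net->devhash[ifindex % MODEL_HASH_SIZE]++;
}

/* Phase 1 (RTNL held in the real code): drop entries, keep the table. */
static void model_exit_batch_rtnl(struct netns_model *net)
{
	for (int i = 0; i < MODEL_HASH_SIZE; i++)
		net->devhash[i] = 0;
}

/*
 * Device unregistration still looks at the table in this model.
 * Returns the number of entries left for the device, or -1 if the
 * table was already freed.
 */
static int model_unregister_dev(const struct netns_model *net,
				unsigned int ifindex)
{
	if (!net->devhash)
		return -1;
	return net->devhash[ifindex % MODEL_HASH_SIZE];
}

/* Phase 2: runs after the devices are gone, as ->exit() does. */
static void model_exit(struct netns_model *net)
{
	free(net->devhash);
	net->devhash = NULL;
}
```

Freeing the table in phase 1, as the old nexthop_net_exit_batch() did, would make model_unregister_dev() hit the -1 branch when called between the two phases.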
* [PATCH v3 net-next 03/15] bareudp: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 01/15] net: add exit_batch_rtnl() method Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 02/15] nexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 04/15] bonding: " Eric Dumazet
` (12 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
drivers/net/bareudp.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c
index 31377bb1cc97cba08e02dc7d48761068627af3fb..4db6122c9b43032a36b98916bb4390e3d6f08f68 100644
--- a/drivers/net/bareudp.c
+++ b/drivers/net/bareudp.c
@@ -760,23 +760,18 @@ static void bareudp_destroy_tunnels(struct net *net, struct list_head *head)
unregister_netdevice_queue(bareudp->dev, head);
}
-static void __net_exit bareudp_exit_batch_net(struct list_head *net_list)
+static void __net_exit bareudp_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_kill_list)
{
struct net *net;
- LIST_HEAD(list);
- rtnl_lock();
list_for_each_entry(net, net_list, exit_list)
- bareudp_destroy_tunnels(net, &list);
-
- /* unregister the devices gathered above */
- unregister_netdevice_many(&list);
- rtnl_unlock();
+ bareudp_destroy_tunnels(net, dev_kill_list);
}
static struct pernet_operations bareudp_net_ops = {
.init = bareudp_init_net,
- .exit_batch = bareudp_exit_batch_net,
+ .exit_batch_rtnl = bareudp_exit_batch_rtnl,
.id = &bareudp_net_id,
.size = sizeof(struct bareudp_net),
};
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 04/15] bonding: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (2 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 03/15] bareudp: use exit_batch_rtnl() method Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 05/15] geneve: " Eric Dumazet
` (11 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet, Jay Vosburgh,
Andy Gospodarek
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.
v2: Added bond_net_pre_exit() method to make sure bond_destroy_sysfs()
is called before we unregister the devices in bond_net_exit_batch_rtnl
(Antoine Tenart : https://lore.kernel.org/netdev/170688415193.5216.10499830272732622816@kwain/)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Antoine Tenart <atenart@kernel.org>
Acked-by: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
---
drivers/net/bonding/bond_main.c | 37 +++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 11 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 4e0600c7b050f21c82a8862e224bb055e95d5039..a5e3d000ebd85c09beba379a2e6a7f69a0fd4c88 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -6415,28 +6415,41 @@ static int __net_init bond_net_init(struct net *net)
return 0;
}
-static void __net_exit bond_net_exit_batch(struct list_head *net_list)
+/* According to commit 69b0216ac255 ("bonding: fix bonding_masters
+ * race condition in bond unloading") we need to remove sysfs files
+ * before we remove our devices (done later in bond_net_exit_batch_rtnl())
+ */
+static void __net_exit bond_net_pre_exit(struct net *net)
+{
+ struct bond_net *bn = net_generic(net, bond_net_id);
+
+ bond_destroy_sysfs(bn);
+}
+
+static void __net_exit bond_net_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_kill_list)
{
struct bond_net *bn;
struct net *net;
- LIST_HEAD(list);
-
- list_for_each_entry(net, net_list, exit_list) {
- bn = net_generic(net, bond_net_id);
- bond_destroy_sysfs(bn);
- }
/* Kill off any bonds created after unregistering bond rtnl ops */
- rtnl_lock();
list_for_each_entry(net, net_list, exit_list) {
struct bonding *bond, *tmp_bond;
bn = net_generic(net, bond_net_id);
list_for_each_entry_safe(bond, tmp_bond, &bn->dev_list, bond_list)
- unregister_netdevice_queue(bond->dev, &list);
+ unregister_netdevice_queue(bond->dev, dev_kill_list);
}
- unregister_netdevice_many(&list);
- rtnl_unlock();
+}
+
+/* According to commit 23fa5c2caae0 ("bonding: destroy proc directory
+ * only after all bonds are gone") bond_destroy_proc_dir() is called
+ * after bond_net_exit_batch_rtnl() has completed.
+ */
+static void __net_exit bond_net_exit_batch(struct list_head *net_list)
+{
+ struct bond_net *bn;
+ struct net *net;
list_for_each_entry(net, net_list, exit_list) {
bn = net_generic(net, bond_net_id);
@@ -6446,6 +6459,8 @@ static void __net_exit bond_net_exit_batch(struct list_head *net_list)
static struct pernet_operations bond_net_ops = {
.init = bond_net_init,
+ .pre_exit = bond_net_pre_exit,
+ .exit_batch_rtnl = bond_net_exit_batch_rtnl,
.exit_batch = bond_net_exit_batch,
.id = &bond_net_id,
.size = sizeof(struct bond_net),
--
2.43.0.594.gd9cf4e227d-goog
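The bonding conversion relies on the order in which the core invokes the three hooks. The sketch below models that order in user space, following the sequence visible in the free_exit_list() hunk of patch 01: pre_exit, then exit_batch_rtnl, then exit_batch. The trace strings and all function names are hypothetical stand-ins for bond_destroy_sysfs(), the device unregistration, and bond_destroy_proc_dir().

```c
#include <assert.h>
#include <string.h>

/* User-space model only; the step names are illustrative. */
static char trace[64];

static void trace_add(const char *step)
{
	/* Bounded appends so the model never overruns the buffer. */
	strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

static void bond_model_pre_exit(void)        { trace_add("sysfs"); }
static void bond_model_exit_batch_rtnl(void) { trace_add("devices"); }
static void bond_model_exit_batch(void)      { trace_add("proc"); }

struct pernet_ops_model {
	void (*pre_exit)(void);
	void (*exit_batch_rtnl)(void);
	void (*exit_batch)(void);
};

static const struct pernet_ops_model bond_ops_model = {
	.pre_exit        = bond_model_pre_exit,
	.exit_batch_rtnl = bond_model_exit_batch_rtnl,
	.exit_batch      = bond_model_exit_batch,
};

static void model_teardown(const struct pernet_ops_model *ops)
{
	assert(ops != NULL);

	if (ops->pre_exit)
		ops->pre_exit();
	/* The real code waits for an RCU grace period here. */
	if (ops->exit_batch_rtnl)
		ops->exit_batch_rtnl();	/* real code: RTNL held around this */
	if (ops->exit_batch)
		ops->exit_batch();
}
```

After model_teardown(&bond_ops_model), the trace buffer holds the three steps in the order required by the two commits cited in the patch comments: sysfs files first, devices second, proc directory last.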
* [PATCH v3 net-next 05/15] geneve: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (3 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 04/15] bonding: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-06 9:19 ` Antoine Tenart
2024-02-05 12:47 ` [PATCH v3 net-next 06/15] gtp: " Eric Dumazet
` (10 subsequent siblings)
15 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
drivers/net/geneve.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 32c51c244153bd760b9f58001906c04c8b0f37ff..f31fc52ef397dfe0eba854385f783fbcad7e870f 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1900,18 +1900,13 @@ static void geneve_destroy_tunnels(struct net *net, struct list_head *head)
}
}
-static void __net_exit geneve_exit_batch_net(struct list_head *net_list)
+static void __net_exit geneve_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_to_kill)
{
struct net *net;
- LIST_HEAD(list);
- rtnl_lock();
list_for_each_entry(net, net_list, exit_list)
- geneve_destroy_tunnels(net, &list);
-
- /* unregister the devices gathered above */
- unregister_netdevice_many(&list);
- rtnl_unlock();
+ geneve_destroy_tunnels(net, dev_to_kill);
list_for_each_entry(net, net_list, exit_list) {
const struct geneve_net *gn = net_generic(net, geneve_net_id);
@@ -1922,7 +1917,7 @@ static void __net_exit geneve_exit_batch_net(struct list_head *net_list)
static struct pernet_operations geneve_net_ops = {
.init = geneve_init_net,
- .exit_batch = geneve_exit_batch_net,
+ .exit_batch_rtnl = geneve_exit_batch_rtnl,
.id = &geneve_net_id,
.size = sizeof(struct geneve_net),
};
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 06/15] gtp: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (4 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 05/15] geneve: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 07/15] ipv4: add __unregister_nexthop_notifier() Eric Dumazet
` (9 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call per netns.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
drivers/net/gtp.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index b1919278e931f4e9fb6b2d2ec2feb2193b2cda61..62c601d9f7528d456dc6695814bf01a4d756d2da 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -1876,23 +1876,23 @@ static int __net_init gtp_net_init(struct net *net)
return 0;
}
-static void __net_exit gtp_net_exit(struct net *net)
+static void __net_exit gtp_net_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_to_kill)
{
- struct gtp_net *gn = net_generic(net, gtp_net_id);
- struct gtp_dev *gtp;
- LIST_HEAD(list);
+ struct net *net;
- rtnl_lock();
- list_for_each_entry(gtp, &gn->gtp_dev_list, list)
- gtp_dellink(gtp->dev, &list);
+ list_for_each_entry(net, net_list, exit_list) {
+ struct gtp_net *gn = net_generic(net, gtp_net_id);
+ struct gtp_dev *gtp;
- unregister_netdevice_many(&list);
- rtnl_unlock();
+ list_for_each_entry(gtp, &gn->gtp_dev_list, list)
+ gtp_dellink(gtp->dev, dev_to_kill);
+ }
}
static struct pernet_operations gtp_net_ops = {
.init = gtp_net_init,
- .exit = gtp_net_exit,
+ .exit_batch_rtnl = gtp_net_exit_batch_rtnl,
.id = &gtp_net_id,
.size = sizeof(struct gtp_net),
};
--
2.43.0.594.gd9cf4e227d-goog
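The gtp case differs from the other conversions because the old code used a per-netns ->exit() hook, so the saving scales with the number of namespaces being dismantled. A small user-space model of the two shapes is given below. The counters and function names are hypothetical, and the lock is represented only by a counter.

```c
#include <assert.h>

/* User-space model: counters stand in for RTNL and the unregister call. */
static int lock_acquisitions;
static int unregister_many_calls;

static void model_reset(void)
{
	lock_acquisitions = 0;
	unregister_many_calls = 0;
}

/* Old shape: one ->exit() call per netns, each with its own lock pair. */
static int per_netns_exit(const int *devs, int nr_netns)
{
	int total = 0;

	for (int i = 0; i < nr_netns; i++) {
		int local_list;

		lock_acquisitions++;		/* rtnl_lock() */
		local_list = devs[i];		/* queue this netns's devices */
		unregister_many_calls++;	/* flush the local list */
		total += local_list;
	}
	return total;
}

/*
 * New shape: the caller takes the lock once and passes a shared list.
 * The handler only queues; the caller flushes the list once.
 */
static int batched_exit(const int *devs, int nr_netns)
{
	int kill_list = 0;

	assert(nr_netns >= 0);
	lock_acquisitions++;			/* taken by the core, once */
	for (int i = 0; i < nr_netns; i++)
		kill_list += devs[i];
	unregister_many_calls++;		/* issued by the core, once */
	return kill_list;
}
```

With three namespaces, the first function records three lock acquisitions and three unregister calls. The second records one of each while retiring the same number of devices.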
* [PATCH v3 net-next 07/15] ipv4: add __unregister_nexthop_notifier()
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (5 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 06/15] gtp: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 08/15] vxlan: use exit_batch_rtnl() method Eric Dumazet
` (8 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
unregister_nexthop_notifier() assumes the caller does not hold rtnl.
The following patch needs to call it from a context
that already holds rtnl.
Add __unregister_nexthop_notifier().
unregister_nexthop_notifier() becomes a wrapper.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/net/nexthop.h | 1 +
net/ipv4/nexthop.c | 19 +++++++++++++------
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/include/net/nexthop.h b/include/net/nexthop.h
index d92046a4a078250eec528f3cb2c3ab557decad03..6647ad509faa02a9a13d58f3405c4a540abc5077 100644
--- a/include/net/nexthop.h
+++ b/include/net/nexthop.h
@@ -218,6 +218,7 @@ struct nh_notifier_info {
int register_nexthop_notifier(struct net *net, struct notifier_block *nb,
struct netlink_ext_ack *extack);
+int __unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
void nexthop_set_hw_flags(struct net *net, u32 id, bool offload, bool trap);
void nexthop_bucket_set_hw_flags(struct net *net, u32 id, u16 bucket_index,
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 7270a8631406c508eebf85c42eb29a5268d7d7cf..70509da4f0806d25b3707835c08888d5e57b782e 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -3631,17 +3631,24 @@ int register_nexthop_notifier(struct net *net, struct notifier_block *nb,
}
EXPORT_SYMBOL(register_nexthop_notifier);
-int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
+int __unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
{
int err;
- rtnl_lock();
err = blocking_notifier_chain_unregister(&net->nexthop.notifier_chain,
nb);
- if (err)
- goto unlock;
- nexthops_dump(net, nb, NEXTHOP_EVENT_DEL, NULL);
-unlock:
+ if (!err)
+ nexthops_dump(net, nb, NEXTHOP_EVENT_DEL, NULL);
+ return err;
+}
+EXPORT_SYMBOL(__unregister_nexthop_notifier);
+
+int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
+{
+ int err;
+
+ rtnl_lock();
+ err = __unregister_nexthop_notifier(net, nb);
rtnl_unlock();
return err;
}
--
2.43.0.594.gd9cf4e227d-goog
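Patch 07 applies the common locked-helper-plus-wrapper pattern. A user-space sketch of that pattern follows, with a flag standing in for rtnl. The names are hypothetical. The kernel uses a leading double underscore for the lock-held variant, but that prefix is reserved in ordinary C programs, so the model uses a _locked suffix instead. The -2 return value is an arbitrary error placeholder.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model: a flag stands in for the rtnl mutex. */
static int lock_held;
static int lock_acquisitions;

static void model_lock(void)
{
	assert(!lock_held);
	lock_held = 1;
	lock_acquisitions++;
}

static void model_unlock(void)
{
	assert(lock_held);
	lock_held = 0;
}

struct notifier_model {
	int registered;
	int del_events;	/* models the replay of DEL events on unregister */
};

/* Variant for callers that already hold the lock. */
static int model_unregister_locked(struct notifier_model *nb)
{
	assert(lock_held);
	if (!nb->registered)
		return -2;	/* arbitrary error code for "not registered" */
	nb->registered = 0;
	nb->del_events++;
	return 0;
}

/* Public variant: a thin wrapper that takes and releases the lock. */
static int model_unregister(struct notifier_model *nb)
{
	int err;

	model_lock();
	err = model_unregister_locked(nb);
	model_unlock();
	return err;
}
```

A caller that already holds the lock, such as an exit_batch_rtnl() handler, uses the _locked variant directly. Calling the wrapper from that context would trip the assertion in model_lock(), which mirrors the self-deadlock the real code would hit.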
* [PATCH v3 net-next 08/15] vxlan: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (6 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 07/15] ipv4: add __unregister_nexthop_notifier() Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 09/15] ip6_gre: " Eric Dumazet
` (7 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
drivers/net/vxlan/vxlan_core.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 16106e088c6301d3aaa47dd73985107945735b6e..df664de4b2b6cc361363b804e7ad531d59e2cdfa 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -4846,23 +4846,25 @@ static void vxlan_destroy_tunnels(struct net *net, struct list_head *head)
}
-static void __net_exit vxlan_exit_batch_net(struct list_head *net_list)
+static void __net_exit vxlan_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_to_kill)
{
struct net *net;
- LIST_HEAD(list);
- unsigned int h;
+ ASSERT_RTNL();
list_for_each_entry(net, net_list, exit_list) {
struct vxlan_net *vn = net_generic(net, vxlan_net_id);
- unregister_nexthop_notifier(net, &vn->nexthop_notifier_block);
+ __unregister_nexthop_notifier(net, &vn->nexthop_notifier_block);
+
+ vxlan_destroy_tunnels(net, dev_to_kill);
}
- rtnl_lock();
- list_for_each_entry(net, net_list, exit_list)
- vxlan_destroy_tunnels(net, &list);
+}
- unregister_netdevice_many(&list);
- rtnl_unlock();
+static void __net_exit vxlan_exit_batch_net(struct list_head *net_list)
+{
+ struct net *net;
+ unsigned int h;
list_for_each_entry(net, net_list, exit_list) {
struct vxlan_net *vn = net_generic(net, vxlan_net_id);
@@ -4875,6 +4877,7 @@ static void __net_exit vxlan_exit_batch_net(struct list_head *net_list)
static struct pernet_operations vxlan_net_ops = {
.init = vxlan_init_net,
.exit_batch = vxlan_exit_batch_net,
+ .exit_batch_rtnl = vxlan_exit_batch_rtnl,
.id = &vxlan_net_id,
.size = sizeof(struct vxlan_net),
};
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 09/15] ip6_gre: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (7 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 08/15] vxlan: use exit_batch_rtnl() method Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 10/15] ip6_tunnel: " Eric Dumazet
` (6 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv6/ip6_gre.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 070d87abf7c0284aa23043391aab080534e144a7..428f03e9da45ac323aa357b5a9d299fb7f3d3a5b 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -1632,21 +1632,19 @@ static int __net_init ip6gre_init_net(struct net *net)
return err;
}
-static void __net_exit ip6gre_exit_batch_net(struct list_head *net_list)
+static void __net_exit ip6gre_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_to_kill)
{
struct net *net;
- LIST_HEAD(list);
- rtnl_lock();
+ ASSERT_RTNL();
list_for_each_entry(net, net_list, exit_list)
- ip6gre_destroy_tunnels(net, &list);
- unregister_netdevice_many(&list);
- rtnl_unlock();
+ ip6gre_destroy_tunnels(net, dev_to_kill);
}
static struct pernet_operations ip6gre_net_ops = {
.init = ip6gre_init_net,
- .exit_batch = ip6gre_exit_batch_net,
+ .exit_batch_rtnl = ip6gre_exit_batch_rtnl,
.id = &ip6gre_net_id,
.size = sizeof(struct ip6gre_net),
};
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 10/15] ip6_tunnel: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (8 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 09/15] ip6_gre: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 11/15] ip6_vti: " Eric Dumazet
` (5 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv6/ip6_tunnel.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 9bbabf750a21e251d4e8f9e3059c707505f5ce32..bfb0a6c601c119cc38901998c47d0c98be047d90 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -2282,21 +2282,19 @@ static int __net_init ip6_tnl_init_net(struct net *net)
return err;
}
-static void __net_exit ip6_tnl_exit_batch_net(struct list_head *net_list)
+static void __net_exit ip6_tnl_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_to_kill)
{
struct net *net;
- LIST_HEAD(list);
- rtnl_lock();
+ ASSERT_RTNL();
list_for_each_entry(net, net_list, exit_list)
- ip6_tnl_destroy_tunnels(net, &list);
- unregister_netdevice_many(&list);
- rtnl_unlock();
+ ip6_tnl_destroy_tunnels(net, dev_to_kill);
}
static struct pernet_operations ip6_tnl_net_ops = {
.init = ip6_tnl_init_net,
- .exit_batch = ip6_tnl_exit_batch_net,
+ .exit_batch_rtnl = ip6_tnl_exit_batch_rtnl,
.id = &ip6_tnl_net_id,
.size = sizeof(struct ip6_tnl_net),
};
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 11/15] ip6_vti: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (9 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 10/15] ip6_tunnel: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 12/15] sit: " Eric Dumazet
` (4 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv6/ip6_vti.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index e550240c85e1c9f2fe2b835e903de28e1f08b3bc..cfe1b1ad4d85d303597784d5eeb3077383978d95 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -1174,24 +1174,22 @@ static int __net_init vti6_init_net(struct net *net)
return err;
}
-static void __net_exit vti6_exit_batch_net(struct list_head *net_list)
+static void __net_exit vti6_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_to_kill)
{
struct vti6_net *ip6n;
struct net *net;
- LIST_HEAD(list);
- rtnl_lock();
+ ASSERT_RTNL();
list_for_each_entry(net, net_list, exit_list) {
ip6n = net_generic(net, vti6_net_id);
- vti6_destroy_tunnels(ip6n, &list);
+ vti6_destroy_tunnels(ip6n, dev_to_kill);
}
- unregister_netdevice_many(&list);
- rtnl_unlock();
}
static struct pernet_operations vti6_net_ops = {
.init = vti6_init_net,
- .exit_batch = vti6_exit_batch_net,
+ .exit_batch_rtnl = vti6_exit_batch_rtnl,
.id = &vti6_net_id,
.size = sizeof(struct vti6_net),
};
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 12/15] sit: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (10 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 11/15] ip6_vti: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 13/15] ip_tunnel: " Eric Dumazet
` (3 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv6/sit.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index cc24cefdb85c0944c03c019b1c4214302d18e2c8..61b2b71fa8bedea6d185348ff781356652434b33 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1875,22 +1875,19 @@ static int __net_init sit_init_net(struct net *net)
 	return err;
 }
 
-static void __net_exit sit_exit_batch_net(struct list_head *net_list)
+static void __net_exit sit_exit_batch_rtnl(struct list_head *net_list,
+					   struct list_head *dev_to_kill)
 {
-	LIST_HEAD(list);
 	struct net *net;
 
-	rtnl_lock();
+	ASSERT_RTNL();
 	list_for_each_entry(net, net_list, exit_list)
-		sit_destroy_tunnels(net, &list);
-
-	unregister_netdevice_many(&list);
-	rtnl_unlock();
+		sit_destroy_tunnels(net, dev_to_kill);
 }
 
 static struct pernet_operations sit_net_ops = {
 	.init = sit_init_net,
-	.exit_batch = sit_exit_batch_net,
+	.exit_batch_rtnl = sit_exit_batch_rtnl,
 	.id = &sit_net_id,
 	.size = sizeof(struct sit_net),
 };
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 13/15] ip_tunnel: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (11 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 12/15] sit: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 14/15] bridge: " Eric Dumazet
` (2 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.
This patch takes care of ipip, ip_vti, and ip_gre tunnels.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/net/ip_tunnels.h | 3 ++-
net/ipv4/ip_gre.c | 24 +++++++++++++++---------
net/ipv4/ip_tunnel.c | 10 ++++------
net/ipv4/ip_vti.c | 8 +++++---
net/ipv4/ipip.c | 8 +++++---
5 files changed, 31 insertions(+), 22 deletions(-)
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index 2d746f4c9a0a4792bc16971c107d598190897433..5cd64bb2104df389250fb3c518ba00a3826c53f7 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -284,7 +284,8 @@ int ip_tunnel_init_net(struct net *net, unsigned int ip_tnl_net_id,
 		       struct rtnl_link_ops *ops, char *devname);
 
 void ip_tunnel_delete_nets(struct list_head *list_net, unsigned int id,
-			   struct rtnl_link_ops *ops);
+			   struct rtnl_link_ops *ops,
+			   struct list_head *dev_to_kill);
 
 void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
 		    const struct iphdr *tnl_params, const u8 protocol);
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 5169c3c72cffe49cef613e69889d139db867ff74..aad5125b7a65ecc770f1b962ac5b417bd931e3ba 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -1025,14 +1025,16 @@ static int __net_init ipgre_init_net(struct net *net)
 	return ip_tunnel_init_net(net, ipgre_net_id, &ipgre_link_ops, NULL);
 }
 
-static void __net_exit ipgre_exit_batch_net(struct list_head *list_net)
+static void __net_exit ipgre_exit_batch_rtnl(struct list_head *list_net,
+					     struct list_head *dev_to_kill)
 {
-	ip_tunnel_delete_nets(list_net, ipgre_net_id, &ipgre_link_ops);
+	ip_tunnel_delete_nets(list_net, ipgre_net_id, &ipgre_link_ops,
+			      dev_to_kill);
 }
 
 static struct pernet_operations ipgre_net_ops = {
 	.init = ipgre_init_net,
-	.exit_batch = ipgre_exit_batch_net,
+	.exit_batch_rtnl = ipgre_exit_batch_rtnl,
 	.id = &ipgre_net_id,
 	.size = sizeof(struct ip_tunnel_net),
 };
@@ -1697,14 +1699,16 @@ static int __net_init ipgre_tap_init_net(struct net *net)
 	return ip_tunnel_init_net(net, gre_tap_net_id, &ipgre_tap_ops, "gretap0");
 }
 
-static void __net_exit ipgre_tap_exit_batch_net(struct list_head *list_net)
+static void __net_exit ipgre_tap_exit_batch_rtnl(struct list_head *list_net,
+						 struct list_head *dev_to_kill)
 {
-	ip_tunnel_delete_nets(list_net, gre_tap_net_id, &ipgre_tap_ops);
+	ip_tunnel_delete_nets(list_net, gre_tap_net_id, &ipgre_tap_ops,
+			      dev_to_kill);
 }
 
 static struct pernet_operations ipgre_tap_net_ops = {
 	.init = ipgre_tap_init_net,
-	.exit_batch = ipgre_tap_exit_batch_net,
+	.exit_batch_rtnl = ipgre_tap_exit_batch_rtnl,
 	.id = &gre_tap_net_id,
 	.size = sizeof(struct ip_tunnel_net),
 };
@@ -1715,14 +1719,16 @@ static int __net_init erspan_init_net(struct net *net)
 				  &erspan_link_ops, "erspan0");
 }
 
-static void __net_exit erspan_exit_batch_net(struct list_head *net_list)
+static void __net_exit erspan_exit_batch_rtnl(struct list_head *net_list,
+					      struct list_head *dev_to_kill)
 {
-	ip_tunnel_delete_nets(net_list, erspan_net_id, &erspan_link_ops);
+	ip_tunnel_delete_nets(net_list, erspan_net_id, &erspan_link_ops,
+			      dev_to_kill);
 }
 
 static struct pernet_operations erspan_net_ops = {
 	.init = erspan_init_net,
-	.exit_batch = erspan_exit_batch_net,
+	.exit_batch_rtnl = erspan_exit_batch_rtnl,
 	.id = &erspan_net_id,
 	.size = sizeof(struct ip_tunnel_net),
 };
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index beeae624c412d752bd5ee5d459a88f57640445e9..00da0b80320fb514bca58de7cd13894ab49a2ca6 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -1130,19 +1130,17 @@ static void ip_tunnel_destroy(struct net *net, struct ip_tunnel_net *itn,
 }
 
 void ip_tunnel_delete_nets(struct list_head *net_list, unsigned int id,
-			   struct rtnl_link_ops *ops)
+			   struct rtnl_link_ops *ops,
+			   struct list_head *dev_to_kill)
 {
 	struct ip_tunnel_net *itn;
 	struct net *net;
-	LIST_HEAD(list);
 
-	rtnl_lock();
+	ASSERT_RTNL();
 	list_for_each_entry(net, net_list, exit_list) {
 		itn = net_generic(net, id);
-		ip_tunnel_destroy(net, itn, &list, ops);
+		ip_tunnel_destroy(net, itn, dev_to_kill, ops);
 	}
-	unregister_netdevice_many(&list);
-	rtnl_unlock();
 }
 EXPORT_SYMBOL_GPL(ip_tunnel_delete_nets);
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 9ab9b3ebe0cd1a9e95f489d98c5a3d89c7c0edf6..fb1f52d2131128a39ab5bf0482359b7b75989fb6 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -510,14 +510,16 @@ static int __net_init vti_init_net(struct net *net)
 	return 0;
 }
 
-static void __net_exit vti_exit_batch_net(struct list_head *list_net)
+static void __net_exit vti_exit_batch_rtnl(struct list_head *list_net,
+					   struct list_head *dev_to_kill)
 {
-	ip_tunnel_delete_nets(list_net, vti_net_id, &vti_link_ops);
+	ip_tunnel_delete_nets(list_net, vti_net_id, &vti_link_ops,
+			      dev_to_kill);
 }
 
 static struct pernet_operations vti_net_ops = {
 	.init = vti_init_net,
-	.exit_batch = vti_exit_batch_net,
+	.exit_batch_rtnl = vti_exit_batch_rtnl,
 	.id = &vti_net_id,
 	.size = sizeof(struct ip_tunnel_net),
 };
diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 27b8f83c6ea200314f41a29ecfea494b9ddef2ca..0151eea06cc50bec4ae64f08ca6a7161e3cbf9ae 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -592,14 +592,16 @@ static int __net_init ipip_init_net(struct net *net)
 	return ip_tunnel_init_net(net, ipip_net_id, &ipip_link_ops, "tunl0");
 }
 
-static void __net_exit ipip_exit_batch_net(struct list_head *list_net)
+static void __net_exit ipip_exit_batch_rtnl(struct list_head *list_net,
+					    struct list_head *dev_to_kill)
 {
-	ip_tunnel_delete_nets(list_net, ipip_net_id, &ipip_link_ops);
+	ip_tunnel_delete_nets(list_net, ipip_net_id, &ipip_link_ops,
+			      dev_to_kill);
 }
 
 static struct pernet_operations ipip_net_ops = {
 	.init = ipip_init_net,
-	.exit_batch = ipip_exit_batch_net,
+	.exit_batch_rtnl = ipip_exit_batch_rtnl,
 	.id = &ipip_net_id,
 	.size = sizeof(struct ip_tunnel_net),
 };
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 14/15] bridge: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (12 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 13/15] ip_tunnel: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 15/15] xfrm: interface: " Eric Dumazet
2024-02-06 9:15 ` [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Paolo Abeni
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/bridge/br.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/net/bridge/br.c b/net/bridge/br.c
index ac19b797dbece972f236211b9b286c298315df25..2cab878e0a39c99c10952be7d5c732a40c754655 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -356,26 +356,21 @@ void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on)
 		clear_bit(opt, &br->options);
 }
 
-static void __net_exit br_net_exit_batch(struct list_head *net_list)
+static void __net_exit br_net_exit_batch_rtnl(struct list_head *net_list,
+					      struct list_head *dev_to_kill)
 {
 	struct net_device *dev;
 	struct net *net;
-	LIST_HEAD(list);
-
-	rtnl_lock();
+	ASSERT_RTNL();
 
 	list_for_each_entry(net, net_list, exit_list)
 		for_each_netdev(net, dev)
 			if (netif_is_bridge_master(dev))
-				br_dev_delete(dev, &list);
-
-	unregister_netdevice_many(&list);
-
-	rtnl_unlock();
+				br_dev_delete(dev, dev_to_kill);
 }
 
 static struct pernet_operations br_net_ops = {
-	.exit_batch = br_net_exit_batch,
+	.exit_batch_rtnl = br_net_exit_batch_rtnl,
 };
 
 static const struct stp_proto br_stp_proto = {
--
2.43.0.594.gd9cf4e227d-goog
* [PATCH v3 net-next 15/15] xfrm: interface: use exit_batch_rtnl() method
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (13 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 14/15] bridge: " Eric Dumazet
@ 2024-02-05 12:47 ` Eric Dumazet
2024-02-06 9:15 ` [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Paolo Abeni
15 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-05 12:47 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Antoine Tenart, netdev, eric.dumazet, Eric Dumazet
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/xfrm/xfrm_interface_core.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/net/xfrm/xfrm_interface_core.c b/net/xfrm/xfrm_interface_core.c
index 21d50d75c26088063538d9b9da5cba93db181a1f..dafefef3cf51a79fd6701a8b78c3f8fcfd10615d 100644
--- a/net/xfrm/xfrm_interface_core.c
+++ b/net/xfrm/xfrm_interface_core.c
@@ -957,12 +957,12 @@ static struct rtnl_link_ops xfrmi_link_ops __read_mostly = {
 	.get_link_net = xfrmi_get_link_net,
 };
 
-static void __net_exit xfrmi_exit_batch_net(struct list_head *net_exit_list)
+static void __net_exit xfrmi_exit_batch_rtnl(struct list_head *net_exit_list,
+					     struct list_head *dev_to_kill)
 {
 	struct net *net;
-	LIST_HEAD(list);
 
-	rtnl_lock();
+	ASSERT_RTNL();
 	list_for_each_entry(net, net_exit_list, exit_list) {
 		struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
 		struct xfrm_if __rcu **xip;
@@ -973,18 +973,16 @@ static void __net_exit xfrmi_exit_batch_net(struct list_head *net_exit_list)
 			for (xip = &xfrmn->xfrmi[i];
 			     (xi = rtnl_dereference(*xip)) != NULL;
 			     xip = &xi->next)
-				unregister_netdevice_queue(xi->dev, &list);
+				unregister_netdevice_queue(xi->dev, dev_to_kill);
 		}
 		xi = rtnl_dereference(xfrmn->collect_md_xfrmi);
 		if (xi)
-			unregister_netdevice_queue(xi->dev, &list);
+			unregister_netdevice_queue(xi->dev, dev_to_kill);
 	}
-	unregister_netdevice_many(&list);
-	rtnl_unlock();
 }
 
 static struct pernet_operations xfrmi_net_ops = {
-	.exit_batch = xfrmi_exit_batch_net,
+	.exit_batch_rtnl = xfrmi_exit_batch_rtnl,
 	.id = &xfrmi_net_id,
 	.size = sizeof(struct xfrmi_net),
 };
--
2.43.0.594.gd9cf4e227d-goog
* Re: [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
` (14 preceding siblings ...)
2024-02-05 12:47 ` [PATCH v3 net-next 15/15] xfrm: interface: " Eric Dumazet
@ 2024-02-06 9:15 ` Paolo Abeni
2024-02-06 10:48 ` Eric Dumazet
15 siblings, 1 reply; 20+ messages in thread
From: Paolo Abeni @ 2024-02-06 9:15 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski
Cc: Antoine Tenart, netdev, eric.dumazet
On Mon, 2024-02-05 at 12:47 +0000, Eric Dumazet wrote:
> This series is inspired by recent syzbot reports hinting to RTNL and
> workqueue abuses.
>
> rtnl_lock() is unfair to (single threaded) cleanup_net(), because
> many threads can cause contention on it.
>
> This series adds a new (struct pernet_operations) method,
> so that cleanup_net() can hold RTNL longer once it finally
> acquires it.
>
> It also factorizes unregister_netdevice_many(), to further
> reduce stalls in cleanup_net().
>
> v3: Dropped "net: convert default_device_exit_batch() to exit_batch_rtnl method"
> Jakub (and KASAN) reported issues with bridge, but the root cause was with this patch.
> default_device_exit_batch() is the catch-all method, it includes "lo" device dismantle.
>
I *think* this still causes a KASAN splat in the CI with respect to
vxlan devices, e.g.:
https://netdev-3.bots.linux.dev/vmksft-net/results/453141/17-udpgro-fwd-sh/stdout
(at least, this series is the most eye-catching thing that landed in
the relevant batch)
Cheers,
Paolo
* Re: [PATCH v3 net-next 05/15] geneve: use exit_batch_rtnl() method
2024-02-05 12:47 ` [PATCH v3 net-next 05/15] geneve: " Eric Dumazet
@ 2024-02-06 9:19 ` Antoine Tenart
2024-02-06 10:22 ` Eric Dumazet
0 siblings, 1 reply; 20+ messages in thread
From: Antoine Tenart @ 2024-02-06 9:19 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, eric.dumazet, Eric Dumazet
Quoting Eric Dumazet (2024-02-05 13:47:42)
> -static void __net_exit geneve_exit_batch_net(struct list_head *net_list)
> +static void __net_exit geneve_exit_batch_rtnl(struct list_head *net_list,
> +					      struct list_head *dev_to_kill)
>  {
>  	struct net *net;
> -	LIST_HEAD(list);
>  
> -	rtnl_lock();
>  	list_for_each_entry(net, net_list, exit_list)
> -		geneve_destroy_tunnels(net, &list);
> -
> -	/* unregister the devices gathered above */
> -	unregister_netdevice_many(&list);
> -	rtnl_unlock();
> +		geneve_destroy_tunnels(net, dev_to_kill);
>  
>  	list_for_each_entry(net, net_list, exit_list) {
>  		const struct geneve_net *gn = net_generic(net, geneve_net_id);
Not shown in the diff here is:
WARN_ON_ONCE(!list_empty(&gn->sock_list));
I think this will be triggered as the above logic inverted two calls,
which are now,
1. WARN_ON_ONCE(...)
2. unregister_netdevice_many
But ->sock_list entries are removed in ndo_exit, called from
unregister_netdevice_many.
I guess the warning could be moved to exit_batch (or removed).
Thanks,
Antoine
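The ordering concern raised above can be illustrated with a small user-space model. This is a sketch only: the toy_* names are hypothetical, it assumes one listed socket per queued device, and it assumes the core flushes the queued devices after the exit_batch_rtnl() hook returns, as the message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-netns state: sockets stay listed until devices are flushed. */
struct toy_geneve_net {
	int listed_socks;	/* models entries on gn->sock_list */
	int queued_devs;	/* devices queued for unregistration */
};

/* Models the exit_batch_rtnl() hook: it only queues, nothing is released. */
static void toy_exit_batch_rtnl(struct toy_geneve_net *gn)
{
	gn->queued_devs = gn->listed_socks;
}

/* Models the later flush, which is where sockets leave the list. */
static void toy_flush(struct toy_geneve_net *gn)
{
	gn->listed_socks -= gn->queued_devs;
	gn->queued_devs = 0;
}

/* True when a WARN_ON_ONCE(!list_empty(...)) style check would fire. */
static bool toy_would_warn(const struct toy_geneve_net *gn)
{
	return gn->listed_socks != 0;
}
```

In this model a check placed right after toy_exit_batch_rtnl() fires, while the same check placed after toy_flush() stays quiet, which is why moving the warning to a later hook is one way out.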
* Re: [PATCH v3 net-next 05/15] geneve: use exit_batch_rtnl() method
2024-02-06 9:19 ` Antoine Tenart
@ 2024-02-06 10:22 ` Eric Dumazet
0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-06 10:22 UTC (permalink / raw)
To: Antoine Tenart
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, netdev, eric.dumazet
On Tue, Feb 6, 2024 at 10:19 AM Antoine Tenart <atenart@kernel.org> wrote:
>
> Quoting Eric Dumazet (2024-02-05 13:47:42)
> > -static void __net_exit geneve_exit_batch_net(struct list_head *net_list)
> > +static void __net_exit geneve_exit_batch_rtnl(struct list_head *net_list,
> > +					      struct list_head *dev_to_kill)
> >  {
> >  	struct net *net;
> > -	LIST_HEAD(list);
> >  
> > -	rtnl_lock();
> >  	list_for_each_entry(net, net_list, exit_list)
> > -		geneve_destroy_tunnels(net, &list);
> > -
> > -	/* unregister the devices gathered above */
> > -	unregister_netdevice_many(&list);
> > -	rtnl_unlock();
> > +		geneve_destroy_tunnels(net, dev_to_kill);
> >  
> >  	list_for_each_entry(net, net_list, exit_list) {
> >  		const struct geneve_net *gn = net_generic(net, geneve_net_id);
>
> Not shown in the diff here is:
>
> WARN_ON_ONCE(!list_empty(&gn->sock_list));
>
> I think this will be triggered as the above logic inverted two calls,
> which are now,
>
> 1. WARN_ON_ONCE(...)
> 2. unregister_netdevice_many
>
> But ->sock_list entries are removed in ndo_exit, called from
> unregister_netdevice_many.
>
> I guess the warning could be moved to exit_batch (or removed).
I will keep the warning, but move it, thanks [1]
Speaking of geneve, I think the synchronize_net() call from
geneve_sock_release() could easily be avoided.
[1] I will squash the following delta for v4 submission.
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index f31fc52ef397dfe0eba854385f783fbcad7e870f..23e97c2e4f6fcb90a8bbb117d7520397f560f15f 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1907,17 +1907,19 @@ static void __net_exit geneve_exit_batch_rtnl(struct list_head *net_list,
 
 	list_for_each_entry(net, net_list, exit_list)
 		geneve_destroy_tunnels(net, dev_to_kill);
+}
 
-	list_for_each_entry(net, net_list, exit_list) {
-		const struct geneve_net *gn = net_generic(net, geneve_net_id);
+static void __net_exit geneve_exit_net(struct net *net)
+{
+	const struct geneve_net *gn = net_generic(net, geneve_net_id);
 
-		WARN_ON_ONCE(!list_empty(&gn->sock_list));
-	}
+	WARN_ON_ONCE(!list_empty(&gn->sock_list));
 }
 
 static struct pernet_operations geneve_net_ops = {
 	.init = geneve_init_net,
 	.exit_batch_rtnl = geneve_exit_batch_rtnl,
+	.exit = geneve_exit_net,
 	.id = &geneve_net_id,
 	.size = sizeof(struct geneve_net),
 };
* Re: [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths
2024-02-06 9:15 ` [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Paolo Abeni
@ 2024-02-06 10:48 ` Eric Dumazet
0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-02-06 10:48 UTC (permalink / raw)
To: Paolo Abeni
Cc: David S . Miller, Jakub Kicinski, Antoine Tenart, netdev, eric.dumazet
On Tue, Feb 6, 2024 at 10:15 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2024-02-05 at 12:47 +0000, Eric Dumazet wrote:
> > This series is inspired by recent syzbot reports hinting to RTNL and
> > workqueue abuses.
> >
> > rtnl_lock() is unfair to (single threaded) cleanup_net(), because
> > many threads can cause contention on it.
> >
> > This series adds a new (struct pernet_operations) method,
> > so that cleanup_net() can hold RTNL longer once it finally
> > acquires it.
> >
> > It also factorizes unregister_netdevice_many(), to further
> > reduce stalls in cleanup_net().
> >
> > v3: Dropped "net: convert default_device_exit_batch() to exit_batch_rtnl method"
> > Jakub (and KASAN) reported issues with bridge, but the root cause was with this patch.
> > default_device_exit_batch() is the catch-all method, it includes "lo" device dismantle.
> >
>
> I *think* this still causes KASAN splat in the CI WRT vxlan devices,
> e.g.:
>
> https://netdev-3.bots.linux.dev/vmksft-net/results/453141/17-udpgro-fwd-sh/stdout
>
> (at least this series is the most eye catching thing that landed into
> the relevant batch)
>
Interesting... vxlan_destroy_tunnels() uses
unregister_netdevice_queue() instead of vxlan_dellink() :/
So vn->vxlan_list is not properly updated.
I think my patch exposes an old bug (vxlan depended on
default_device_exit_batch() being called before vxlan_exit_batch()).
I will fix it, thanks Paolo.
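The stale-list problem described above can be sketched in user space. Everything here is hypothetical (the toy_* names, the fixed-size arrays, the integer device ids); it is not the vxlan code, only a model of the difference between queueing a device for unregistration and unlinking it from a per-netns list first.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEVS 4

/* Hypothetical per-netns bookkeeping, standing in for vn->vxlan_list. */
struct toy_vxlan_net {
	int listed[TOY_MAX_DEVS];	/* device ids still on the list */
	size_t nlisted;
};

/* Hypothetical shared kill list filled while the lock is held. */
struct toy_kill_list {
	int ids[TOY_MAX_DEVS];
	size_t n;
};

/* Queue-only path: the per-netns list is left untouched. */
static void toy_queue_only(int id, struct toy_kill_list *kill)
{
	assert(kill->n < TOY_MAX_DEVS);
	kill->ids[kill->n++] = id;
}

/* dellink-like path: unlink from the per-netns list, then queue. */
static void toy_dellink(struct toy_vxlan_net *vn, int id,
			struct toy_kill_list *kill)
{
	for (size_t i = 0; i < vn->nlisted; i++) {
		if (vn->listed[i] == id) {
			/* Swap the last entry into the freed slot. */
			vn->nlisted--;
			vn->listed[i] = vn->listed[vn->nlisted];
			break;
		}
	}
	toy_queue_only(id, kill);
}
```

With the queue-only path the device ends up on the kill list while its id is still reachable through the per-netns list, so any later walk of that list would touch a device that is about to go away.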
Thread overview: 20+ messages
2024-02-05 12:47 [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 01/15] net: add exit_batch_rtnl() method Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 02/15] nexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 03/15] bareudp: use exit_batch_rtnl() method Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 04/15] bonding: " Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 05/15] geneve: " Eric Dumazet
2024-02-06 9:19 ` Antoine Tenart
2024-02-06 10:22 ` Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 06/15] gtp: " Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 07/15] ipv4: add __unregister_nexthop_notifier() Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 08/15] vxlan: use exit_batch_rtnl() method Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 09/15] ip6_gre: " Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 10/15] ip6_tunnel: " Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 11/15] ip6_vti: " Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 12/15] sit: " Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 13/15] ip_tunnel: " Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 14/15] bridge: " Eric Dumazet
2024-02-05 12:47 ` [PATCH v3 net-next 15/15] xfrm: interface: " Eric Dumazet
2024-02-06 9:15 ` [PATCH v3 net-next 00/15] net: more factorization in cleanup_net() paths Paolo Abeni
2024-02-06 10:48 ` Eric Dumazet