All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces
@ 2022-08-25  2:04 Andrey Zhadchenko
  2022-08-25  2:04 ` [PATCH net-next v3 1/2] " Andrey Zhadchenko
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Andrey Zhadchenko @ 2022-08-25  2:04 UTC (permalink / raw)
  To: netdev
  Cc: dev, pshelar, davem, edumazet, kuba, pabeni, ptikhomirov,
	alexander.mikhalitsyn, avagin, brauner, i.maximets, aconole

Hi!

CRIU currently do not support checkpoint/restore of OVS configurations, but
there was several requests for it. For example,
https://github.com/lxc/lxc/issues/2909

The main problem is ifindexes of newly created interfaces. We realy need to
preserve them after restore. Current openvswitch API does not allow to
specify ifindex. Most of the time we can just create an interface via
generic netlink requests and plug it into ovs but datapaths (generally any
OVS_VPORT_TYPE_INTERNAL) can only be created via openvswitch requests which
do not support selecting ifindex.

This patch allows to do so.
For new datapaths I decided to use dp_infindex in header as infindex
because it control ifindex for other requests too.
For internal vports I reused OVS_VPORT_ATTR_IFINDEX.

The only concern I have is that previously dp_ifindex was not used for
OVS_DP_VMD_NEW requests and some software may not set it to zero. However
we have been running this patch at Virtuozzo for 2 years and have not
encountered this problem. Not sure if it is worth to add new
ovs_datapath_attr instead.

v2:
Added two more patches.

Add OVS_DP_ATTR_PER_CPU_PIDS to dumps as suggested by Ilya Maximets.
Without it we won't be able to checkpoint/restore new openvswitch
configurations which use OVS_DP_F_DISPATCH_UPCALL_PER_CPU flag.

Found and fixed memory leak on datapath creation error path.

v3:
Sent memleak fix separately to net.
Improved patches according to the reviews:
 - Added new OVS_DP_ATTR_IFINDEX instead of using ovs_header->dp_ifindex
 - Pre-allocated bigger reply message for upcall pids
 - Some small fixes

Andrey Zhadchenko (2):
  openvswitch: allow specifying ifindex of new interfaces
  openvswitch: add OVS_DP_ATTR_PER_CPU_PIDS to get requests

 include/uapi/linux/openvswitch.h     |  3 +++
 net/openvswitch/datapath.c           | 21 ++++++++++++++++++---
 net/openvswitch/vport-internal_dev.c |  1 +
 net/openvswitch/vport.h              |  2 ++
 4 files changed, 24 insertions(+), 3 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH net-next v3 1/2] openvswitch: allow specifying ifindex of new interfaces
  2022-08-25  2:04 [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces Andrey Zhadchenko
@ 2022-08-25  2:04 ` Andrey Zhadchenko
  2022-08-25  2:04 ` [PATCH net-next v3 2/2] openvswitch: add OVS_DP_ATTR_PER_CPU_PIDS to get requests Andrey Zhadchenko
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Andrey Zhadchenko @ 2022-08-25  2:04 UTC (permalink / raw)
  To: netdev
  Cc: dev, pshelar, davem, edumazet, kuba, pabeni, ptikhomirov,
	alexander.mikhalitsyn, avagin, brauner, i.maximets, aconole

CRIU is preserving ifindexes of net devices after restoration. However,
current Open vSwitch API does not allow to target ifindex, so we cannot
correctly restore OVS configuration.

Add new OVS_DP_ATTR_IFINDEX for OVS_DP_CMD_NEW and use it as desired
ifindex.
Use OVS_VPORT_ATTR_IFINDEX during OVS_VPORT_CMD_NEW to specify new netdev
ifindex.

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
---

v3:
Added new OVS_DP_ATTR_IFINDEX instead of using ovs_header->dp_ifindex to
ensure we won't break existing software
Removed unnecessary OVS_VPORT_TYPE_INTERNAL check
Moved desired_ifindex in struct vport_parms for better padding.
Added some documentation

 include/uapi/linux/openvswitch.h     |  3 +++
 net/openvswitch/datapath.c           | 11 +++++++++--
 net/openvswitch/vport-internal_dev.c |  1 +
 net/openvswitch/vport.h              |  2 ++
 4 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index ce3e1738d427..94066f87e9ee 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -76,6 +76,8 @@ enum ovs_datapath_cmd {
  * datapath.  Always present in notifications.
  * @OVS_DP_ATTR_MEGAFLOW_STATS: Statistics about mega flow masks usage for the
  * datapath. Always present in notifications.
+ * @OVS_DP_ATTR_IFINDEX: Interface index for a new datapath netdev. Only
+ * valid for %OVS_DP_CMD_NEW requests.
  *
  * These attributes follow the &struct ovs_header within the Generic Netlink
  * payload for %OVS_DP_* commands.
@@ -92,6 +94,7 @@ enum ovs_datapath_attr {
 	OVS_DP_ATTR_PER_CPU_PIDS,   /* Netlink PIDS to receive upcalls in
 				     * per-cpu dispatch mode
 				     */
+	OVS_DP_ATTR_IFINDEX,
 	__OVS_DP_ATTR_MAX
 };
 
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 7e8a39a35627..2ca86b53c032 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1779,6 +1779,8 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	parms.dp = dp;
 	parms.port_no = OVSP_LOCAL;
 	parms.upcall_portids = a[OVS_DP_ATTR_UPCALL_PID];
+	parms.desired_ifindex = a[OVS_DP_ATTR_IFINDEX]
+		? nla_get_u32(a[OVS_DP_ATTR_IFINDEX]) : 0;
 
 	/* So far only local changes have been made, now need the lock. */
 	ovs_lock();
@@ -1996,6 +1998,7 @@ static const struct nla_policy datapath_policy[OVS_DP_ATTR_MAX + 1] = {
 	[OVS_DP_ATTR_USER_FEATURES] = { .type = NLA_U32 },
 	[OVS_DP_ATTR_MASKS_CACHE_SIZE] =  NLA_POLICY_RANGE(NLA_U32, 0,
 		PCPU_MIN_UNIT_SIZE / sizeof(struct mask_cache_entry)),
+	[OVS_DP_ATTR_IFINDEX] = {.type = NLA_U32 },
 };
 
 static const struct genl_small_ops dp_datapath_genl_ops[] = {
@@ -2199,7 +2202,10 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	if (!a[OVS_VPORT_ATTR_NAME] || !a[OVS_VPORT_ATTR_TYPE] ||
 	    !a[OVS_VPORT_ATTR_UPCALL_PID])
 		return -EINVAL;
-	if (a[OVS_VPORT_ATTR_IFINDEX])
+
+	parms.type = nla_get_u32(a[OVS_VPORT_ATTR_TYPE]);
+
+	if (a[OVS_VPORT_ATTR_IFINDEX] && parms.type != OVS_VPORT_TYPE_INTERNAL)
 		return -EOPNOTSUPP;
 
 	port_no = a[OVS_VPORT_ATTR_PORT_NO]
@@ -2236,11 +2242,12 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	}
 
 	parms.name = nla_data(a[OVS_VPORT_ATTR_NAME]);
-	parms.type = nla_get_u32(a[OVS_VPORT_ATTR_TYPE]);
 	parms.options = a[OVS_VPORT_ATTR_OPTIONS];
 	parms.dp = dp;
 	parms.port_no = port_no;
 	parms.upcall_portids = a[OVS_VPORT_ATTR_UPCALL_PID];
+	parms.desired_ifindex = a[OVS_VPORT_ATTR_IFINDEX]
+		? nla_get_u32(a[OVS_VPORT_ATTR_IFINDEX]) : 0;
 
 	vport = new_vport(&parms);
 	err = PTR_ERR(vport);
diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c
index 5b2ee9c1c00b..f0059b7243df 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -147,6 +147,7 @@ static struct vport *internal_dev_create(const struct vport_parms *parms)
 	}
 
 	dev_net_set(vport->dev, ovs_dp_get_net(vport->dp));
+	dev->ifindex = parms->desired_ifindex;
 	internal_dev = internal_dev_priv(vport->dev);
 	internal_dev->vport = vport;
 
diff --git a/net/openvswitch/vport.h b/net/openvswitch/vport.h
index 9de5030d9801..7d276f60c000 100644
--- a/net/openvswitch/vport.h
+++ b/net/openvswitch/vport.h
@@ -90,12 +90,14 @@ struct vport {
  * @type: New vport's type.
  * @options: %OVS_VPORT_ATTR_OPTIONS attribute from Netlink message, %NULL if
  * none was supplied.
+ * @desired_ifindex: New vport's ifindex.
  * @dp: New vport's datapath.
  * @port_no: New vport's port number.
  */
 struct vport_parms {
 	const char *name;
 	enum ovs_vport_type type;
+	int desired_ifindex;
 	struct nlattr *options;
 
 	/* For ovs_vport_alloc(). */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH net-next v3 2/2] openvswitch: add OVS_DP_ATTR_PER_CPU_PIDS to get requests
  2022-08-25  2:04 [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces Andrey Zhadchenko
  2022-08-25  2:04 ` [PATCH net-next v3 1/2] " Andrey Zhadchenko
@ 2022-08-25  2:04 ` Andrey Zhadchenko
  2022-08-26  9:11 ` [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces Christian Brauner
  2022-08-27  2:50 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Andrey Zhadchenko @ 2022-08-25  2:04 UTC (permalink / raw)
  To: netdev
  Cc: dev, pshelar, davem, edumazet, kuba, pabeni, ptikhomirov,
	alexander.mikhalitsyn, avagin, brauner, i.maximets, aconole

CRIU needs OVS_DP_ATTR_PER_CPU_PIDS to checkpoint/restore newest
openvswitch versions.
Add pids to generic datapath reply. Limit exported pids amount to
nr_cpu_ids.

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
---

v3:
Greatly reduce patch size by allocating message big enough to fit reply
in all cases. Now there is no need to move allocation under the lock.
Remove unnecessary 64bit aligning for pids nlattr.


 net/openvswitch/datapath.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 2ca86b53c032..7bcb36705518 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1515,6 +1515,7 @@ static size_t ovs_dp_cmd_msg_size(void)
 	msgsize += nla_total_size_64bit(sizeof(struct ovs_dp_megaflow_stats));
 	msgsize += nla_total_size(sizeof(u32)); /* OVS_DP_ATTR_USER_FEATURES */
 	msgsize += nla_total_size(sizeof(u32)); /* OVS_DP_ATTR_MASKS_CACHE_SIZE */
+	msgsize += nla_total_size(sizeof(u32) * nr_cpu_ids); /* OVS_DP_ATTR_PER_CPU_PIDS */
 
 	return msgsize;
 }
@@ -1526,7 +1527,8 @@ static int ovs_dp_cmd_fill_info(struct datapath *dp, struct sk_buff *skb,
 	struct ovs_header *ovs_header;
 	struct ovs_dp_stats dp_stats;
 	struct ovs_dp_megaflow_stats dp_megaflow_stats;
-	int err;
+	struct dp_nlsk_pids *pids = ovsl_dereference(dp->upcall_portids);
+	int err, pids_len;
 
 	ovs_header = genlmsg_put(skb, portid, seq, &dp_datapath_genl_family,
 				 flags, cmd);
@@ -1556,6 +1558,12 @@ static int ovs_dp_cmd_fill_info(struct datapath *dp, struct sk_buff *skb,
 			ovs_flow_tbl_masks_cache_size(&dp->table)))
 		goto nla_put_failure;
 
+	if (dp->user_features & OVS_DP_F_DISPATCH_UPCALL_PER_CPU && pids) {
+		pids_len = min(pids->n_pids, nr_cpu_ids) * sizeof(u32);
+		if (nla_put(skb, OVS_DP_ATTR_PER_CPU_PIDS, pids_len, &pids->pids))
+			goto nla_put_failure;
+	}
+
 	genlmsg_end(skb, ovs_header);
 	return 0;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces
  2022-08-25  2:04 [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces Andrey Zhadchenko
  2022-08-25  2:04 ` [PATCH net-next v3 1/2] " Andrey Zhadchenko
  2022-08-25  2:04 ` [PATCH net-next v3 2/2] openvswitch: add OVS_DP_ATTR_PER_CPU_PIDS to get requests Andrey Zhadchenko
@ 2022-08-26  9:11 ` Christian Brauner
  2022-08-27  2:50 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Christian Brauner @ 2022-08-26  9:11 UTC (permalink / raw)
  To: Andrey Zhadchenko
  Cc: netdev, dev, pshelar, davem, edumazet, kuba, pabeni, ptikhomirov,
	alexander.mikhalitsyn, avagin, i.maximets, aconole

On Thu, Aug 25, 2022 at 05:04:48AM +0300, Andrey Zhadchenko wrote:
> Hi!
> 
> CRIU currently do not support checkpoint/restore of OVS configurations, but
> there was several requests for it. For example,
> https://github.com/lxc/lxc/issues/2909
> 
> The main problem is ifindexes of newly created interfaces. We realy need to
> preserve them after restore. Current openvswitch API does not allow to
> specify ifindex. Most of the time we can just create an interface via
> generic netlink requests and plug it into ovs but datapaths (generally any
> OVS_VPORT_TYPE_INTERNAL) can only be created via openvswitch requests which
> do not support selecting ifindex.
> 
> This patch allows to do so.
> For new datapaths I decided to use dp_infindex in header as infindex
> because it control ifindex for other requests too.
> For internal vports I reused OVS_VPORT_ATTR_IFINDEX.
> 
> The only concern I have is that previously dp_ifindex was not used for
> OVS_DP_VMD_NEW requests and some software may not set it to zero. However
> we have been running this patch at Virtuozzo for 2 years and have not
> encountered this problem. Not sure if it is worth to add new
> ovs_datapath_attr instead.
> 
> v2:
> Added two more patches.
> 
> Add OVS_DP_ATTR_PER_CPU_PIDS to dumps as suggested by Ilya Maximets.
> Without it we won't be able to checkpoint/restore new openvswitch
> configurations which use OVS_DP_F_DISPATCH_UPCALL_PER_CPU flag.
> 
> Found and fixed memory leak on datapath creation error path.
> 
> v3:
> Sent memleak fix separately to net.
> Improved patches according to the reviews:
>  - Added new OVS_DP_ATTR_IFINDEX instead of using ovs_header->dp_ifindex
>  - Pre-allocated bigger reply message for upcall pids
>  - Some small fixes

Seems good,
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces
  2022-08-25  2:04 [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces Andrey Zhadchenko
                   ` (2 preceding siblings ...)
  2022-08-26  9:11 ` [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces Christian Brauner
@ 2022-08-27  2:50 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-08-27  2:50 UTC (permalink / raw)
  To: Andrey Zhadchenko
  Cc: netdev, dev, pshelar, davem, edumazet, kuba, pabeni, ptikhomirov,
	alexander.mikhalitsyn, avagin, brauner, i.maximets, aconole

Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 25 Aug 2022 05:04:48 +0300 you wrote:
> Hi!
> 
> CRIU currently do not support checkpoint/restore of OVS configurations, but
> there was several requests for it. For example,
> https://github.com/lxc/lxc/issues/2909
> 
> The main problem is ifindexes of newly created interfaces. We realy need to
> preserve them after restore. Current openvswitch API does not allow to
> specify ifindex. Most of the time we can just create an interface via
> generic netlink requests and plug it into ovs but datapaths (generally any
> OVS_VPORT_TYPE_INTERNAL) can only be created via openvswitch requests which
> do not support selecting ifindex.
> 
> [...]

Here is the summary with links:
  - [net-next,v3,1/2] openvswitch: allow specifying ifindex of new interfaces
    https://git.kernel.org/netdev/net-next/c/54c4ef34c4b6
  - [net-next,v3,2/2] openvswitch: add OVS_DP_ATTR_PER_CPU_PIDS to get requests
    https://git.kernel.org/netdev/net-next/c/347541e299d5

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-08-27  2:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-25  2:04 [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces Andrey Zhadchenko
2022-08-25  2:04 ` [PATCH net-next v3 1/2] " Andrey Zhadchenko
2022-08-25  2:04 ` [PATCH net-next v3 2/2] openvswitch: add OVS_DP_ATTR_PER_CPU_PIDS to get requests Andrey Zhadchenko
2022-08-26  9:11 ` [PATCH net-next v3 0/2] openvswitch: allow specifying ifindex of new interfaces Christian Brauner
2022-08-27  2:50 ` patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.