bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH ipsec-next,v2 0/3]  xfrm: support collect metadata mode for xfrm interfaces
@ 2022-08-25 13:46 Eyal Birger
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 1/3] net: allow storing xfrm interface metadata in metadata_dst Eyal Birger
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Eyal Birger @ 2022-08-25 13:46 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, steffen.klassert, herbert,
	dsahern, contact, pablo, nicolas.dichtel
  Cc: netdev, bpf, Eyal Birger

From: Eyal Birger <eyal@nsof.io>

This series adds support for "collect_md" mode in XFRM interfaces.

This feature is useful for maintaining a large number of IPsec connections
with the benefits of using a network interface while reducing the overhead
of maintaining a large number of devices.

Currently this is possible by having multiple connections share a common
interface by sharing the if_id identifier and using some other criteria
to distinguish between them - such as different subnets or skb marks.
This becomes complex in multi-tenant environments where subnets collide
and the mark space is used for other purposes.

Since the xfrm interface uses the if_id as the differentiator when
looking for policies, setting the if_id in the dst_metadata framework
allows using a single interface for different connections while having
the ability to selectively steer traffic to each one. In addition the
xfrm interface "link" property can also be specified to affect underlying
routing in the context of VRFs.

The series is composed of the following steps:

- Introduce a new METADATA_XFRM metadata type to be used for this purpose.
  Reuse of the existing "METADATA_IP_TUNNEL" type was rejected in [0] as
  XFRM does not necessarily represent an IP tunnel.

- Add support for collect metadata mode in xfrm interfaces

- Allow setting the XFRM metadata from the LWT infrastructure

Future additions could allow setting/getting the XFRM metadata from eBPF
programs, TC, OVS, NF, etc.

[0] https://patchwork.kernel.org/project/netdevbpf/patch/20201121142823.3629805-1-eyal.birger@gmail.com/#23824575

Eyal Birger (3):
  net: allow storing xfrm interface metadata in metadata_dst
  xfrm: interface: support collect metadata mode
  xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md
    mode

 include/net/dst_metadata.h    |  31 +++++
 include/net/xfrm.h            |  11 +-
 include/uapi/linux/if_link.h  |   1 +
 include/uapi/linux/lwtunnel.h |  10 ++
 net/core/lwtunnel.c           |   1 +
 net/xfrm/xfrm_input.c         |   7 +-
 net/xfrm/xfrm_interface.c     | 220 +++++++++++++++++++++++++++++++---
 net/xfrm/xfrm_policy.c        |  10 +-
 8 files changed, 263 insertions(+), 28 deletions(-)

----

v2: add "link" property as suggested by Nicolas Dichtel

-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH ipsec-next,v2 1/3] net: allow storing xfrm interface metadata in metadata_dst
  2022-08-25 13:46 [PATCH ipsec-next,v2 0/3] xfrm: support collect metadata mode for xfrm interfaces Eyal Birger
@ 2022-08-25 13:46 ` Eyal Birger
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode Eyal Birger
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 3/3] xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode Eyal Birger
  2 siblings, 0 replies; 11+ messages in thread
From: Eyal Birger @ 2022-08-25 13:46 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, steffen.klassert, herbert,
	dsahern, contact, pablo, nicolas.dichtel
  Cc: netdev, bpf, Eyal Birger

XFRM interfaces provide the association of various XFRM transformations
to a netdevice using an 'if_id' identifier common to both the XFRM data
structures (polcies, states) and the interface. The if_id is configured by
the controlling entity (usually the IKE daemon) and can be used by the
administrator to define logical relations between different connections.

For example, different connections can share the if_id identifier so
that they pass through the same interface, . However, currently it is
not possible for connections using a different if_id to use the same
interface while retaining the logical separation between them, without
using additional criteria such as skb marks or different traffic
selectors.

When having a large number of connections, it is useful to have a the
logical separation offered by the if_id identifier but use a single
network interface. Similar to the way collect_md mode is used in IP
tunnels.

This patch attempts to enable different configuration mechanisms - such
as ebpf programs, LWT encapsulations, and TC - to attach metadata
to skbs which would carry the if_id. This way a single xfrm interface in
collect_md mode can demux traffic based on this configuration on tx and
provide this metadata on rx.

The XFRM metadata is somewhat similar to ip tunnel metadata in that it
has an "id", and shares similar configuration entities (bpf, tc, ...),
however, it does not necessarily represent an IP tunnel or use other
ip tunnel information, and also has an optional "link" property which
can be used for affecting underlying routing decisions.

Additional xfrm related criteria may also be added in the future.

Therefore, a new metadata type is introduced, to be used in subsequent
patches in the xfrm interface and configuration entities.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>

----

v2: add "link" property as suggested by Nicolas Dichtel
---
 include/net/dst_metadata.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
index adab27ba1ecb..e4b059908cc7 100644
--- a/include/net/dst_metadata.h
+++ b/include/net/dst_metadata.h
@@ -9,6 +9,7 @@
 enum metadata_type {
 	METADATA_IP_TUNNEL,
 	METADATA_HW_PORT_MUX,
+	METADATA_XFRM,
 };
 
 struct hw_port_info {
@@ -16,12 +17,18 @@ struct hw_port_info {
 	u32 port_id;
 };
 
+struct xfrm_md_info {
+	u32 if_id;
+	int link;
+};
+
 struct metadata_dst {
 	struct dst_entry		dst;
 	enum metadata_type		type;
 	union {
 		struct ip_tunnel_info	tun_info;
 		struct hw_port_info	port_info;
+		struct xfrm_md_info	xfrm_info;
 	} u;
 };
 
@@ -53,6 +60,16 @@ skb_tunnel_info(const struct sk_buff *skb)
 	return NULL;
 }
 
+static inline struct xfrm_md_info *skb_xfrm_md_info(const struct sk_buff *skb)
+{
+	struct metadata_dst *md_dst = skb_metadata_dst(skb);
+
+	if (md_dst && md_dst->type == METADATA_XFRM)
+		return &md_dst->u.xfrm_info;
+
+	return NULL;
+}
+
 static inline bool skb_valid_dst(const struct sk_buff *skb)
 {
 	struct dst_entry *dst = skb_dst(skb);
@@ -82,6 +99,9 @@ static inline int skb_metadata_dst_cmp(const struct sk_buff *skb_a,
 		return memcmp(&a->u.tun_info, &b->u.tun_info,
 			      sizeof(a->u.tun_info) +
 					 a->u.tun_info.options_len);
+	case METADATA_XFRM:
+		return memcmp(&a->u.xfrm_info, &b->u.xfrm_info,
+			      sizeof(a->u.xfrm_info));
 	default:
 		return 1;
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode
  2022-08-25 13:46 [PATCH ipsec-next,v2 0/3] xfrm: support collect metadata mode for xfrm interfaces Eyal Birger
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 1/3] net: allow storing xfrm interface metadata in metadata_dst Eyal Birger
@ 2022-08-25 13:46 ` Eyal Birger
  2022-08-25 14:24   ` Nikolay Aleksandrov
  2022-08-25 14:28   ` Nicolas Dichtel
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 3/3] xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode Eyal Birger
  2 siblings, 2 replies; 11+ messages in thread
From: Eyal Birger @ 2022-08-25 13:46 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, steffen.klassert, herbert,
	dsahern, contact, pablo, nicolas.dichtel
  Cc: netdev, bpf, Eyal Birger

This commit adds support for 'collect_md' mode on xfrm interfaces.

Each net can have one collect_md device, created by providing the
IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be
altered and has no if_id or link device attributes.

On transmit to this device, the if_id is fetched from the attached dst
metadata on the skb. If exists, the link property is also fetched from
the metadata. The dst metadata type used is METADATA_XFRM which holds
these properties.

On the receive side, xfrmi_rcv_cb() populates a dst metadata for each
packet received and attaches it to the skb. The if_id used in this case is
fetched from the xfrm state, and the link is fetched from the incoming
device. This information can later be used by upper layers such as tc,
ebpf, and ip rules.

Because the skb is scrubed in xfrmi_rcv_cb(), the attachment of the dst
metadata is postponed until after scrubing. Similarly, xfrm_input() is
adapted to avoid dropping metadata dsts by only dropping 'valid'
(skb_valid_dst(skb) == true) dsts.

Policy matching on packets arriving from collect_md xfrmi devices is
done by using the xfrm state existing in the skb's sec_path.
The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session()
is changed to keep the details of the if_id extraction tucked away
in xfrm_interface.c.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>

----

v2:
  - add "link" property as suggested by Nicolas Dichtel
  - rename xfrm_if_decode_session_params to xfrm_if_decode_session_result
---
 include/net/xfrm.h           |  11 +++-
 include/uapi/linux/if_link.h |   1 +
 net/xfrm/xfrm_input.c        |   7 +-
 net/xfrm/xfrm_interface.c    | 120 +++++++++++++++++++++++++++++------
 net/xfrm/xfrm_policy.c       |  10 +--
 5 files changed, 121 insertions(+), 28 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 6e8fa98f786f..28b988577ed2 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -312,9 +312,15 @@ struct km_event {
 	struct net *net;
 };
 
+struct xfrm_if_decode_session_result {
+	struct net *net;
+	u32 if_id;
+};
+
 struct xfrm_if_cb {
-	struct xfrm_if	*(*decode_session)(struct sk_buff *skb,
-					   unsigned short family);
+	bool (*decode_session)(struct sk_buff *skb,
+			       unsigned short family,
+			       struct xfrm_if_decode_session_result *res);
 };
 
 void xfrm_if_register_cb(const struct xfrm_if_cb *ifcb);
@@ -985,6 +991,7 @@ void xfrm_dst_ifdown(struct dst_entry *dst, struct net_device *dev);
 struct xfrm_if_parms {
 	int link;		/* ifindex of underlying L2 interface */
 	u32 if_id;		/* interface identifyer */
+	bool collect_md;
 };
 
 struct xfrm_if {
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index e36d9d2c65a7..d96f13a42589 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -694,6 +694,7 @@ enum {
 	IFLA_XFRM_UNSPEC,
 	IFLA_XFRM_LINK,
 	IFLA_XFRM_IF_ID,
+	IFLA_XFRM_COLLECT_METADATA,
 	__IFLA_XFRM_MAX
 };
 
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 144238a50f3d..25e822fb5771 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -20,6 +20,7 @@
 #include <net/xfrm.h>
 #include <net/ip_tunnels.h>
 #include <net/ip6_tunnel.h>
+#include <net/dst_metadata.h>
 
 #include "xfrm_inout.h"
 
@@ -720,7 +721,8 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 		sp = skb_sec_path(skb);
 		if (sp)
 			sp->olen = 0;
-		skb_dst_drop(skb);
+		if (skb_valid_dst(skb))
+			skb_dst_drop(skb);
 		gro_cells_receive(&gro_cells, skb);
 		return 0;
 	} else {
@@ -738,7 +740,8 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 			sp = skb_sec_path(skb);
 			if (sp)
 				sp->olen = 0;
-			skb_dst_drop(skb);
+			if (skb_valid_dst(skb))
+				skb_dst_drop(skb);
 			gro_cells_receive(&gro_cells, skb);
 			return err;
 		}
diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 5113fa0fbcee..389d8be12801 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -41,6 +41,7 @@
 #include <net/addrconf.h>
 #include <net/xfrm.h>
 #include <net/net_namespace.h>
+#include <net/dst_metadata.h>
 #include <net/netns/generic.h>
 #include <linux/etherdevice.h>
 
@@ -56,6 +57,7 @@ static const struct net_device_ops xfrmi_netdev_ops;
 struct xfrmi_net {
 	/* lists for storing interfaces in use */
 	struct xfrm_if __rcu *xfrmi[XFRMI_HASH_SIZE];
+	struct xfrm_if __rcu *collect_md_xfrmi;
 };
 
 #define for_each_xfrmi_rcu(start, xi) \
@@ -77,17 +79,23 @@ static struct xfrm_if *xfrmi_lookup(struct net *net, struct xfrm_state *x)
 			return xi;
 	}
 
+	xi = rcu_dereference(xfrmn->collect_md_xfrmi);
+	if (xi && xi->dev->flags & IFF_UP)
+		return xi;
+
 	return NULL;
 }
 
-static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb,
-					    unsigned short family)
+static bool xfrmi_decode_session(struct sk_buff *skb,
+				 unsigned short family,
+				 struct xfrm_if_decode_session_result *res)
 {
 	struct net_device *dev;
+	struct xfrm_if *xi;
 	int ifindex = 0;
 
 	if (!secpath_exists(skb) || !skb->dev)
-		return NULL;
+		return false;
 
 	switch (family) {
 	case AF_INET6:
@@ -107,11 +115,18 @@ static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb,
 	}
 
 	if (!dev || !(dev->flags & IFF_UP))
-		return NULL;
+		return false;
 	if (dev->netdev_ops != &xfrmi_netdev_ops)
-		return NULL;
+		return false;
 
-	return netdev_priv(dev);
+	xi = netdev_priv(dev);
+	res->net = xi->net;
+
+	if (xi->p.collect_md)
+		res->if_id = xfrm_input_state(skb)->if_id;
+	else
+		res->if_id = xi->p.if_id;
+	return true;
 }
 
 static void xfrmi_link(struct xfrmi_net *xfrmn, struct xfrm_if *xi)
@@ -157,7 +172,10 @@ static int xfrmi_create(struct net_device *dev)
 	if (err < 0)
 		goto out;
 
-	xfrmi_link(xfrmn, xi);
+	if (xi->p.collect_md)
+		rcu_assign_pointer(xfrmn->collect_md_xfrmi, xi);
+	else
+		xfrmi_link(xfrmn, xi);
 
 	return 0;
 
@@ -185,7 +203,10 @@ static void xfrmi_dev_uninit(struct net_device *dev)
 	struct xfrm_if *xi = netdev_priv(dev);
 	struct xfrmi_net *xfrmn = net_generic(xi->net, xfrmi_net_id);
 
-	xfrmi_unlink(xfrmn, xi);
+	if (xi->p.collect_md)
+		rcu_assign_pointer(xfrmn->collect_md_xfrmi, NULL);
+	else
+		xfrmi_unlink(xfrmn, xi);
 }
 
 static void xfrmi_scrub_packet(struct sk_buff *skb, bool xnet)
@@ -214,6 +235,7 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err)
 	struct xfrm_state *x;
 	struct xfrm_if *xi;
 	bool xnet;
+	int link;
 
 	if (err && !secpath_exists(skb))
 		return 0;
@@ -224,6 +246,7 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err)
 	if (!xi)
 		return 1;
 
+	link = skb->dev->ifindex;
 	dev = xi->dev;
 	skb->dev = dev;
 
@@ -254,6 +277,17 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err)
 	}
 
 	xfrmi_scrub_packet(skb, xnet);
+	if (xi->p.collect_md) {
+		struct metadata_dst *md_dst;
+
+		md_dst = metadata_dst_alloc(0, METADATA_XFRM, GFP_ATOMIC);
+		if (!md_dst)
+			return -ENOMEM;
+
+		md_dst->u.xfrm_info.if_id = x->if_id;
+		md_dst->u.xfrm_info.link = link;
+		skb_dst_set(skb, (struct dst_entry *)md_dst);
+	}
 	dev_sw_netstats_rx_add(dev, skb->len);
 
 	return 0;
@@ -269,10 +303,24 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	struct net_device *tdev;
 	struct xfrm_state *x;
 	int err = -1;
+	u32 if_id;
 	int mtu;
 
+	if (xi->p.collect_md) {
+		struct xfrm_md_info *md_info = skb_xfrm_md_info(skb);
+
+		if (unlikely(!md_info))
+			return -EINVAL;
+
+		if_id = md_info->if_id;
+		if (md_info->link)
+			fl->flowi_oif = md_info->link;
+	} else {
+		if_id = xi->p.if_id;
+	}
+
 	dst_hold(dst);
-	dst = xfrm_lookup_with_ifid(xi->net, dst, fl, NULL, 0, xi->p.if_id);
+	dst = xfrm_lookup_with_ifid(xi->net, dst, fl, NULL, 0, if_id);
 	if (IS_ERR(dst)) {
 		err = PTR_ERR(dst);
 		dst = NULL;
@@ -283,7 +331,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	if (!x)
 		goto tx_err_link_failure;
 
-	if (x->if_id != xi->p.if_id)
+	if (x->if_id != if_id)
 		goto tx_err_link_failure;
 
 	tdev = dst->dev;
@@ -633,6 +681,9 @@ static void xfrmi_netlink_parms(struct nlattr *data[],
 
 	if (data[IFLA_XFRM_IF_ID])
 		parms->if_id = nla_get_u32(data[IFLA_XFRM_IF_ID]);
+
+	if (data[IFLA_XFRM_COLLECT_METADATA])
+		parms->collect_md = true;
 }
 
 static int xfrmi_newlink(struct net *src_net, struct net_device *dev,
@@ -645,14 +696,27 @@ static int xfrmi_newlink(struct net *src_net, struct net_device *dev,
 	int err;
 
 	xfrmi_netlink_parms(data, &p);
-	if (!p.if_id) {
-		NL_SET_ERR_MSG(extack, "if_id must be non zero");
-		return -EINVAL;
-	}
+	if (p.collect_md) {
+		struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
 
-	xi = xfrmi_locate(net, &p);
-	if (xi)
-		return -EEXIST;
+		if (p.link || p.if_id) {
+			NL_SET_ERR_MSG(extack, "link and if_id must be zero");
+			return -EINVAL;
+		}
+
+		if (rtnl_dereference(xfrmn->collect_md_xfrmi))
+			return -EEXIST;
+
+	} else {
+		if (!p.if_id) {
+			NL_SET_ERR_MSG(extack, "if_id must be non zero");
+			return -EINVAL;
+		}
+
+		xi = xfrmi_locate(net, &p);
+		if (xi)
+			return -EEXIST;
+	}
 
 	xi = netdev_priv(dev);
 	xi->p = p;
@@ -682,12 +746,19 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],
 		return -EINVAL;
 	}
 
+	if (p.collect_md) {
+		NL_SET_ERR_MSG(extack, "collect_md can't be changed");
+		return -EINVAL;
+	}
+
 	xi = xfrmi_locate(net, &p);
 	if (!xi) {
 		xi = netdev_priv(dev);
 	} else {
 		if (xi->dev != dev)
 			return -EEXIST;
+		if (xi->p.collect_md)
+			return -EINVAL;
 	}
 
 	return xfrmi_update(xi, &p);
@@ -700,6 +771,8 @@ static size_t xfrmi_get_size(const struct net_device *dev)
 		nla_total_size(4) +
 		/* IFLA_XFRM_IF_ID */
 		nla_total_size(4) +
+		/* IFLA_XFRM_COLLECT_METADATA */
+		nla_total_size(0) +
 		0;
 }
 
@@ -711,6 +784,10 @@ static int xfrmi_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	if (nla_put_u32(skb, IFLA_XFRM_LINK, parm->link) ||
 	    nla_put_u32(skb, IFLA_XFRM_IF_ID, parm->if_id))
 		goto nla_put_failure;
+	if (xi->p.collect_md) {
+		if (nla_put_flag(skb, IFLA_XFRM_COLLECT_METADATA))
+			goto nla_put_failure;
+	}
 	return 0;
 
 nla_put_failure:
@@ -725,8 +802,10 @@ static struct net *xfrmi_get_link_net(const struct net_device *dev)
 }
 
 static const struct nla_policy xfrmi_policy[IFLA_XFRM_MAX + 1] = {
-	[IFLA_XFRM_LINK]	= { .type = NLA_U32 },
-	[IFLA_XFRM_IF_ID]	= { .type = NLA_U32 },
+	[IFLA_XFRM_UNSPEC]		= { .strict_start_type = IFLA_XFRM_COLLECT_METADATA },
+	[IFLA_XFRM_LINK]		= { .type = NLA_U32 },
+	[IFLA_XFRM_IF_ID]		= { .type = NLA_U32 },
+	[IFLA_XFRM_COLLECT_METADATA]	= { .type = NLA_FLAG },
 };
 
 static struct rtnl_link_ops xfrmi_link_ops __read_mostly = {
@@ -762,6 +841,9 @@ static void __net_exit xfrmi_exit_batch_net(struct list_head *net_exit_list)
 			     xip = &xi->next)
 				unregister_netdevice_queue(xi->dev, &list);
 		}
+		xi = rcu_dereference(xfrmn->collect_md_xfrmi);
+		if (xi)
+			unregister_netdevice_queue(xi->dev, &list);
 	}
 	unregister_netdevice_many(&list);
 	rtnl_unlock();
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index f1a0bab920a5..70376f3fe84a 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -3516,17 +3516,17 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb,
 	int xerr_idx = -1;
 	const struct xfrm_if_cb *ifcb;
 	struct sec_path *sp;
-	struct xfrm_if *xi;
 	u32 if_id = 0;
 
 	rcu_read_lock();
 	ifcb = xfrm_if_get_cb();
 
 	if (ifcb) {
-		xi = ifcb->decode_session(skb, family);
-		if (xi) {
-			if_id = xi->p.if_id;
-			net = xi->net;
+		struct xfrm_if_decode_session_result r;
+
+		if (ifcb->decode_session(skb, family, &r)) {
+			if_id = r.if_id;
+			net = r.net;
 		}
 	}
 	rcu_read_unlock();
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH ipsec-next,v2 3/3] xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode
  2022-08-25 13:46 [PATCH ipsec-next,v2 0/3] xfrm: support collect metadata mode for xfrm interfaces Eyal Birger
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 1/3] net: allow storing xfrm interface metadata in metadata_dst Eyal Birger
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode Eyal Birger
@ 2022-08-25 13:46 ` Eyal Birger
  2022-08-25 14:30   ` Nikolay Aleksandrov
  2022-08-25 14:30   ` Nicolas Dichtel
  2 siblings, 2 replies; 11+ messages in thread
From: Eyal Birger @ 2022-08-25 13:46 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, steffen.klassert, herbert,
	dsahern, contact, pablo, nicolas.dichtel
  Cc: netdev, bpf, Eyal Birger

Allow specifying the xfrm interface if_id and link as part of a route
metadata using the lwtunnel infrastructure.

This allows for example using a single xfrm interface in collect_md
mode as the target of multiple routes each specifying a different if_id.

With the appropriate changes to iproute2, considering an xfrm device
ipsec1 in collect_md mode one can for example add a route specifying
an if_id like so:

ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1

In which case traffic routed to the device via this route would use
if_id in the xfrm interface policy lookup.

Or in the context of vrf, one can also specify the "link" property:

ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1 dev eth15

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>

----

v2:
 - move lwt_xfrm_info() helper to dst_metadata.h
 - add "link" property as suggested by Nicolas Dichtel
---
 include/net/dst_metadata.h    |  11 ++++
 include/uapi/linux/lwtunnel.h |  10 ++++
 net/core/lwtunnel.c           |   1 +
 net/xfrm/xfrm_interface.c     | 100 ++++++++++++++++++++++++++++++++++
 4 files changed, 122 insertions(+)

diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
index e4b059908cc7..57f75960fa28 100644
--- a/include/net/dst_metadata.h
+++ b/include/net/dst_metadata.h
@@ -60,13 +60,24 @@ skb_tunnel_info(const struct sk_buff *skb)
 	return NULL;
 }
 
+static inline struct xfrm_md_info *lwt_xfrm_info(struct lwtunnel_state *lwt)
+{
+	return (struct xfrm_md_info *)lwt->data;
+}
+
 static inline struct xfrm_md_info *skb_xfrm_md_info(const struct sk_buff *skb)
 {
 	struct metadata_dst *md_dst = skb_metadata_dst(skb);
+	struct dst_entry *dst;
 
 	if (md_dst && md_dst->type == METADATA_XFRM)
 		return &md_dst->u.xfrm_info;
 
+	dst = skb_dst(skb);
+	if (dst && dst->lwtstate &&
+	    dst->lwtstate->type == LWTUNNEL_ENCAP_XFRM)
+		return lwt_xfrm_info(dst->lwtstate);
+
 	return NULL;
 }
 
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 2e206919125c..229655ef792f 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -15,6 +15,7 @@ enum lwtunnel_encap_types {
 	LWTUNNEL_ENCAP_SEG6_LOCAL,
 	LWTUNNEL_ENCAP_RPL,
 	LWTUNNEL_ENCAP_IOAM6,
+	LWTUNNEL_ENCAP_XFRM,
 	__LWTUNNEL_ENCAP_MAX,
 };
 
@@ -111,4 +112,13 @@ enum {
 
 #define LWT_BPF_MAX_HEADROOM 256
 
+enum {
+	LWT_XFRM_UNSPEC,
+	LWT_XFRM_IF_ID,
+	LWT_XFRM_LINK,
+	__LWT_XFRM_MAX,
+};
+
+#define LWT_XFRM_MAX (__LWT_XFRM_MAX - 1)
+
 #endif /* _UAPI_LWTUNNEL_H_ */
diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
index 9ccd64e8a666..6fac2f0ef074 100644
--- a/net/core/lwtunnel.c
+++ b/net/core/lwtunnel.c
@@ -50,6 +50,7 @@ static const char *lwtunnel_encap_str(enum lwtunnel_encap_types encap_type)
 		return "IOAM6";
 	case LWTUNNEL_ENCAP_IP6:
 	case LWTUNNEL_ENCAP_IP:
+	case LWTUNNEL_ENCAP_XFRM:
 	case LWTUNNEL_ENCAP_NONE:
 	case __LWTUNNEL_ENCAP_MAX:
 		/* should not have got here */
diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 389d8be12801..604de1ee3772 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -60,6 +60,103 @@ struct xfrmi_net {
 	struct xfrm_if __rcu *collect_md_xfrmi;
 };
 
+static const struct nla_policy xfrm_lwt_policy[LWT_XFRM_MAX + 1] = {
+	[LWT_XFRM_UNSPEC]	= { .type = NLA_REJECT },
+	[LWT_XFRM_IF_ID]	= { .type = NLA_U32 },
+	[LWT_XFRM_LINK]		= { .type = NLA_U32 },
+};
+
+static void xfrmi_destroy_state(struct lwtunnel_state *lwt)
+{
+}
+
+static int xfrmi_build_state(struct net *net, struct nlattr *nla,
+			     unsigned int family, const void *cfg,
+			     struct lwtunnel_state **ts,
+			     struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[LWT_XFRM_MAX + 1];
+	struct lwtunnel_state *new_state;
+	struct xfrm_md_info *info;
+	int ret;
+
+	ret = nla_parse_nested(tb, LWT_XFRM_MAX, nla, xfrm_lwt_policy, extack);
+	if (ret < 0)
+		return ret;
+
+	if (!tb[LWT_XFRM_IF_ID])
+		return -EINVAL;
+
+	new_state = lwtunnel_state_alloc(sizeof(*info));
+	if (!new_state)
+		return -ENOMEM;
+
+	new_state->type = LWTUNNEL_ENCAP_XFRM;
+
+	info = lwt_xfrm_info(new_state);
+
+	info->if_id = nla_get_u32(tb[LWT_XFRM_IF_ID]);
+	if (!info->if_id) {
+		ret = -EINVAL;
+		goto errout;
+	}
+
+	if (tb[LWT_XFRM_LINK]) {
+		info->link = nla_get_u32(tb[LWT_XFRM_LINK]);
+		if (!info->link) {
+			ret = -EINVAL;
+			goto errout;
+		}
+	}
+
+	*ts = new_state;
+	return 0;
+
+errout:
+	xfrmi_destroy_state(new_state);
+	kfree(new_state);
+	return ret;
+}
+
+static int xfrmi_fill_encap_info(struct sk_buff *skb,
+				 struct lwtunnel_state *lwt)
+{
+	struct xfrm_md_info *info = lwt_xfrm_info(lwt);
+
+	if (nla_put_u32(skb, LWT_XFRM_IF_ID, info->if_id))
+		return -EMSGSIZE;
+
+	if (info->link) {
+		if (nla_put_u32(skb, LWT_XFRM_LINK, info->link))
+			return -EMSGSIZE;
+	}
+
+	return 0;
+}
+
+static int xfrmi_encap_nlsize(struct lwtunnel_state *lwtstate)
+{
+	return nla_total_size(4) + /* LWT_XFRM_IF_ID */
+		nla_total_size(4); /* LWT_XFRM_LINK */
+}
+
+static int xfrmi_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b)
+{
+	struct xfrm_md_info *a_info = lwt_xfrm_info(a);
+	struct xfrm_md_info *b_info = lwt_xfrm_info(b);
+
+	return memcmp(a_info, b_info, sizeof(*a_info));
+}
+
+static const struct lwtunnel_encap_ops xfrmi_encap_ops = {
+	.build_state	= xfrmi_build_state,
+	.destroy_state	= xfrmi_destroy_state,
+	.fill_encap	= xfrmi_fill_encap_info,
+	.get_encap_size = xfrmi_encap_nlsize,
+	.cmp_encap	= xfrmi_encap_cmp,
+	.owner		= THIS_MODULE,
+};
+
 #define for_each_xfrmi_rcu(start, xi) \
 	for (xi = rcu_dereference(start); xi; xi = rcu_dereference(xi->next))
 
@@ -1081,6 +1178,8 @@ static int __init xfrmi_init(void)
 	if (err < 0)
 		goto rtnl_link_failed;
 
+	lwtunnel_encap_add_ops(&xfrmi_encap_ops, LWTUNNEL_ENCAP_XFRM);
+
 	xfrm_if_register_cb(&xfrm_if_cb);
 
 	return err;
@@ -1099,6 +1198,7 @@ static int __init xfrmi_init(void)
 static void __exit xfrmi_fini(void)
 {
 	xfrm_if_unregister_cb();
+	lwtunnel_encap_del_ops(&xfrmi_encap_ops, LWTUNNEL_ENCAP_XFRM);
 	rtnl_link_unregister(&xfrmi_link_ops);
 	xfrmi4_fini();
 	xfrmi6_fini();
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode Eyal Birger
@ 2022-08-25 14:24   ` Nikolay Aleksandrov
  2022-08-25 15:14     ` Eyal Birger
  2022-08-25 14:28   ` Nicolas Dichtel
  1 sibling, 1 reply; 11+ messages in thread
From: Nikolay Aleksandrov @ 2022-08-25 14:24 UTC (permalink / raw)
  To: Eyal Birger, davem, edumazet, kuba, pabeni, steffen.klassert,
	herbert, dsahern, contact, pablo, nicolas.dichtel
  Cc: netdev, bpf, daniel

On 25/08/2022 16:46, Eyal Birger wrote:
> This commit adds support for 'collect_md' mode on xfrm interfaces.
> 
> Each net can have one collect_md device, created by providing the
> IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be
> altered and has no if_id or link device attributes.
> 
> On transmit to this device, the if_id is fetched from the attached dst
> metadata on the skb. If exists, the link property is also fetched from
> the metadata. The dst metadata type used is METADATA_XFRM which holds
> these properties.
> 
> On the receive side, xfrmi_rcv_cb() populates a dst metadata for each
> packet received and attaches it to the skb. The if_id used in this case is
> fetched from the xfrm state, and the link is fetched from the incoming
> device. This information can later be used by upper layers such as tc,
> ebpf, and ip rules.
> 
> Because the skb is scrubed in xfrmi_rcv_cb(), the attachment of the dst
> metadata is postponed until after scrubing. Similarly, xfrm_input() is
> adapted to avoid dropping metadata dsts by only dropping 'valid'
> (skb_valid_dst(skb) == true) dsts.
> 
> Policy matching on packets arriving from collect_md xfrmi devices is
> done by using the xfrm state existing in the skb's sec_path.
> The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session()
> is changed to keep the details of the if_id extraction tucked away
> in xfrm_interface.c.
> 
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> 
> ----
> 
> v2:
>   - add "link" property as suggested by Nicolas Dichtel
>   - rename xfrm_if_decode_session_params to xfrm_if_decode_session_result
> ---

(+CC Daniel)

Hi,
Generally I really like the idea, but I missed to comment the first round. :)
A few comments below..

>  include/net/xfrm.h           |  11 +++-
>  include/uapi/linux/if_link.h |   1 +
>  net/xfrm/xfrm_input.c        |   7 +-
>  net/xfrm/xfrm_interface.c    | 120 +++++++++++++++++++++++++++++------
>  net/xfrm/xfrm_policy.c       |  10 +--
>  5 files changed, 121 insertions(+), 28 deletions(-)
> 
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index 6e8fa98f786f..28b988577ed2 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -312,9 +312,15 @@ struct km_event {
>  	struct net *net;
>  };
>  
> +struct xfrm_if_decode_session_result {
> +	struct net *net;
> +	u32 if_id;
> +};
> +
>  struct xfrm_if_cb {
> -	struct xfrm_if	*(*decode_session)(struct sk_buff *skb,
> -					   unsigned short family);
> +	bool (*decode_session)(struct sk_buff *skb,
> +			       unsigned short family,
> +			       struct xfrm_if_decode_session_result *res);
>  };
>  
>  void xfrm_if_register_cb(const struct xfrm_if_cb *ifcb);
> @@ -985,6 +991,7 @@ void xfrm_dst_ifdown(struct dst_entry *dst, struct net_device *dev);
>  struct xfrm_if_parms {
>  	int link;		/* ifindex of underlying L2 interface */
>  	u32 if_id;		/* interface identifyer */
> +	bool collect_md;
>  };
>  
>  struct xfrm_if {
> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> index e36d9d2c65a7..d96f13a42589 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -694,6 +694,7 @@ enum {
>  	IFLA_XFRM_UNSPEC,
>  	IFLA_XFRM_LINK,
>  	IFLA_XFRM_IF_ID,
> +	IFLA_XFRM_COLLECT_METADATA,
>  	__IFLA_XFRM_MAX
>  };
>  
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index 144238a50f3d..25e822fb5771 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -20,6 +20,7 @@
>  #include <net/xfrm.h>
>  #include <net/ip_tunnels.h>
>  #include <net/ip6_tunnel.h>
> +#include <net/dst_metadata.h>
>  
>  #include "xfrm_inout.h"
>  
> @@ -720,7 +721,8 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
>  		sp = skb_sec_path(skb);
>  		if (sp)
>  			sp->olen = 0;
> -		skb_dst_drop(skb);
> +		if (skb_valid_dst(skb))
> +			skb_dst_drop(skb);
>  		gro_cells_receive(&gro_cells, skb);
>  		return 0;
>  	} else {
> @@ -738,7 +740,8 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
>  			sp = skb_sec_path(skb);
>  			if (sp)
>  				sp->olen = 0;
> -			skb_dst_drop(skb);
> +			if (skb_valid_dst(skb))
> +				skb_dst_drop(skb);
>  			gro_cells_receive(&gro_cells, skb);
>  			return err;
>  		}
> diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
> index 5113fa0fbcee..389d8be12801 100644
> --- a/net/xfrm/xfrm_interface.c
> +++ b/net/xfrm/xfrm_interface.c
> @@ -41,6 +41,7 @@
>  #include <net/addrconf.h>
>  #include <net/xfrm.h>
>  #include <net/net_namespace.h>
> +#include <net/dst_metadata.h>
>  #include <net/netns/generic.h>
>  #include <linux/etherdevice.h>
>  
> @@ -56,6 +57,7 @@ static const struct net_device_ops xfrmi_netdev_ops;
>  struct xfrmi_net {
>  	/* lists for storing interfaces in use */
>  	struct xfrm_if __rcu *xfrmi[XFRMI_HASH_SIZE];
> +	struct xfrm_if __rcu *collect_md_xfrmi;
>  };
>  
>  #define for_each_xfrmi_rcu(start, xi) \
> @@ -77,17 +79,23 @@ static struct xfrm_if *xfrmi_lookup(struct net *net, struct xfrm_state *x)
>  			return xi;
>  	}
>  
> +	xi = rcu_dereference(xfrmn->collect_md_xfrmi);
> +	if (xi && xi->dev->flags & IFF_UP)

minor nit: xi && (xi->dev->flags & IFF_UP)

> +		return xi;
> +
>  	return NULL;
>  }
>  
> -static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb,
> -					    unsigned short family)
> +static bool xfrmi_decode_session(struct sk_buff *skb,
> +				 unsigned short family,
> +				 struct xfrm_if_decode_session_result *res)
>  {
>  	struct net_device *dev;
> +	struct xfrm_if *xi;
>  	int ifindex = 0;
>  
>  	if (!secpath_exists(skb) || !skb->dev)
> -		return NULL;
> +		return false;
>  
>  	switch (family) {
>  	case AF_INET6:
> @@ -107,11 +115,18 @@ static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb,
>  	}
>  
>  	if (!dev || !(dev->flags & IFF_UP))
> -		return NULL;
> +		return false;
>  	if (dev->netdev_ops != &xfrmi_netdev_ops)
> -		return NULL;
> +		return false;
>  
> -	return netdev_priv(dev);
> +	xi = netdev_priv(dev);
> +	res->net = xi->net;
> +
> +	if (xi->p.collect_md)
> +		res->if_id = xfrm_input_state(skb)->if_id;
> +	else
> +		res->if_id = xi->p.if_id;
> +	return true;
>  }
>  
>  static void xfrmi_link(struct xfrmi_net *xfrmn, struct xfrm_if *xi)
> @@ -157,7 +172,10 @@ static int xfrmi_create(struct net_device *dev)
>  	if (err < 0)
>  		goto out;
>  
> -	xfrmi_link(xfrmn, xi);
> +	if (xi->p.collect_md)
> +		rcu_assign_pointer(xfrmn->collect_md_xfrmi, xi);
> +	else
> +		xfrmi_link(xfrmn, xi);
>  
>  	return 0;
>  
> @@ -185,7 +203,10 @@ static void xfrmi_dev_uninit(struct net_device *dev)
>  	struct xfrm_if *xi = netdev_priv(dev);
>  	struct xfrmi_net *xfrmn = net_generic(xi->net, xfrmi_net_id);
>  
> -	xfrmi_unlink(xfrmn, xi);
> +	if (xi->p.collect_md)
> +		rcu_assign_pointer(xfrmn->collect_md_xfrmi, NULL);

RCU_INIT_POINTER()

> +	else
> +		xfrmi_unlink(xfrmn, xi);
>  }
>  
>  static void xfrmi_scrub_packet(struct sk_buff *skb, bool xnet)
> @@ -214,6 +235,7 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err)
>  	struct xfrm_state *x;
>  	struct xfrm_if *xi;
>  	bool xnet;
> +	int link;
>  
>  	if (err && !secpath_exists(skb))
>  		return 0;
> @@ -224,6 +246,7 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err)
>  	if (!xi)
>  		return 1;
>  
> +	link = skb->dev->ifindex;
>  	dev = xi->dev;
>  	skb->dev = dev;
>  
> @@ -254,6 +277,17 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err)
>  	}
>  
>  	xfrmi_scrub_packet(skb, xnet);
> +	if (xi->p.collect_md) {
> +		struct metadata_dst *md_dst;
> +
> +		md_dst = metadata_dst_alloc(0, METADATA_XFRM, GFP_ATOMIC);
> +		if (!md_dst)
> +			return -ENOMEM;
> +
> +		md_dst->u.xfrm_info.if_id = x->if_id;
> +		md_dst->u.xfrm_info.link = link;
> +		skb_dst_set(skb, (struct dst_entry *)md_dst);
> +	}
>  	dev_sw_netstats_rx_add(dev, skb->len);
>  
>  	return 0;
> @@ -269,10 +303,24 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>  	struct net_device *tdev;
>  	struct xfrm_state *x;
>  	int err = -1;
> +	u32 if_id;
>  	int mtu;
>  
> +	if (xi->p.collect_md) {
> +		struct xfrm_md_info *md_info = skb_xfrm_md_info(skb);
> +
> +		if (unlikely(!md_info))
> +			return -EINVAL;
> +
> +		if_id = md_info->if_id;
> +		if (md_info->link)
> +			fl->flowi_oif = md_info->link;
> +	} else {
> +		if_id = xi->p.if_id;
> +	}
> +
>  	dst_hold(dst);
> -	dst = xfrm_lookup_with_ifid(xi->net, dst, fl, NULL, 0, xi->p.if_id);
> +	dst = xfrm_lookup_with_ifid(xi->net, dst, fl, NULL, 0, if_id);
>  	if (IS_ERR(dst)) {
>  		err = PTR_ERR(dst);
>  		dst = NULL;
> @@ -283,7 +331,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>  	if (!x)
>  		goto tx_err_link_failure;
>  
> -	if (x->if_id != xi->p.if_id)
> +	if (x->if_id != if_id)
>  		goto tx_err_link_failure;
>  
>  	tdev = dst->dev;
> @@ -633,6 +681,9 @@ static void xfrmi_netlink_parms(struct nlattr *data[],
>  
>  	if (data[IFLA_XFRM_IF_ID])
>  		parms->if_id = nla_get_u32(data[IFLA_XFRM_IF_ID]);
> +
> +	if (data[IFLA_XFRM_COLLECT_METADATA])
> +		parms->collect_md = true;
>  }
>  
>  static int xfrmi_newlink(struct net *src_net, struct net_device *dev,
> @@ -645,14 +696,27 @@ static int xfrmi_newlink(struct net *src_net, struct net_device *dev,
>  	int err;
>  
>  	xfrmi_netlink_parms(data, &p);
> -	if (!p.if_id) {
> -		NL_SET_ERR_MSG(extack, "if_id must be non zero");
> -		return -EINVAL;
> -	}
> +	if (p.collect_md) {
> +		struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
>  
> -	xi = xfrmi_locate(net, &p);
> -	if (xi)
> -		return -EEXIST;
> +		if (p.link || p.if_id) {
> +			NL_SET_ERR_MSG(extack, "link and if_id must be zero");
> +			return -EINVAL;
> +		}
> +
> +		if (rtnl_dereference(xfrmn->collect_md_xfrmi))
> +			return -EEXIST;
> +
> +	} else {
> +		if (!p.if_id) {
> +			NL_SET_ERR_MSG(extack, "if_id must be non zero");
> +			return -EINVAL;
> +		}
> +
> +		xi = xfrmi_locate(net, &p);
> +		if (xi)
> +			return -EEXIST;
> +	}
>  
>  	xi = netdev_priv(dev);
>  	xi->p = p;
> @@ -682,12 +746,19 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],
>  		return -EINVAL;
>  	}
>  
> +	if (p.collect_md) {
> +		NL_SET_ERR_MSG(extack, "collect_md can't be changed");
> +		return -EINVAL;
> +	}
> +
>  	xi = xfrmi_locate(net, &p);
>  	if (!xi) {
>  		xi = netdev_priv(dev);
>  	} else {
>  		if (xi->dev != dev)
>  			return -EEXIST;
> +		if (xi->p.collect_md)
> +			return -EINVAL;
>  	}
>  
>  	return xfrmi_update(xi, &p);
> @@ -700,6 +771,8 @@ static size_t xfrmi_get_size(const struct net_device *dev)
>  		nla_total_size(4) +
>  		/* IFLA_XFRM_IF_ID */
>  		nla_total_size(4) +
> +		/* IFLA_XFRM_COLLECT_METADATA */
> +		nla_total_size(0) +
>  		0;
>  }
>  
> @@ -711,6 +784,10 @@ static int xfrmi_fill_info(struct sk_buff *skb, const struct net_device *dev)
>  	if (nla_put_u32(skb, IFLA_XFRM_LINK, parm->link) ||
>  	    nla_put_u32(skb, IFLA_XFRM_IF_ID, parm->if_id))
>  		goto nla_put_failure;
> +	if (xi->p.collect_md) {
> +		if (nla_put_flag(skb, IFLA_XFRM_COLLECT_METADATA))

minor: these can be combined

> +			goto nla_put_failure;
> +	}
>  	return 0;
>  
>  nla_put_failure:
> @@ -725,8 +802,10 @@ static struct net *xfrmi_get_link_net(const struct net_device *dev)
>  }
>  
>  static const struct nla_policy xfrmi_policy[IFLA_XFRM_MAX + 1] = {
> -	[IFLA_XFRM_LINK]	= { .type = NLA_U32 },
> -	[IFLA_XFRM_IF_ID]	= { .type = NLA_U32 },
> +	[IFLA_XFRM_UNSPEC]		= { .strict_start_type = IFLA_XFRM_COLLECT_METADATA },
> +	[IFLA_XFRM_LINK]		= { .type = NLA_U32 },

link is signed, so s32

> +	[IFLA_XFRM_IF_ID]		= { .type = NLA_U32 },
> +	[IFLA_XFRM_COLLECT_METADATA]	= { .type = NLA_FLAG },
>  };
>  
>  static struct rtnl_link_ops xfrmi_link_ops __read_mostly = {
> @@ -762,6 +841,9 @@ static void __net_exit xfrmi_exit_batch_net(struct list_head *net_exit_list)
>  			     xip = &xi->next)
>  				unregister_netdevice_queue(xi->dev, &list);
>  		}
> +		xi = rcu_dereference(xfrmn->collect_md_xfrmi);

rtnl_dereference()

> +		if (xi)
> +			unregister_netdevice_queue(xi->dev, &list);
>  	}
>  	unregister_netdevice_many(&list);
>  	rtnl_unlock();
> diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
> index f1a0bab920a5..70376f3fe84a 100644
> --- a/net/xfrm/xfrm_policy.c
> +++ b/net/xfrm/xfrm_policy.c
> @@ -3516,17 +3516,17 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb,
>  	int xerr_idx = -1;
>  	const struct xfrm_if_cb *ifcb;
>  	struct sec_path *sp;
> -	struct xfrm_if *xi;
>  	u32 if_id = 0;
>  
>  	rcu_read_lock();
>  	ifcb = xfrm_if_get_cb();
>  
>  	if (ifcb) {
> -		xi = ifcb->decode_session(skb, family);
> -		if (xi) {
> -			if_id = xi->p.if_id;
> -			net = xi->net;
> +		struct xfrm_if_decode_session_result r;
> +
> +		if (ifcb->decode_session(skb, family, &r)) {
> +			if_id = r.if_id;
> +			net = r.net;
>  		}
>  	}
>  	rcu_read_unlock();


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode Eyal Birger
  2022-08-25 14:24   ` Nikolay Aleksandrov
@ 2022-08-25 14:28   ` Nicolas Dichtel
  1 sibling, 0 replies; 11+ messages in thread
From: Nicolas Dichtel @ 2022-08-25 14:28 UTC (permalink / raw)
  To: Eyal Birger, davem, edumazet, kuba, pabeni, steffen.klassert,
	herbert, dsahern, contact, pablo
  Cc: netdev, bpf


Le 25/08/2022 à 15:46, Eyal Birger a écrit :
> This commit adds support for 'collect_md' mode on xfrm interfaces.
> 
> Each net can have one collect_md device, created by providing the
> IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be
> altered and has no if_id or link device attributes.
> 
> On transmit to this device, the if_id is fetched from the attached dst
> metadata on the skb. If exists, the link property is also fetched from
> the metadata. The dst metadata type used is METADATA_XFRM which holds
> these properties.
> 
> On the receive side, xfrmi_rcv_cb() populates a dst metadata for each
> packet received and attaches it to the skb. The if_id used in this case is
> fetched from the xfrm state, and the link is fetched from the incoming
> device. This information can later be used by upper layers such as tc,
> ebpf, and ip rules.
> 
> Because the skb is scrubed in xfrmi_rcv_cb(), the attachment of the dst
> metadata is postponed until after scrubing. Similarly, xfrm_input() is
> adapted to avoid dropping metadata dsts by only dropping 'valid'
> (skb_valid_dst(skb) == true) dsts.
> 
> Policy matching on packets arriving from collect_md xfrmi devices is
> done by using the xfrm state existing in the skb's sec_path.
> The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session()
> is changed to keep the details of the if_id extraction tucked away
> in xfrm_interface.c.
> 
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> 
> ----
> 
> v2:
>   - add "link" property as suggested by Nicolas Dichtel
Thanks.

[snip]

> @@ -269,10 +303,24 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>  	struct net_device *tdev;
>  	struct xfrm_state *x;
>  	int err = -1;
> +	u32 if_id;
>  	int mtu;
>  
> +	if (xi->p.collect_md) {
> +		struct xfrm_md_info *md_info = skb_xfrm_md_info(skb);
> +
> +		if (unlikely(!md_info))
> +			return -EINVAL;
> +
> +		if_id = md_info->if_id;
> +		if (md_info->link)
> +			fl->flowi_oif = md_info->link;
nit: flowi_oif is 0 in case of collect_md, thus the if can be omitted.

[snip]
> @@ -682,12 +746,19 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],

>  		return -EINVAL;
>  	}
>  
> +	if (p.collect_md) {
> +		NL_SET_ERR_MSG(extack, "collect_md can't be changed");
> +		return -EINVAL;
> +	}
> +
>  	xi = xfrmi_locate(net, &p);
>  	if (!xi) {
>  		xi = netdev_priv(dev);
>  	} else {
>  		if (xi->dev != dev)
>  			return -EEXIST;
> +		if (xi->p.collect_md)
> +			return -EINVAL;

It could be useful to add an extack error msg in this case also.

Regards,
Nicolas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH ipsec-next,v2 3/3] xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 3/3] xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode Eyal Birger
@ 2022-08-25 14:30   ` Nikolay Aleksandrov
  2022-08-25 14:30   ` Nicolas Dichtel
  1 sibling, 0 replies; 11+ messages in thread
From: Nikolay Aleksandrov @ 2022-08-25 14:30 UTC (permalink / raw)
  To: Eyal Birger, davem, edumazet, kuba, pabeni, steffen.klassert,
	herbert, dsahern, contact, pablo, nicolas.dichtel
  Cc: netdev, bpf

On 25/08/2022 16:46, Eyal Birger wrote:
> Allow specifying the xfrm interface if_id and link as part of a route
> metadata using the lwtunnel infrastructure.
> 
> This allows for example using a single xfrm interface in collect_md
> mode as the target of multiple routes each specifying a different if_id.
> 
> With the appropriate changes to iproute2, considering an xfrm device
> ipsec1 in collect_md mode one can for example add a route specifying
> an if_id like so:
> 
> ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1
> 
> In which case traffic routed to the device via this route would use
> if_id in the xfrm interface policy lookup.
> 
> Or in the context of vrf, one can also specify the "link" property:
> 
> ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1 dev eth15
> 
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> 
> ----
> 
> v2:
>  - move lwt_xfrm_info() helper to dst_metadata.h
>  - add "link" property as suggested by Nicolas Dichtel
> ---
>  include/net/dst_metadata.h    |  11 ++++
>  include/uapi/linux/lwtunnel.h |  10 ++++
>  net/core/lwtunnel.c           |   1 +
>  net/xfrm/xfrm_interface.c     | 100 ++++++++++++++++++++++++++++++++++
>  4 files changed, 122 insertions(+)
> 
> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> index e4b059908cc7..57f75960fa28 100644
> --- a/include/net/dst_metadata.h
> +++ b/include/net/dst_metadata.h
> @@ -60,13 +60,24 @@ skb_tunnel_info(const struct sk_buff *skb)
>  	return NULL;
>  }
>  
> +static inline struct xfrm_md_info *lwt_xfrm_info(struct lwtunnel_state *lwt)
> +{
> +	return (struct xfrm_md_info *)lwt->data;
> +}
> +
>  static inline struct xfrm_md_info *skb_xfrm_md_info(const struct sk_buff *skb)
>  {
>  	struct metadata_dst *md_dst = skb_metadata_dst(skb);
> +	struct dst_entry *dst;
>  
>  	if (md_dst && md_dst->type == METADATA_XFRM)
>  		return &md_dst->u.xfrm_info;
>  
> +	dst = skb_dst(skb);
> +	if (dst && dst->lwtstate &&
> +	    dst->lwtstate->type == LWTUNNEL_ENCAP_XFRM)
> +		return lwt_xfrm_info(dst->lwtstate);
> +
>  	return NULL;
>  }
>  
> diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
> index 2e206919125c..229655ef792f 100644
> --- a/include/uapi/linux/lwtunnel.h
> +++ b/include/uapi/linux/lwtunnel.h
> @@ -15,6 +15,7 @@ enum lwtunnel_encap_types {
>  	LWTUNNEL_ENCAP_SEG6_LOCAL,
>  	LWTUNNEL_ENCAP_RPL,
>  	LWTUNNEL_ENCAP_IOAM6,
> +	LWTUNNEL_ENCAP_XFRM,
>  	__LWTUNNEL_ENCAP_MAX,
>  };
>  
> @@ -111,4 +112,13 @@ enum {
>  
>  #define LWT_BPF_MAX_HEADROOM 256
>  
> +enum {
> +	LWT_XFRM_UNSPEC,
> +	LWT_XFRM_IF_ID,
> +	LWT_XFRM_LINK,
> +	__LWT_XFRM_MAX,
> +};
> +
> +#define LWT_XFRM_MAX (__LWT_XFRM_MAX - 1)
> +
>  #endif /* _UAPI_LWTUNNEL_H_ */
> diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
> index 9ccd64e8a666..6fac2f0ef074 100644
> --- a/net/core/lwtunnel.c
> +++ b/net/core/lwtunnel.c
> @@ -50,6 +50,7 @@ static const char *lwtunnel_encap_str(enum lwtunnel_encap_types encap_type)
>  		return "IOAM6";
>  	case LWTUNNEL_ENCAP_IP6:
>  	case LWTUNNEL_ENCAP_IP:
> +	case LWTUNNEL_ENCAP_XFRM:
>  	case LWTUNNEL_ENCAP_NONE:
>  	case __LWTUNNEL_ENCAP_MAX:
>  		/* should not have got here */
> diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
> index 389d8be12801..604de1ee3772 100644
> --- a/net/xfrm/xfrm_interface.c
> +++ b/net/xfrm/xfrm_interface.c
> @@ -60,6 +60,103 @@ struct xfrmi_net {
>  	struct xfrm_if __rcu *collect_md_xfrmi;
>  };
>  
> +static const struct nla_policy xfrm_lwt_policy[LWT_XFRM_MAX + 1] = {
> +	[LWT_XFRM_UNSPEC]	= { .type = NLA_REJECT },

IIRC this is automatically rejected (NL_VALIDATE_STRICT used by nla_parse_nested())

> +	[LWT_XFRM_IF_ID]	= { .type = NLA_U32 },

I think you can use NLA_POLICY_MIN() and simplify xfrmi_build_state() below

> +	[LWT_XFRM_LINK]		= { .type = NLA_U32 },

link is an int, so s32 and you can add validation via NLA_POLICY_MIN() so you
can remove the check for !info->link below

> +};
> +
> +static void xfrmi_destroy_state(struct lwtunnel_state *lwt)
> +{
> +}
> +
> +static int xfrmi_build_state(struct net *net, struct nlattr *nla,
> +			     unsigned int family, const void *cfg,
> +			     struct lwtunnel_state **ts,
> +			     struct netlink_ext_ack *extack)
> +{
> +	struct nlattr *tb[LWT_XFRM_MAX + 1];
> +	struct lwtunnel_state *new_state;
> +	struct xfrm_md_info *info;
> +	int ret;
> +
> +	ret = nla_parse_nested(tb, LWT_XFRM_MAX, nla, xfrm_lwt_policy, extack);
> +	if (ret < 0)
> +		return ret;
> +
> +	if (!tb[LWT_XFRM_IF_ID])
> +		return -EINVAL;
> +
> +	new_state = lwtunnel_state_alloc(sizeof(*info));
> +	if (!new_state)
> +		return -ENOMEM;
> +
> +	new_state->type = LWTUNNEL_ENCAP_XFRM;
> +
> +	info = lwt_xfrm_info(new_state);
> +
> +	info->if_id = nla_get_u32(tb[LWT_XFRM_IF_ID]);
> +	if (!info->if_id) {
> +		ret = -EINVAL;
> +		goto errout;
> +	}
> +
> +	if (tb[LWT_XFRM_LINK]) {
> +		info->link = nla_get_u32(tb[LWT_XFRM_LINK]);

same, s32

> +		if (!info->link) {
> +			ret = -EINVAL;
> +			goto errout;
> +		}
> +	}
> +
> +	*ts = new_state;
> +	return 0;
> +
> +errout:
> +	xfrmi_destroy_state(new_state);
> +	kfree(new_state);
> +	return ret;
> +}
> +
> +static int xfrmi_fill_encap_info(struct sk_buff *skb,
> +				 struct lwtunnel_state *lwt)
> +{
> +	struct xfrm_md_info *info = lwt_xfrm_info(lwt);
> +
> +	if (nla_put_u32(skb, LWT_XFRM_IF_ID, info->if_id))
> +		return -EMSGSIZE;
> +
> +	if (info->link) {
> +		if (nla_put_u32(skb, LWT_XFRM_LINK, info->link))

same and also minor nit: these can be combined

> +			return -EMSGSIZE;
> +	}
> +
> +	return 0;
> +}
> +
> +static int xfrmi_encap_nlsize(struct lwtunnel_state *lwtstate)
> +{
> +	return nla_total_size(4) + /* LWT_XFRM_IF_ID */
> +		nla_total_size(4); /* LWT_XFRM_LINK */

nit: nla_total_size(sizeof(u32))

> +}
> +
> +static int xfrmi_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b)
> +{
> +	struct xfrm_md_info *a_info = lwt_xfrm_info(a);
> +	struct xfrm_md_info *b_info = lwt_xfrm_info(b);
> +
> +	return memcmp(a_info, b_info, sizeof(*a_info));
> +}
> +
> +static const struct lwtunnel_encap_ops xfrmi_encap_ops = {
> +	.build_state	= xfrmi_build_state,
> +	.destroy_state	= xfrmi_destroy_state,
> +	.fill_encap	= xfrmi_fill_encap_info,
> +	.get_encap_size = xfrmi_encap_nlsize,
> +	.cmp_encap	= xfrmi_encap_cmp,
> +	.owner		= THIS_MODULE,
> +};
> +
>  #define for_each_xfrmi_rcu(start, xi) \
>  	for (xi = rcu_dereference(start); xi; xi = rcu_dereference(xi->next))
>  
> @@ -1081,6 +1178,8 @@ static int __init xfrmi_init(void)
>  	if (err < 0)
>  		goto rtnl_link_failed;
>  
> +	lwtunnel_encap_add_ops(&xfrmi_encap_ops, LWTUNNEL_ENCAP_XFRM);
> +
>  	xfrm_if_register_cb(&xfrm_if_cb);
>  
>  	return err;
> @@ -1099,6 +1198,7 @@ static int __init xfrmi_init(void)
>  static void __exit xfrmi_fini(void)
>  {
>  	xfrm_if_unregister_cb();
> +	lwtunnel_encap_del_ops(&xfrmi_encap_ops, LWTUNNEL_ENCAP_XFRM);
>  	rtnl_link_unregister(&xfrmi_link_ops);
>  	xfrmi4_fini();
>  	xfrmi6_fini();


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH ipsec-next,v2 3/3] xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode
  2022-08-25 13:46 ` [PATCH ipsec-next,v2 3/3] xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode Eyal Birger
  2022-08-25 14:30   ` Nikolay Aleksandrov
@ 2022-08-25 14:30   ` Nicolas Dichtel
  1 sibling, 0 replies; 11+ messages in thread
From: Nicolas Dichtel @ 2022-08-25 14:30 UTC (permalink / raw)
  To: Eyal Birger, davem, edumazet, kuba, pabeni, steffen.klassert,
	herbert, dsahern, contact, pablo
  Cc: netdev, bpf


Le 25/08/2022 à 15:46, Eyal Birger a écrit :
> Allow specifying the xfrm interface if_id and link as part of a route
> metadata using the lwtunnel infrastructure.
> 
> This allows for example using a single xfrm interface in collect_md
> mode as the target of multiple routes each specifying a different if_id.
> 
> With the appropriate changes to iproute2, considering an xfrm device
> ipsec1 in collect_md mode one can for example add a route specifying
> an if_id like so:
> 
> ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1
> 
> In which case traffic routed to the device via this route would use
> if_id in the xfrm interface policy lookup.
> 
> Or in the context of vrf, one can also specify the "link" property:
> 
> ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1 dev eth15
> 
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>

[snip]

> +static int xfrmi_build_state(struct net *net, struct nlattr *nla,
> +			     unsigned int family, const void *cfg,
> +			     struct lwtunnel_state **ts,
> +			     struct netlink_ext_ack *extack)
> +{
> +	struct nlattr *tb[LWT_XFRM_MAX + 1];
> +	struct lwtunnel_state *new_state;
> +	struct xfrm_md_info *info;
> +	int ret;
> +
> +	ret = nla_parse_nested(tb, LWT_XFRM_MAX, nla, xfrm_lwt_policy, extack);
> +	if (ret < 0)
> +		return ret;
> +
> +	if (!tb[LWT_XFRM_IF_ID])
> +		return -EINVAL;
It would be nice  to add extack error message for all error cases, particularly
for EINVAL ;-)

> +
> +	new_state = lwtunnel_state_alloc(sizeof(*info));
> +	if (!new_state)
> +		return -ENOMEM;
> +
> +	new_state->type = LWTUNNEL_ENCAP_XFRM;
> +
> +	info = lwt_xfrm_info(new_state);
> +
> +	info->if_id = nla_get_u32(tb[LWT_XFRM_IF_ID]);
> +	if (!info->if_id) {
> +		ret = -EINVAL;
> +		goto errout;
> +	}
> +
> +	if (tb[LWT_XFRM_LINK]) {
> +		info->link = nla_get_u32(tb[LWT_XFRM_LINK]);
> +		if (!info->link) {
> +			ret = -EINVAL;
> +			goto errout;
> +		}
> +	}
> +
> +	*ts = new_state;
> +	return 0;
> +
> +errout:
> +	xfrmi_destroy_state(new_state);
> +	kfree(new_state);
> +	return ret;
> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode
  2022-08-25 14:24   ` Nikolay Aleksandrov
@ 2022-08-25 15:14     ` Eyal Birger
  2022-08-26  7:53       ` Nicolas Dichtel
  0 siblings, 1 reply; 11+ messages in thread
From: Eyal Birger @ 2022-08-25 15:14 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: davem, edumazet, kuba, pabeni, steffen.klassert, herbert,
	dsahern, contact, pablo, nicolas.dichtel, netdev, bpf, daniel

On Thu, Aug 25, 2022 at 5:24 PM Nikolay Aleksandrov <razor@blackwall.org> wrote:
>
> On 25/08/2022 16:46, Eyal Birger wrote:
> > This commit adds support for 'collect_md' mode on xfrm interfaces.
> >
> > Each net can have one collect_md device, created by providing the
> > IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be
> > altered and has no if_id or link device attributes.
> >
> > On transmit to this device, the if_id is fetched from the attached dst
> > metadata on the skb. If exists, the link property is also fetched from
> > the metadata. The dst metadata type used is METADATA_XFRM which holds
> > these properties.
> >
> > On the receive side, xfrmi_rcv_cb() populates a dst metadata for each
> > packet received and attaches it to the skb. The if_id used in this case is
> > fetched from the xfrm state, and the link is fetched from the incoming
> > device. This information can later be used by upper layers such as tc,
> > ebpf, and ip rules.
> >
> > Because the skb is scrubed in xfrmi_rcv_cb(), the attachment of the dst
> > metadata is postponed until after scrubing. Similarly, xfrm_input() is
> > adapted to avoid dropping metadata dsts by only dropping 'valid'
> > (skb_valid_dst(skb) == true) dsts.
> >
> > Policy matching on packets arriving from collect_md xfrmi devices is
> > done by using the xfrm state existing in the skb's sec_path.
> > The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session()
> > is changed to keep the details of the if_id extraction tucked away
> > in xfrm_interface.c.
> >
> > Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> >
> > ----
> >
> > v2:
> >   - add "link" property as suggested by Nicolas Dichtel
> >   - rename xfrm_if_decode_session_params to xfrm_if_decode_session_result
> > ---
>
> (+CC Daniel)
>
> Hi,
> Generally I really like the idea, but I missed to comment the first round. :)
> A few comments below..
>

Thanks for the review!

> >  include/net/xfrm.h           |  11 +++-
<...snip...>
> >
> >  static const struct nla_policy xfrmi_policy[IFLA_XFRM_MAX + 1] = {
> > -     [IFLA_XFRM_LINK]        = { .type = NLA_U32 },
> > -     [IFLA_XFRM_IF_ID]       = { .type = NLA_U32 },
> > +     [IFLA_XFRM_UNSPEC]              = { .strict_start_type = IFLA_XFRM_COLLECT_METADATA },
> > +     [IFLA_XFRM_LINK]                = { .type = NLA_U32 },
>
> link is signed, so s32

Ack on all comments except this one - I'm a little hesitant to change
this one as the change would be unrelated to this series.

Thanks!
Eyal.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode
  2022-08-25 15:14     ` Eyal Birger
@ 2022-08-26  7:53       ` Nicolas Dichtel
  2022-08-27 12:36         ` Nikolay Aleksandrov
  0 siblings, 1 reply; 11+ messages in thread
From: Nicolas Dichtel @ 2022-08-26  7:53 UTC (permalink / raw)
  To: Eyal Birger, Nikolay Aleksandrov
  Cc: davem, edumazet, kuba, pabeni, steffen.klassert, herbert,
	dsahern, contact, pablo, netdev, bpf, daniel


Le 25/08/2022 à 17:14, Eyal Birger a écrit :
> On Thu, Aug 25, 2022 at 5:24 PM Nikolay Aleksandrov <razor@blackwall.org> wrote:
>>
>> On 25/08/2022 16:46, Eyal Birger wrote:
>>> This commit adds support for 'collect_md' mode on xfrm interfaces.
>>>
>>> Each net can have one collect_md device, created by providing the
>>> IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be
>>> altered and has no if_id or link device attributes.
>>>
>>> On transmit to this device, the if_id is fetched from the attached dst
>>> metadata on the skb. If exists, the link property is also fetched from
>>> the metadata. The dst metadata type used is METADATA_XFRM which holds
>>> these properties.
>>>
>>> On the receive side, xfrmi_rcv_cb() populates a dst metadata for each
>>> packet received and attaches it to the skb. The if_id used in this case is
>>> fetched from the xfrm state, and the link is fetched from the incoming
>>> device. This information can later be used by upper layers such as tc,
>>> ebpf, and ip rules.
>>>
>>> Because the skb is scrubed in xfrmi_rcv_cb(), the attachment of the dst
>>> metadata is postponed until after scrubing. Similarly, xfrm_input() is
>>> adapted to avoid dropping metadata dsts by only dropping 'valid'
>>> (skb_valid_dst(skb) == true) dsts.
>>>
>>> Policy matching on packets arriving from collect_md xfrmi devices is
>>> done by using the xfrm state existing in the skb's sec_path.
>>> The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session()
>>> is changed to keep the details of the if_id extraction tucked away
>>> in xfrm_interface.c.
>>>
>>> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
>>>
>>> ----
>>>
>>> v2:
>>>   - add "link" property as suggested by Nicolas Dichtel
>>>   - rename xfrm_if_decode_session_params to xfrm_if_decode_session_result
>>> ---
>>
>> (+CC Daniel)
>>
>> Hi,
>> Generally I really like the idea, but I missed to comment the first round. :)
>> A few comments below..
>>
> 
> Thanks for the review!
> 
>>>  include/net/xfrm.h           |  11 +++-
> <...snip...>
>>>
>>>  static const struct nla_policy xfrmi_policy[IFLA_XFRM_MAX + 1] = {
>>> -     [IFLA_XFRM_LINK]        = { .type = NLA_U32 },
>>> -     [IFLA_XFRM_IF_ID]       = { .type = NLA_U32 },
>>> +     [IFLA_XFRM_UNSPEC]              = { .strict_start_type = IFLA_XFRM_COLLECT_METADATA },
>>> +     [IFLA_XFRM_LINK]                = { .type = NLA_U32 },
>>
>> link is signed, so s32
> 
> Ack on all comments except this one - I'm a little hesitant to change
> this one as the change would be unrelated to this series.
I agree, it's unrelated to this series.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode
  2022-08-26  7:53       ` Nicolas Dichtel
@ 2022-08-27 12:36         ` Nikolay Aleksandrov
  0 siblings, 0 replies; 11+ messages in thread
From: Nikolay Aleksandrov @ 2022-08-27 12:36 UTC (permalink / raw)
  To: nicolas.dichtel, Eyal Birger
  Cc: davem, edumazet, kuba, pabeni, steffen.klassert, herbert,
	dsahern, contact, pablo, netdev, bpf, daniel

On 26/08/2022 10:53, Nicolas Dichtel wrote:
> 
> Le 25/08/2022 à 17:14, Eyal Birger a écrit :
>> On Thu, Aug 25, 2022 at 5:24 PM Nikolay Aleksandrov <razor@blackwall.org> wrote:
>>>
>>> On 25/08/2022 16:46, Eyal Birger wrote:
>>>> This commit adds support for 'collect_md' mode on xfrm interfaces.
>>>>
>>>> Each net can have one collect_md device, created by providing the
>>>> IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be
>>>> altered and has no if_id or link device attributes.
>>>>
>>>> On transmit to this device, the if_id is fetched from the attached dst
>>>> metadata on the skb. If exists, the link property is also fetched from
>>>> the metadata. The dst metadata type used is METADATA_XFRM which holds
>>>> these properties.
>>>>
>>>> On the receive side, xfrmi_rcv_cb() populates a dst metadata for each
>>>> packet received and attaches it to the skb. The if_id used in this case is
>>>> fetched from the xfrm state, and the link is fetched from the incoming
>>>> device. This information can later be used by upper layers such as tc,
>>>> ebpf, and ip rules.
>>>>
>>>> Because the skb is scrubed in xfrmi_rcv_cb(), the attachment of the dst
>>>> metadata is postponed until after scrubing. Similarly, xfrm_input() is
>>>> adapted to avoid dropping metadata dsts by only dropping 'valid'
>>>> (skb_valid_dst(skb) == true) dsts.
>>>>
>>>> Policy matching on packets arriving from collect_md xfrmi devices is
>>>> done by using the xfrm state existing in the skb's sec_path.
>>>> The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session()
>>>> is changed to keep the details of the if_id extraction tucked away
>>>> in xfrm_interface.c.
>>>>
>>>> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
>>>>
>>>> ----
>>>>
>>>> v2:
>>>>   - add "link" property as suggested by Nicolas Dichtel
>>>>   - rename xfrm_if_decode_session_params to xfrm_if_decode_session_result
>>>> ---
>>>
>>> (+CC Daniel)
>>>
>>> Hi,
>>> Generally I really like the idea, but I missed to comment the first round. :)
>>> A few comments below..
>>>
>>
>> Thanks for the review!
>>
>>>>  include/net/xfrm.h           |  11 +++-
>> <...snip...>
>>>>
>>>>  static const struct nla_policy xfrmi_policy[IFLA_XFRM_MAX + 1] = {
>>>> -     [IFLA_XFRM_LINK]        = { .type = NLA_U32 },
>>>> -     [IFLA_XFRM_IF_ID]       = { .type = NLA_U32 },
>>>> +     [IFLA_XFRM_UNSPEC]              = { .strict_start_type = IFLA_XFRM_COLLECT_METADATA },
>>>> +     [IFLA_XFRM_LINK]                = { .type = NLA_U32 },
>>>
>>> link is signed, so s32
>>
>> Ack on all comments except this one - I'm a little hesitant to change
>> this one as the change would be unrelated to this series.
> I agree, it's unrelated to this series.

Ohh right, my bad. I somehow confused this for new code. :)

Cheers,
 Nik


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-08-27 12:37 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-25 13:46 [PATCH ipsec-next,v2 0/3] xfrm: support collect metadata mode for xfrm interfaces Eyal Birger
2022-08-25 13:46 ` [PATCH ipsec-next,v2 1/3] net: allow storing xfrm interface metadata in metadata_dst Eyal Birger
2022-08-25 13:46 ` [PATCH ipsec-next,v2 2/3] xfrm: interface: support collect metadata mode Eyal Birger
2022-08-25 14:24   ` Nikolay Aleksandrov
2022-08-25 15:14     ` Eyal Birger
2022-08-26  7:53       ` Nicolas Dichtel
2022-08-27 12:36         ` Nikolay Aleksandrov
2022-08-25 14:28   ` Nicolas Dichtel
2022-08-25 13:46 ` [PATCH ipsec-next,v2 3/3] xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode Eyal Birger
2022-08-25 14:30   ` Nikolay Aleksandrov
2022-08-25 14:30   ` Nicolas Dichtel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).