netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
@ 2019-08-20 10:52 Pablo Neira Ayuso
  2019-08-20 10:52 ` [PATCH net-next 2/2] netfilter: nft_payload: packet mangling offload support Pablo Neira Ayuso
  2019-08-20 14:15 ` [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action Edward Cree
  0 siblings, 2 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-08-20 10:52 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, jakub.kicinski, jiri, vladbu

The existing infrastructure needs the front-end to generate up to four
actions (one for each 32-bit word) to mangle an IPv6 address. This patch
allows you to mangle fields than are longer than 4-bytes with one single
action. Drivers have been adapted to this new representation following a
simple approach, that is, iterate over the array of words and configure
the hardware IR to make the packet mangling. FLOW_ACTION_MANGLE_MAX_WORDS
defines the maximum number of words from one given offset (currently 4
words).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 44 ++++++++++----
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    | 50 +++++++++++-----
 drivers/net/ethernet/netronome/nfp/flower/action.c | 69 ++++++++++++++--------
 include/net/flow_offload.h                         |  9 ++-
 net/sched/cls_api.c                                |  7 ++-
 5 files changed, 125 insertions(+), 54 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index e447976bdd3e..6a961e29a904 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -428,13 +428,17 @@ static void cxgb4_process_flow_actions(struct net_device *in,
 		case FLOW_ACTION_MANGLE: {
 			u32 mask, val, offset;
 			u8 htype;
+			int i;
 
 			htype = act->mangle.htype;
-			mask = act->mangle.mask;
-			val = act->mangle.val;
 			offset = act->mangle.offset;
 
-			process_pedit_field(fs, val, mask, offset, htype);
+			for (i = 0; i < act->mangle.words; i++) {
+				mask = act->mangle.data[i].mask;
+				val = act->mangle.data[i].val;
+				process_pedit_field(fs, val, mask, offset, htype);
+				offset += sizeof(u32);
+			}
 			}
 			break;
 		default:
@@ -456,16 +460,9 @@ static bool valid_l4_mask(u32 mask)
 	return hi && lo ? false : true;
 }
 
-static bool valid_pedit_action(struct net_device *dev,
-			       const struct flow_action_entry *act)
+static bool __valid_pedit_action(struct net_device *dev, u8 htype,
+				 __be32 mask, __be32 offset)
 {
-	u32 mask, offset;
-	u8 htype;
-
-	htype = act->mangle.htype;
-	mask = act->mangle.mask;
-	offset = act->mangle.offset;
-
 	switch (htype) {
 	case FLOW_ACT_MANGLE_HDR_TYPE_ETH:
 		switch (offset) {
@@ -541,6 +538,29 @@ static bool valid_pedit_action(struct net_device *dev,
 		netdev_err(dev, "%s: Unsupported pedit type\n", __func__);
 		return false;
 	}
+
+	return true;
+}
+
+static bool valid_pedit_action(struct net_device *dev,
+			       const struct flow_action_entry *act)
+{
+	u32 mask, offset;
+	u8 htype;
+	int i;
+
+	htype = act->mangle.htype;
+	offset = act->mangle.offset;
+
+	for (i = 0; i < act->mangle.words; i++) {
+		mask = act->mangle.data[i].mask;
+
+		if (!__valid_pedit_action(dev, htype, mask, offset))
+			return false;
+
+		offset += sizeof(u32);
+	}
+
 	return true;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index c57f7533a6d0..bb24616ee27f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -2411,6 +2411,7 @@ static int parse_tc_pedit_action(struct mlx5e_priv *priv,
 	int err = -EOPNOTSUPP;
 	u32 mask, val, offset;
 	u8 htype;
+	int i;
 
 	htype = act->mangle.htype;
 	err = -EOPNOTSUPP; /* can't be all optimistic */
@@ -2426,15 +2427,19 @@ static int parse_tc_pedit_action(struct mlx5e_priv *priv,
 		goto out_err;
 	}
 
-	mask = act->mangle.mask;
-	val = act->mangle.val;
 	offset = act->mangle.offset;
 
-	err = set_pedit_val(htype, ~mask, val, offset, &hdrs[cmd]);
-	if (err)
-		goto out_err;
+	for (i = 0; i < act->mangle.words; i++) {
+		val = act->mangle.data[i].val;
+		mask = act->mangle.data[i].mask;
 
-	hdrs[cmd].pedits++;
+		err = set_pedit_val(htype, ~mask, val, offset, &hdrs[cmd]);
+		if (err)
+			goto out_err;
+
+		offset += sizeof(u32);
+		hdrs[cmd].pedits++;
+	}
 
 	return 0;
 out_err:
@@ -2523,14 +2528,8 @@ struct ipv6_hoplimit_word {
 	__u8	hop_limit;
 };
 
-static bool is_action_keys_supported(const struct flow_action_entry *act)
+static bool __is_action_keys_supported(u8 htype, u32 offset, u32 mask)
 {
-	u32 mask, offset;
-	u8 htype;
-
-	htype = act->mangle.htype;
-	offset = act->mangle.offset;
-	mask = ~act->mangle.mask;
 	/* For IPv4 & IPv6 header check 4 byte word,
 	 * to determine that modified fields
 	 * are NOT ttl & hop_limit only.
@@ -2557,6 +2556,26 @@ static bool is_action_keys_supported(const struct flow_action_entry *act)
 	return false;
 }
 
+static bool is_action_keys_supported(const struct flow_action_entry *act)
+{
+	u32 mask, offset;
+	u8 htype;
+	int i;
+
+	htype = act->mangle.htype;
+	offset = act->mangle.offset;
+
+	for (i = 0; i < act->mangle.words; i++) {
+		mask = ~act->mangle.data[i].mask;
+		if (!__is_action_keys_supported(htype, offset, mask))
+			return false;
+
+		offset += sizeof(u32);
+	}
+
+	return true;
+}
+
 static bool modify_header_match_supported(struct mlx5_flow_spec *spec,
 					  struct flow_action *flow_action,
 					  u32 actions,
@@ -2654,8 +2673,9 @@ static int add_vlan_rewrite_action(struct mlx5e_priv *priv, int namespace,
 		.id = FLOW_ACTION_MANGLE,
 		.mangle.htype = FLOW_ACT_MANGLE_HDR_TYPE_ETH,
 		.mangle.offset = offsetof(struct vlan_ethhdr, h_vlan_TCI),
-		.mangle.mask = ~(u32)be16_to_cpu(*(__be16 *)&mask16),
-		.mangle.val = (u32)be16_to_cpu(*(__be16 *)&val16),
+		.mangle.data[0].mask = ~(u32)be16_to_cpu(*(__be16 *)&mask16),
+		.mangle.data[0].val = (u32)be16_to_cpu(*(__be16 *)&val16),
+		.mangle.words = 1,
 	};
 	u8 match_prio_mask, match_prio_val;
 	void *headers_c, *headers_v;
diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index 1b019fdfcd97..15bace2354dc 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -485,7 +485,7 @@ static void nfp_fl_set_helper32(u32 value, u32 mask, u8 *p_exact, u8 *p_mask)
 }
 
 static int
-nfp_fl_set_eth(const struct flow_action_entry *act, u32 off,
+nfp_fl_set_eth(const struct flow_action_entry *act, u32 idx, u32 off,
 	       struct nfp_fl_set_eth *set_eth, struct netlink_ext_ack *extack)
 {
 	u32 exact, mask;
@@ -495,8 +495,8 @@ nfp_fl_set_eth(const struct flow_action_entry *act, u32 off,
 		return -EOPNOTSUPP;
 	}
 
-	mask = ~act->mangle.mask;
-	exact = act->mangle.val;
+	mask = ~act->mangle.data[idx].mask;
+	exact = act->mangle.data[idx].val;
 
 	if (exact & ~mask) {
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: invalid pedit ethernet action");
@@ -520,7 +520,7 @@ struct ipv4_ttl_word {
 };
 
 static int
-nfp_fl_set_ip4(const struct flow_action_entry *act, u32 off,
+nfp_fl_set_ip4(const struct flow_action_entry *act, u32 idx, u32 off,
 	       struct nfp_fl_set_ip4_addrs *set_ip_addr,
 	       struct nfp_fl_set_ip4_ttl_tos *set_ip_ttl_tos,
 	       struct netlink_ext_ack *extack)
@@ -532,8 +532,8 @@ nfp_fl_set_ip4(const struct flow_action_entry *act, u32 off,
 	__be32 exact, mask;
 
 	/* We are expecting tcf_pedit to return a big endian value */
-	mask = (__force __be32)~act->mangle.mask;
-	exact = (__force __be32)act->mangle.val;
+	mask = (__force __be32)~act->mangle.data[idx].mask;
+	exact = (__force __be32)act->mangle.data[idx].val;
 
 	if (exact & ~mask) {
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: invalid pedit IPv4 action");
@@ -662,7 +662,7 @@ nfp_fl_set_ip6_hop_limit_flow_label(u32 off, __be32 exact, __be32 mask,
 }
 
 static int
-nfp_fl_set_ip6(const struct flow_action_entry *act, u32 off,
+nfp_fl_set_ip6(const struct flow_action_entry *act, u32 idx, u32 off,
 	       struct nfp_fl_set_ipv6_addr *ip_dst,
 	       struct nfp_fl_set_ipv6_addr *ip_src,
 	       struct nfp_fl_set_ipv6_tc_hl_fl *ip_hl_fl,
@@ -673,8 +673,8 @@ nfp_fl_set_ip6(const struct flow_action_entry *act, u32 off,
 	u8 word;
 
 	/* We are expecting tcf_pedit to return a big endian value */
-	mask = (__force __be32)~act->mangle.mask;
-	exact = (__force __be32)act->mangle.val;
+	mask = (__force __be32)~act->mangle.data[idx].mask;
+	exact = (__force __be32)act->mangle.data[idx].val;
 
 	if (exact & ~mask) {
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: invalid pedit IPv6 action");
@@ -702,7 +702,7 @@ nfp_fl_set_ip6(const struct flow_action_entry *act, u32 off,
 }
 
 static int
-nfp_fl_set_tport(const struct flow_action_entry *act, u32 off,
+nfp_fl_set_tport(const struct flow_action_entry *act, u32 idx, u32 off,
 		 struct nfp_fl_set_tport *set_tport, int opcode,
 		 struct netlink_ext_ack *extack)
 {
@@ -713,8 +713,8 @@ nfp_fl_set_tport(const struct flow_action_entry *act, u32 off,
 		return -EOPNOTSUPP;
 	}
 
-	mask = ~act->mangle.mask;
-	exact = act->mangle.val;
+	mask = ~act->mangle.data[idx].mask;
+	exact = act->mangle.data[idx].val;
 
 	if (exact & ~mask) {
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: invalid pedit L4 action");
@@ -860,32 +860,31 @@ nfp_fl_commit_mangle(struct flow_cls_offload *flow, char *nfp_action,
 }
 
 static int
-nfp_fl_pedit(const struct flow_action_entry *act,
-	     struct flow_cls_offload *flow, char *nfp_action, int *a_len,
-	     u32 *csum_updated, struct nfp_flower_pedit_acts *set_act,
-	     struct netlink_ext_ack *extack)
+__nfp_fl_pedit(const struct flow_action_entry *act, u32 idx, u32 offset,
+	       struct flow_cls_offload *flow, char *nfp_action, int *a_len,
+	       u32 *csum_updated, struct nfp_flower_pedit_acts *set_act,
+	       struct netlink_ext_ack *extack)
 {
 	enum flow_action_mangle_base htype;
-	u32 offset;
 
 	htype = act->mangle.htype;
-	offset = act->mangle.offset;
 
 	switch (htype) {
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_ETH:
-		return nfp_fl_set_eth(act, offset, &set_act->set_eth, extack);
+		return nfp_fl_set_eth(act, idx, offset, &set_act->set_eth,
+				      extack);
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_IP4:
-		return nfp_fl_set_ip4(act, offset, &set_act->set_ip_addr,
+		return nfp_fl_set_ip4(act, idx, offset, &set_act->set_ip_addr,
 				      &set_act->set_ip_ttl_tos, extack);
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_IP6:
-		return nfp_fl_set_ip6(act, offset, &set_act->set_ip6_dst,
+		return nfp_fl_set_ip6(act, idx, offset, &set_act->set_ip6_dst,
 				      &set_act->set_ip6_src,
 				      &set_act->set_ip6_tc_hl_fl, extack);
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_TCP:
-		return nfp_fl_set_tport(act, offset, &set_act->set_tport,
+		return nfp_fl_set_tport(act, idx, offset, &set_act->set_tport,
 					NFP_FL_ACTION_OPCODE_SET_TCP, extack);
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_UDP:
-		return nfp_fl_set_tport(act, offset, &set_act->set_tport,
+		return nfp_fl_set_tport(act, idx, offset, &set_act->set_tport,
 					NFP_FL_ACTION_OPCODE_SET_UDP, extack);
 	default:
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: pedit on unsupported header");
@@ -894,6 +893,30 @@ nfp_fl_pedit(const struct flow_action_entry *act,
 }
 
 static int
+nfp_fl_pedit(const struct flow_action_entry *act,
+	     struct flow_cls_offload *flow, char *nfp_action, int *a_len,
+	     u32 *csum_updated, struct nfp_flower_pedit_acts *set_act,
+	     struct netlink_ext_ack *extack)
+{
+	u32 offset, idx;
+	int err;
+
+	offset = act->mangle.offset;
+
+	for (idx = 0; idx < act->mangle.words; idx++) {
+		err = __nfp_fl_pedit(act, idx, offset, flow,
+				     nfp_action, a_len, csum_updated, set_act,
+				     extack);
+		if (err < 0)
+			return err;
+
+		offset += sizeof(u32);
+	}
+
+	return 0;
+}
+
+static int
 nfp_flower_output_action(struct nfp_app *app,
 			 const struct flow_action_entry *act,
 			 struct nfp_fl_payload *nfp_fl, int *a_len,
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index e8069b6c474c..d3fc6b7dcd6a 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -153,6 +153,8 @@ enum flow_action_mangle_base {
 	FLOW_ACT_MANGLE_HDR_TYPE_UDP,
 };
 
+#define FLOW_ACTION_MANGLE_MAX_WORDS	4
+
 struct flow_action_entry {
 	enum flow_action_id		id;
 	union {
@@ -166,8 +168,11 @@ struct flow_action_entry {
 		struct {				/* FLOW_ACTION_PACKET_EDIT */
 			enum flow_action_mangle_base htype;
 			u32		offset;
-			u32		mask;
-			u32		val;
+			struct {
+				u32	mask;
+				u32	val;
+			} data[FLOW_ACTION_MANGLE_MAX_WORDS];
+			u32		words;
 		} mangle;
 		const struct ip_tunnel_info *tunnel;	/* FLOW_ACTION_TUNNEL_ENCAP */
 		u32			csum_flags;	/* FLOW_ACTION_CSUM */
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index e0d8b456e9f5..041cd4000389 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -3077,9 +3077,12 @@ int tc_setup_flow_action(struct flow_action *flow_action,
 					goto err_out;
 				}
 				entry->mangle.htype = tcf_pedit_htype(act, k);
-				entry->mangle.mask = tcf_pedit_mask(act, k);
-				entry->mangle.val = tcf_pedit_val(act, k);
+				entry->mangle.data[0].mask =
+					tcf_pedit_mask(act, k);
+				entry->mangle.data[0].val =
+					tcf_pedit_val(act, k);
 				entry->mangle.offset = tcf_pedit_offset(act, k);
+				entry->mangle.words = 1;
 				entry = &flow_action->entries[++j];
 			}
 		} else if (is_tcf_csum(act)) {
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 2/2] netfilter: nft_payload: packet mangling offload support
  2019-08-20 10:52 [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action Pablo Neira Ayuso
@ 2019-08-20 10:52 ` Pablo Neira Ayuso
  2019-08-20 14:15 ` [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action Edward Cree
  1 sibling, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-08-20 10:52 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, jakub.kicinski, jiri, vladbu

This patch allows for mangling packet fields using hardware offload
infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_payload.c | 82 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 22a80eb60222..d758c8900835 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -562,12 +562,94 @@ static int nft_payload_set_dump(struct sk_buff *skb, const struct nft_expr *expr
 	return -1;
 }
 
+static int nft_payload_offload_set_nh(struct nft_offload_ctx *ctx,
+				      struct nft_flow_rule *flow,
+				      const struct nft_payload_set *priv)
+{
+	int type = FLOW_ACT_MANGLE_UNSPEC;
+
+	switch (ctx->dep.l3num) {
+	case htons(ETH_P_IP):
+		type = FLOW_ACT_MANGLE_HDR_TYPE_IP4;
+		break;
+	case htons(ETH_P_IPV6):
+		type = FLOW_ACT_MANGLE_HDR_TYPE_IP6;
+		break;
+	}
+
+	return type;
+}
+
+static int nft_payload_offload_set_th(struct nft_offload_ctx *ctx,
+				      struct nft_flow_rule *flow,
+				      const struct nft_payload_set *priv)
+{
+	int type = FLOW_ACT_MANGLE_UNSPEC;
+
+	switch (ctx->dep.protonum) {
+	case IPPROTO_TCP:
+		type = FLOW_ACT_MANGLE_HDR_TYPE_TCP;
+		break;
+	case IPPROTO_UDP:
+		type = FLOW_ACT_MANGLE_HDR_TYPE_UDP;
+		break;
+	}
+
+	return type;
+}
+
+static inline u32 nft_payload_mask(int len)
+{
+	return (1 << (len * BITS_PER_BYTE)) - 1;
+}
+
+static int nft_payload_set_offload(struct nft_offload_ctx *ctx,
+				   struct nft_flow_rule *flow,
+				   const struct nft_expr *expr)
+{
+	const struct nft_payload_set *priv = nft_expr_priv(expr);
+	struct nft_offload_reg *sreg = &ctx->regs[priv->sreg];
+	int type = FLOW_ACT_MANGLE_UNSPEC;
+	struct flow_action_entry *entry;
+	u32 words;
+	int i;
+
+	switch (priv->base) {
+	case NFT_PAYLOAD_LL_HEADER:
+		type = FLOW_ACT_MANGLE_HDR_TYPE_ETH;
+		break;
+	case NFT_PAYLOAD_NETWORK_HEADER:
+		type = nft_payload_offload_set_nh(ctx, flow, priv);
+		break;
+	case NFT_PAYLOAD_TRANSPORT_HEADER:
+		type = nft_payload_offload_set_th(ctx, flow, priv);
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		break;
+	}
+	words = round_up(priv->len, sizeof(u32)) / sizeof(u32);
+
+	entry = &flow->rule->action.entries[ctx->num_actions++];
+	entry->mangle.htype	= type;
+	entry->mangle.offset	= priv->offset;
+	for (i = 0; i < words; i++) {
+		entry->mangle.data[i].mask =
+			~htonl(nft_payload_mask(priv->len));
+		entry->mangle.data[i].val = sreg->data.data[i];
+	}
+	entry->mangle.words	= words;
+
+	return type != FLOW_ACT_MANGLE_UNSPEC ? 0 : -EOPNOTSUPP;
+}
+
 static const struct nft_expr_ops nft_payload_set_ops = {
 	.type		= &nft_payload_type,
 	.size		= NFT_EXPR_SIZE(sizeof(struct nft_payload_set)),
 	.eval		= nft_payload_set_eval,
 	.init		= nft_payload_set_init,
 	.dump		= nft_payload_set_dump,
+	.offload	= nft_payload_set_offload,
 };
 
 static const struct nft_expr_ops *
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
  2019-08-20 10:52 [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action Pablo Neira Ayuso
  2019-08-20 10:52 ` [PATCH net-next 2/2] netfilter: nft_payload: packet mangling offload support Pablo Neira Ayuso
@ 2019-08-20 14:15 ` Edward Cree
  2019-08-20 14:44   ` Pablo Neira Ayuso
  1 sibling, 1 reply; 9+ messages in thread
From: Edward Cree @ 2019-08-20 14:15 UTC (permalink / raw)
  To: Pablo Neira Ayuso, netfilter-devel
  Cc: davem, netdev, jakub.kicinski, jiri, vladbu

On 20/08/2019 11:52, Pablo Neira Ayuso wrote:
> The existing infrastructure needs the front-end to generate up to four
> actions (one for each 32-bit word) to mangle an IPv6 address. This patch
> allows you to mangle fields than are longer than 4-bytes with one single
> action. Drivers have been adapted to this new representation following a
> simple approach, that is, iterate over the array of words and configure
> the hardware IR to make the packet mangling. FLOW_ACTION_MANGLE_MAX_WORDS
> defines the maximum number of words from one given offset (currently 4
> words).
>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
What's the point of this?
Why do you need to be able to do this with a single action?  It doesn't
 look like this extra 70 lines of code is actually buying you anything,
 and it makes more work for any other drivers that want to implement the
 offload API.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
  2019-08-20 14:15 ` [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action Edward Cree
@ 2019-08-20 14:44   ` Pablo Neira Ayuso
  2019-08-20 16:00     ` Edward Cree
  0 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-08-20 14:44 UTC (permalink / raw)
  To: Edward Cree; +Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu

On Tue, Aug 20, 2019 at 03:15:16PM +0100, Edward Cree wrote:
> On 20/08/2019 11:52, Pablo Neira Ayuso wrote:
> > The existing infrastructure needs the front-end to generate up to four
> > actions (one for each 32-bit word) to mangle an IPv6 address. This patch
> > allows you to mangle fields than are longer than 4-bytes with one single
> > action. Drivers have been adapted to this new representation following a
> > simple approach, that is, iterate over the array of words and configure
> > the hardware IR to make the packet mangling. FLOW_ACTION_MANGLE_MAX_WORDS
> > defines the maximum number of words from one given offset (currently 4
> > words).
> >
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
>
> What's the point of this?
> Why do you need to be able to do this with a single action?  It doesn't
>  look like this extra 70 lines of code is actually buying you anything,
>  and it makes more work for any other drivers that want to implement the
>  offload API.

It looks to me this limitation is coming from tc pedit.

Four actions to mangle an IPv6 address consume more memory when making
the translation, and if you expect a lot of rules.

I think drivers can do more than one 32-bit word mangling in one go.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
  2019-08-20 14:44   ` Pablo Neira Ayuso
@ 2019-08-20 16:00     ` Edward Cree
  2019-08-20 17:33       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 9+ messages in thread
From: Edward Cree @ 2019-08-20 16:00 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu

On 20/08/2019 15:44, Pablo Neira Ayuso wrote:
> It looks to me this limitation is coming from tc pedit.
>
> Four actions to mangle an IPv6 address consume more memory when making
> the translation, and if you expect a lot of rules.
Your change means that now every pedit uses four hw entries, even if it
 was only meant to be a 32-bit mangle.  Host memory used to keep track of
 the pedit actions is cheap, hw entries in pedit tables are not.  Nor is
 driver implementation complexity.
NAK.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
  2019-08-20 16:00     ` Edward Cree
@ 2019-08-20 17:33       ` Pablo Neira Ayuso
  2019-08-20 18:15         ` Edward Cree
  0 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-08-20 17:33 UTC (permalink / raw)
  To: Edward Cree; +Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu

On Tue, Aug 20, 2019 at 05:00:26PM +0100, Edward Cree wrote:
> On 20/08/2019 15:44, Pablo Neira Ayuso wrote:
> > It looks to me this limitation is coming from tc pedit.
> >
> > Four actions to mangle an IPv6 address consume more memory when making
> > the translation, and if you expect a lot of rules.
>
> Your change means that now every pedit uses four hw entries, even if it
>  was only meant to be a 32-bit mangle.

It makes no sense to me that matching an IPv6 address takes _one_
action, while mangling an IPv6 address takes _four_ actions.

A consistent model for drivers is good to have.

I can update tc pedit to generate one single action for offset
consecutive packet editions, if that is the concern, I'll send a v2.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
  2019-08-20 17:33       ` Pablo Neira Ayuso
@ 2019-08-20 18:15         ` Edward Cree
  2019-08-20 18:35           ` Pablo Neira Ayuso
  0 siblings, 1 reply; 9+ messages in thread
From: Edward Cree @ 2019-08-20 18:15 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu

On 20/08/2019 18:33, Pablo Neira Ayuso wrote:
> I can update tc pedit to generate one single action for offset
> consecutive packet editions, if that is the concern, I'll send a v2.
IMHO the fix belongs in TC userland (i.e. iproute2), to turn a single action on the commandline for an ipv6 addr into four pedit actions before the kernel ever sees it.
Similarly if nftables wants to use this it should generate four separate pedit actions, probably in the kernel netfilter code as (I assume) your uAPI talks in terms of named fields rather than the u32ish offsets and masks of tc pedit.
The TC (well, flow_offload now I suppose) API should be kept narrow, not widened for things that can already be expressed adequately.  Your array of words inside a pedit action looks like a kind of loop unrolling but for data structures, which doesn't look sensible to me.

-Ed

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
  2019-08-20 18:15         ` Edward Cree
@ 2019-08-20 18:35           ` Pablo Neira Ayuso
  2019-08-21 15:05             ` Edward Cree
  0 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-08-20 18:35 UTC (permalink / raw)
  To: Edward Cree; +Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu

On Tue, Aug 20, 2019 at 07:15:10PM +0100, Edward Cree wrote:
> On 20/08/2019 18:33, Pablo Neira Ayuso wrote:
> > I can update tc pedit to generate one single action for offset
> > consecutive packet editions, if that is the concern, I'll send a v2.
> IMHO the fix belongs in TC userland (i.e. iproute2), to turn a
> single action on the commandline for an ipv6 addr into four pedit
> actions before the kernel ever sees it.
> Similarly if nftables wants to use this it should generate four
> separate pedit actions, probably in the kernel netfilter code as (I
> assume) your uAPI talks in terms of named fields rather than the
> u32ish offsets and masks of tc pedit.

The driver flow_offload API does not necessarily need to map 1:1 to
the netlink control plane / UAPI. The driver flow_offload API is
detached from UAPI and it is internal to drivers.

> The TC (well, flow_offload now I suppose) API should be kept narrow,
> not widened for things that can already be expressed adequately. 
> Your array of words inside a pedit action looks like a kind of loop
> unrolling but for data structures, which doesn't look sensible to
> me.

With one action that says "mangle an IPv6 at offset ip6 daddr field"
the driver has more global view on what is going on, rather than
having four actions to mangle four 32-bit words at some offset.

If this patch adds some loops here is because I did not want to make
too smart changes on the drivers.

The only reason I can find why mangling is restricted to 32-bits word
is tc pedit. The existing flow_offload API was modeled after tc
actions, which was exposing tc pedit implementation details to
hardware.

Please, allow for incremental updates on the flow_offload API to get
it better now. Later we'll have way more drivers it will become harder
to update this.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
  2019-08-20 18:35           ` Pablo Neira Ayuso
@ 2019-08-21 15:05             ` Edward Cree
  0 siblings, 0 replies; 9+ messages in thread
From: Edward Cree @ 2019-08-21 15:05 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu

On 20/08/2019 19:35, Pablo Neira Ayuso wrote:
> With one action that says "mangle an IPv6 at offset ip6 daddr field"
> the driver has more global view on what is going on, rather than
> having four actions to mangle four 32-bit words at some offset.
But the action doesn't say that, it still says "mangle four 32-bit
 words", it's just that they're now contiguous.  The driver doesn't
 know whether that's an IPv6 address or just a bunch of fields that
 happened to be next to one another.
(Besides, the driver can't rely on that 'global view', because if
 the actions did come from the TC uAPI, they're still going to be
 single u32 mangles.)

> If this patch adds some loops here is because I did not want to make
> too smart changes on the drivers.
The thing is, the drivers are already looping over TC actions, so they
 already naturally support multiple pedits.  You don't gain any
 expressiveness by combining them into batches of four, meanwhile you
 make the API less orthogonal and more laborious to implement.

> Please, allow for incremental updates on the flow_offload API to get
> it better now. Later we'll have way more drivers it will become harder
> to update this.
I'm not opposed to making the API better.  I just don't believe that
 this patch series achieves that.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-08-21 15:05 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-20 10:52 [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action Pablo Neira Ayuso
2019-08-20 10:52 ` [PATCH net-next 2/2] netfilter: nft_payload: packet mangling offload support Pablo Neira Ayuso
2019-08-20 14:15 ` [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action Edward Cree
2019-08-20 14:44   ` Pablo Neira Ayuso
2019-08-20 16:00     ` Edward Cree
2019-08-20 17:33       ` Pablo Neira Ayuso
2019-08-20 18:15         ` Edward Cree
2019-08-20 18:35           ` Pablo Neira Ayuso
2019-08-21 15:05             ` Edward Cree

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).