All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH iptables 0/7] support for dynamic register allocation
@ 2022-04-24 21:56 Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 1/7] nft-shared: update context register for bitwise expression Pablo Neira Ayuso
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-24 21:56 UTC (permalink / raw)
  To: netfilter-devel

Hi,

This patchset is composed of:

- Fix for bitwise expression to not assume NFT_REG_1 as destination
  register.

- Add native meta mark match support.

- Dynamic register allocation to leverage infrastructure available
  since Linux kernel 5.18-rc. See patch 7/7 for details.

Pablo Neira Ayuso (7):
  nft-shared: update context register for bitwise expression
  nft: pass struct nft_xt_ctx to parse_meta()
  nft: native mark matching support
  nft: pass handle to helper functions to build netlink payload
  nft: prepare for dynamic register allocation
  nft: split gen_payload() to allocate register and initialize expression
  nft: support for dynamic register allocation

 iptables/Makefile.am                          |   2 +-
 iptables/nft-arp.c                            |  42 ++--
 iptables/nft-bridge.c                         |  42 ++--
 iptables/nft-ipv4.c                           |  22 +-
 iptables/nft-ipv6.c                           |  12 +-
 iptables/nft-regs.c                           | 191 ++++++++++++++++++
 iptables/nft-regs.h                           |   9 +
 iptables/nft-shared.c                         | 162 ++++++++++-----
 iptables/nft-shared.h                         |  32 +--
 iptables/nft.c                                | 125 ++++++++----
 iptables/nft.h                                |  25 +++
 .../nft-only/0009-needless-bitwise_0          | 180 ++++++++---------
 12 files changed, 599 insertions(+), 245 deletions(-)
 create mode 100644 iptables/nft-regs.c
 create mode 100644 iptables/nft-regs.h

-- 
2.30.2


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH iptables 1/7] nft-shared: update context register for bitwise expression
  2022-04-24 21:56 [PATCH iptables 0/7] support for dynamic register allocation Pablo Neira Ayuso
@ 2022-04-24 21:56 ` Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 2/7] nft: pass struct nft_xt_ctx to parse_meta() Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-24 21:56 UTC (permalink / raw)
  To: netfilter-devel

Update the destination register, otherwise nft_parse_cmp() gives up on
interpreting the cmp expression when bitwise sreg != dreg.

Fixes: 2c4a34c30cb4 ("iptables-compat: fix address prefix")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft-shared.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/iptables/nft-shared.c b/iptables/nft-shared.c
index c57218542c96..b3993211c79d 100644
--- a/iptables/nft-shared.c
+++ b/iptables/nft-shared.c
@@ -460,6 +460,8 @@ static void nft_parse_bitwise(struct nft_xt_ctx *ctx, struct nftnl_expr *e)
 	if (ctx->reg && reg != ctx->reg)
 		return;
 
+	reg = nftnl_expr_get_u32(e, NFTNL_EXPR_BITWISE_DREG);
+	ctx->reg = reg;
 	data = nftnl_expr_get(e, NFTNL_EXPR_BITWISE_XOR, &len);
 	memcpy(ctx->bitwise.xor, data, len);
 	data = nftnl_expr_get(e, NFTNL_EXPR_BITWISE_MASK, &len);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH iptables 2/7] nft: pass struct nft_xt_ctx to parse_meta()
  2022-04-24 21:56 [PATCH iptables 0/7] support for dynamic register allocation Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 1/7] nft-shared: update context register for bitwise expression Pablo Neira Ayuso
@ 2022-04-24 21:56 ` Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 3/7] nft: native mark matching support Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-24 21:56 UTC (permalink / raw)
  To: netfilter-devel

In preparation for native mark match support.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft-arp.c    | 2 +-
 iptables/nft-bridge.c | 2 +-
 iptables/nft-ipv4.c   | 2 +-
 iptables/nft-ipv6.c   | 2 +-
 iptables/nft-shared.c | 6 +++---
 iptables/nft-shared.h | 6 +++---
 6 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/iptables/nft-arp.c b/iptables/nft-arp.c
index 028b06a608e4..89e6413441e2 100644
--- a/iptables/nft-arp.c
+++ b/iptables/nft-arp.c
@@ -174,7 +174,7 @@ static void nft_arp_parse_meta(struct nft_xt_ctx *ctx, struct nftnl_expr *e,
 	struct arpt_entry *fw = &cs->arp;
 	uint8_t flags = 0;
 
-	parse_meta(e, ctx->meta.key, fw->arp.iniface, fw->arp.iniface_mask,
+	parse_meta(ctx, e, ctx->meta.key, fw->arp.iniface, fw->arp.iniface_mask,
 		   fw->arp.outiface, fw->arp.outiface_mask,
 		   &flags);
 
diff --git a/iptables/nft-bridge.c b/iptables/nft-bridge.c
index d4b66a25c740..097ef6e16827 100644
--- a/iptables/nft-bridge.c
+++ b/iptables/nft-bridge.c
@@ -171,7 +171,7 @@ static void nft_bridge_parse_meta(struct nft_xt_ctx *ctx,
 	uint8_t invflags = 0;
 	char iifname[IFNAMSIZ] = {}, oifname[IFNAMSIZ] = {};
 
-	parse_meta(e, ctx->meta.key, iifname, NULL, oifname, NULL, &invflags);
+	parse_meta(ctx, e, ctx->meta.key, iifname, NULL, oifname, NULL, &invflags);
 
 	switch (ctx->meta.key) {
 	case NFT_META_BRI_IIFNAME:
diff --git a/iptables/nft-ipv4.c b/iptables/nft-ipv4.c
index af3d0c98b798..cf03edfae9ac 100644
--- a/iptables/nft-ipv4.c
+++ b/iptables/nft-ipv4.c
@@ -144,7 +144,7 @@ static void nft_ipv4_parse_meta(struct nft_xt_ctx *ctx, struct nftnl_expr *e,
 		break;
 	}
 
-	parse_meta(e, ctx->meta.key, cs->fw.ip.iniface, cs->fw.ip.iniface_mask,
+	parse_meta(ctx, e, ctx->meta.key, cs->fw.ip.iniface, cs->fw.ip.iniface_mask,
 		   cs->fw.ip.outiface, cs->fw.ip.outiface_mask,
 		   &cs->fw.ip.invflags);
 }
diff --git a/iptables/nft-ipv6.c b/iptables/nft-ipv6.c
index 892a48541593..5b767a4059e6 100644
--- a/iptables/nft-ipv6.c
+++ b/iptables/nft-ipv6.c
@@ -117,7 +117,7 @@ static void nft_ipv6_parse_meta(struct nft_xt_ctx *ctx, struct nftnl_expr *e,
 		break;
 	}
 
-	parse_meta(e, ctx->meta.key, cs->fw6.ipv6.iniface,
+	parse_meta(ctx, e, ctx->meta.key, cs->fw6.ipv6.iniface,
 		   cs->fw6.ipv6.iniface_mask, cs->fw6.ipv6.outiface,
 		   cs->fw6.ipv6.outiface_mask, &cs->fw6.ipv6.invflags);
 }
diff --git a/iptables/nft-shared.c b/iptables/nft-shared.c
index b3993211c79d..5b13b29c9844 100644
--- a/iptables/nft-shared.c
+++ b/iptables/nft-shared.c
@@ -261,9 +261,9 @@ static void parse_ifname(const char *name, unsigned int len, char *dst, unsigned
 		memset(mask, 0xff, len - 2);
 }
 
-int parse_meta(struct nftnl_expr *e, uint8_t key, char *iniface,
-		unsigned char *iniface_mask, char *outiface,
-		unsigned char *outiface_mask, uint8_t *invflags)
+int parse_meta(struct nft_xt_ctx *ctx, struct nftnl_expr *e, uint8_t key,
+	       char *iniface, unsigned char *iniface_mask,
+	       char *outiface, unsigned char *outiface_mask, uint8_t *invflags)
 {
 	uint32_t value;
 	const void *ifname;
diff --git a/iptables/nft-shared.h b/iptables/nft-shared.h
index 7b337943836a..092958cd67fa 100644
--- a/iptables/nft-shared.h
+++ b/iptables/nft-shared.h
@@ -156,9 +156,9 @@ bool is_same_interfaces(const char *a_iniface, const char *a_outiface,
 			unsigned const char *b_iniface_mask,
 			unsigned const char *b_outiface_mask);
 
-int parse_meta(struct nftnl_expr *e, uint8_t key, char *iniface,
-		unsigned char *iniface_mask, char *outiface,
-		unsigned char *outiface_mask, uint8_t *invflags);
+int parse_meta(struct nft_xt_ctx *ctx, struct nftnl_expr *e, uint8_t key,
+	       char *iniface, unsigned char *iniface_mask, char *outiface,
+	       unsigned char *outiface_mask, uint8_t *invflags);
 void get_cmp_data(struct nftnl_expr *e, void *data, size_t dlen, bool *inv);
 void nft_rule_to_iptables_command_state(struct nft_handle *h,
 					const struct nftnl_rule *r,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH iptables 3/7] nft: native mark matching support
  2022-04-24 21:56 [PATCH iptables 0/7] support for dynamic register allocation Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 1/7] nft-shared: update context register for bitwise expression Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 2/7] nft: pass struct nft_xt_ctx to parse_meta() Pablo Neira Ayuso
@ 2022-04-24 21:56 ` Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 4/7] nft: pass handle to helper functions to build netlink payload Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-24 21:56 UTC (permalink / raw)
  To: netfilter-devel

Use meta mark + bitwise + cmp instead of nft_compat mark match.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft-shared.c | 36 ++++++++++++++++++++++++++++++++++++
 iptables/nft.c        | 23 +++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/iptables/nft-shared.c b/iptables/nft-shared.c
index 5b13b29c9844..54a911801639 100644
--- a/iptables/nft-shared.c
+++ b/iptables/nft-shared.c
@@ -24,6 +24,7 @@
 #include <linux/netfilter/xt_comment.h>
 #include <linux/netfilter/xt_limit.h>
 #include <linux/netfilter/xt_NFLOG.h>
+#include <linux/netfilter/xt_mark.h>
 
 #include <libmnl/libmnl.h>
 #include <libnftnl/rule.h>
@@ -261,6 +262,38 @@ static void parse_ifname(const char *name, unsigned int len, char *dst, unsigned
 		memset(mask, 0xff, len - 2);
 }
 
+static struct xtables_match *
+nft_create_match(struct nft_xt_ctx *ctx,
+		 struct iptables_command_state *cs,
+		 const char *name);
+
+static int parse_meta_mark(struct nft_xt_ctx *ctx, struct nftnl_expr *e)
+{
+	struct xt_mark_mtinfo1 *mark;
+	struct xtables_match *match;
+	uint32_t value;
+
+	match = nft_create_match(ctx, ctx->cs, "mark");
+	if (!match)
+		return -1;
+
+	mark = (void*)match->m->data;
+
+	if (nftnl_expr_get_u32(e, NFTNL_EXPR_CMP_OP) == NFT_CMP_NEQ)
+		mark->invert = 1;
+
+	value = nftnl_expr_get_u32(e, NFTNL_EXPR_CMP_DATA);
+	mark->mark = value;
+	if (ctx->flags & NFT_XT_CTX_BITWISE) {
+		memcpy(&mark->mask, &ctx->bitwise.mask, sizeof(mark->mask));
+		ctx->flags &= ~NFT_XT_CTX_BITWISE;
+	} else {
+		mark->mask = 0xffffffff;
+	}
+
+	return 0;
+}
+
 int parse_meta(struct nft_xt_ctx *ctx, struct nftnl_expr *e, uint8_t key,
 	       char *iniface, unsigned char *iniface_mask,
 	       char *outiface, unsigned char *outiface_mask, uint8_t *invflags)
@@ -304,6 +337,9 @@ int parse_meta(struct nft_xt_ctx *ctx, struct nftnl_expr *e, uint8_t key,
 
 		parse_ifname(ifname, len, outiface, outiface_mask);
 		break;
+	case NFT_META_MARK:
+		parse_meta_mark(ctx, e);
+		break;
 	default:
 		return -1;
 	}
diff --git a/iptables/nft.c b/iptables/nft.c
index 6883662fa28d..a629aeff98b0 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -40,6 +40,7 @@
 
 #include <linux/netfilter/xt_limit.h>
 #include <linux/netfilter/xt_NFLOG.h>
+#include <linux/netfilter/xt_mark.h>
 
 #include <libmnl/libmnl.h>
 #include <libnftnl/gen.h>
@@ -1406,6 +1407,26 @@ static int add_nft_tcp(struct nftnl_rule *r, struct xt_entry_match *m)
 			      tcp->dpts, tcp->invflags & XT_TCP_INV_DSTPT);
 }
 
+static int add_nft_mark(struct nft_handle *h, struct nftnl_rule *r,
+			struct xt_entry_match *m)
+{
+	struct xt_mark_mtinfo1 *mark = (void *)m->data;
+	int op;
+
+	add_meta(r, NFT_META_MARK);
+	if (mark->mask != 0xffffffff)
+		add_bitwise(r, (uint8_t *)&mark->mask, sizeof(uint32_t));
+
+	if (mark->invert)
+		op = NFT_CMP_NEQ;
+	else
+		op = NFT_CMP_EQ;
+
+	add_cmp_u32(r, mark->mark, op);
+
+	return 0;
+}
+
 int add_match(struct nft_handle *h,
 	      struct nftnl_rule *r, struct xt_entry_match *m)
 {
@@ -1420,6 +1441,8 @@ int add_match(struct nft_handle *h,
 		return add_nft_udp(r, m);
 	else if (!strcmp(m->u.user.name, "tcp"))
 		return add_nft_tcp(r, m);
+	else if (!strcmp(m->u.user.name, "mark"))
+		return add_nft_mark(h, r, m);
 
 	expr = nftnl_expr_alloc("match");
 	if (expr == NULL)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH iptables 4/7] nft: pass handle to helper functions to build netlink payload
  2022-04-24 21:56 [PATCH iptables 0/7] support for dynamic register allocation Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2022-04-24 21:56 ` [PATCH iptables 3/7] nft: native mark matching support Pablo Neira Ayuso
@ 2022-04-24 21:56 ` Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 5/7] nft: prepare for dynamic register allocation Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-24 21:56 UTC (permalink / raw)
  To: netfilter-devel

Pass struct nft_handle to helper functions in preparation for the
dynamic register allocation.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft-arp.c    | 22 +++++++++++-----------
 iptables/nft-bridge.c | 24 +++++++++++++-----------
 iptables/nft-ipv4.c   | 12 ++++++------
 iptables/nft-ipv6.c   | 10 +++++-----
 iptables/nft-shared.c | 31 ++++++++++++++++++-------------
 iptables/nft-shared.h | 14 +++++++-------
 iptables/nft.c        | 26 +++++++++++++-------------
 7 files changed, 73 insertions(+), 66 deletions(-)

diff --git a/iptables/nft-arp.c b/iptables/nft-arp.c
index 89e6413441e2..8c5ce3525dd5 100644
--- a/iptables/nft-arp.c
+++ b/iptables/nft-arp.c
@@ -63,18 +63,18 @@ static int nft_arp_add(struct nft_handle *h, struct nftnl_rule *r,
 
 	if (fw->arp.iniface[0] != '\0') {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_VIA_IN);
-		add_iniface(r, fw->arp.iniface, op);
+		add_iniface(h, r, fw->arp.iniface, op);
 	}
 
 	if (fw->arp.outiface[0] != '\0') {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_VIA_OUT);
-		add_outiface(r, fw->arp.outiface, op);
+		add_outiface(h, r, fw->arp.outiface, op);
 	}
 
 	if (fw->arp.arhrd != 0 ||
 	    fw->arp.invflags & IPT_INV_ARPHRD) {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_ARPHRD);
-		add_payload(r, offsetof(struct arphdr, ar_hrd), 2,
+		add_payload(h, r, offsetof(struct arphdr, ar_hrd), 2,
 			    NFT_PAYLOAD_NETWORK_HEADER);
 		add_cmp_u16(r, fw->arp.arhrd, op);
 	}
@@ -82,7 +82,7 @@ static int nft_arp_add(struct nft_handle *h, struct nftnl_rule *r,
 	if (fw->arp.arpro != 0 ||
 	    fw->arp.invflags & IPT_INV_PROTO) {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_PROTO);
-	        add_payload(r, offsetof(struct arphdr, ar_pro), 2,
+	        add_payload(h, r, offsetof(struct arphdr, ar_pro), 2,
 			    NFT_PAYLOAD_NETWORK_HEADER);
 		add_cmp_u16(r, fw->arp.arpro, op);
 	}
@@ -90,23 +90,23 @@ static int nft_arp_add(struct nft_handle *h, struct nftnl_rule *r,
 	if (fw->arp.arhln != 0 ||
 	    fw->arp.invflags & IPT_INV_ARPHLN) {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_ARPHLN);
-		add_proto(r, offsetof(struct arphdr, ar_hln), 1,
+		add_proto(h, r, offsetof(struct arphdr, ar_hln), 1,
 			  fw->arp.arhln, op);
 	}
 
-	add_proto(r, offsetof(struct arphdr, ar_pln), 1, 4, NFT_CMP_EQ);
+	add_proto(h, r, offsetof(struct arphdr, ar_pln), 1, 4, NFT_CMP_EQ);
 
 	if (fw->arp.arpop != 0 ||
 	    fw->arp.invflags & IPT_INV_ARPOP) {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_ARPOP);
-		add_payload(r, offsetof(struct arphdr, ar_op), 2,
+		add_payload(h, r, offsetof(struct arphdr, ar_op), 2,
 			    NFT_PAYLOAD_NETWORK_HEADER);
 		add_cmp_u16(r, fw->arp.arpop, op);
 	}
 
 	if (need_devaddr(&fw->arp.src_devaddr)) {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_SRCDEVADDR);
-		add_addr(r, NFT_PAYLOAD_NETWORK_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_NETWORK_HEADER,
 			 sizeof(struct arphdr),
 			 &fw->arp.src_devaddr.addr,
 			 &fw->arp.src_devaddr.mask,
@@ -118,7 +118,7 @@ static int nft_arp_add(struct nft_handle *h, struct nftnl_rule *r,
 	    fw->arp.smsk.s_addr != 0 ||
 	    fw->arp.invflags & IPT_INV_SRCIP) {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_SRCIP);
-		add_addr(r, NFT_PAYLOAD_NETWORK_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_NETWORK_HEADER,
 			 sizeof(struct arphdr) + fw->arp.arhln,
 			 &fw->arp.src.s_addr, &fw->arp.smsk.s_addr,
 			 sizeof(struct in_addr), op);
@@ -127,7 +127,7 @@ static int nft_arp_add(struct nft_handle *h, struct nftnl_rule *r,
 
 	if (need_devaddr(&fw->arp.tgt_devaddr)) {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_TGTDEVADDR);
-		add_addr(r, NFT_PAYLOAD_NETWORK_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_NETWORK_HEADER,
 			 sizeof(struct arphdr) + fw->arp.arhln + sizeof(struct in_addr),
 			 &fw->arp.tgt_devaddr.addr,
 			 &fw->arp.tgt_devaddr.mask,
@@ -138,7 +138,7 @@ static int nft_arp_add(struct nft_handle *h, struct nftnl_rule *r,
 	    fw->arp.tmsk.s_addr != 0 ||
 	    fw->arp.invflags & IPT_INV_DSTIP) {
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_DSTIP);
-		add_addr(r, NFT_PAYLOAD_NETWORK_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_NETWORK_HEADER,
 			 sizeof(struct arphdr) + fw->arp.arhln + sizeof(struct in_addr) + fw->arp.arhln,
 			 &fw->arp.tgt.s_addr, &fw->arp.tmsk.s_addr,
 			 sizeof(struct in_addr), op);
diff --git a/iptables/nft-bridge.c b/iptables/nft-bridge.c
index 097ef6e16827..888d4b6baa57 100644
--- a/iptables/nft-bridge.c
+++ b/iptables/nft-bridge.c
@@ -65,26 +65,28 @@ static void ebt_print_mac_and_mask(const unsigned char *mac, const unsigned char
 		xtables_print_mac_and_mask(mac, mask);
 }
 
-static void add_logical_iniface(struct nftnl_rule *r, char *iface, uint32_t op)
+static void add_logical_iniface(struct nft_handle *h, struct nftnl_rule *r,
+				char *iface, uint32_t op)
 {
 	int iface_len;
 
 	iface_len = strlen(iface);
 
-	add_meta(r, NFT_META_BRI_IIFNAME);
+	add_meta(h, r, NFT_META_BRI_IIFNAME);
 	if (iface[iface_len - 1] == '+')
 		add_cmp_ptr(r, op, iface, iface_len - 1);
 	else
 		add_cmp_ptr(r, op, iface, iface_len + 1);
 }
 
-static void add_logical_outiface(struct nftnl_rule *r, char *iface, uint32_t op)
+static void add_logical_outiface(struct nft_handle *h, struct nftnl_rule *r,
+				 char *iface, uint32_t op)
 {
 	int iface_len;
 
 	iface_len = strlen(iface);
 
-	add_meta(r, NFT_META_BRI_OIFNAME);
+	add_meta(h, r, NFT_META_BRI_OIFNAME);
 	if (iface[iface_len - 1] == '+')
 		add_cmp_ptr(r, op, iface, iface_len - 1);
 	else
@@ -106,41 +108,41 @@ static int nft_bridge_add(struct nft_handle *h,
 
 	if (fw->in[0] != '\0') {
 		op = nft_invflags2cmp(fw->invflags, EBT_IIN);
-		add_iniface(r, fw->in, op);
+		add_iniface(h, r, fw->in, op);
 	}
 
 	if (fw->out[0] != '\0') {
 		op = nft_invflags2cmp(fw->invflags, EBT_IOUT);
-		add_outiface(r, fw->out, op);
+		add_outiface(h, r, fw->out, op);
 	}
 
 	if (fw->logical_in[0] != '\0') {
 		op = nft_invflags2cmp(fw->invflags, EBT_ILOGICALIN);
-		add_logical_iniface(r, fw->logical_in, op);
+		add_logical_iniface(h, r, fw->logical_in, op);
 	}
 
 	if (fw->logical_out[0] != '\0') {
 		op = nft_invflags2cmp(fw->invflags, EBT_ILOGICALOUT);
-		add_logical_outiface(r, fw->logical_out, op);
+		add_logical_outiface(h, r, fw->logical_out, op);
 	}
 
 	if (fw->bitmask & EBT_ISOURCE) {
 		op = nft_invflags2cmp(fw->invflags, EBT_ISOURCE);
-		add_addr(r, NFT_PAYLOAD_LL_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_LL_HEADER,
 			 offsetof(struct ethhdr, h_source),
 			 fw->sourcemac, fw->sourcemsk, ETH_ALEN, op);
 	}
 
 	if (fw->bitmask & EBT_IDEST) {
 		op = nft_invflags2cmp(fw->invflags, EBT_IDEST);
-		add_addr(r, NFT_PAYLOAD_LL_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_LL_HEADER,
 			 offsetof(struct ethhdr, h_dest),
 			 fw->destmac, fw->destmsk, ETH_ALEN, op);
 	}
 
 	if ((fw->bitmask & EBT_NOPROTO) == 0) {
 		op = nft_invflags2cmp(fw->invflags, EBT_IPROTO);
-		add_payload(r, offsetof(struct ethhdr, h_proto), 2,
+		add_payload(h, r, offsetof(struct ethhdr, h_proto), 2,
 			    NFT_PAYLOAD_LL_HEADER);
 		add_cmp_u16(r, fw->ethproto, op);
 	}
diff --git a/iptables/nft-ipv4.c b/iptables/nft-ipv4.c
index cf03edfae9ac..76a0e0de378c 100644
--- a/iptables/nft-ipv4.c
+++ b/iptables/nft-ipv4.c
@@ -35,35 +35,35 @@ static int nft_ipv4_add(struct nft_handle *h, struct nftnl_rule *r,
 
 	if (cs->fw.ip.iniface[0] != '\0') {
 		op = nft_invflags2cmp(cs->fw.ip.invflags, IPT_INV_VIA_IN);
-		add_iniface(r, cs->fw.ip.iniface, op);
+		add_iniface(h, r, cs->fw.ip.iniface, op);
 	}
 
 	if (cs->fw.ip.outiface[0] != '\0') {
 		op = nft_invflags2cmp(cs->fw.ip.invflags, IPT_INV_VIA_OUT);
-		add_outiface(r, cs->fw.ip.outiface, op);
+		add_outiface(h, r, cs->fw.ip.outiface, op);
 	}
 
 	if (cs->fw.ip.proto != 0) {
 		op = nft_invflags2cmp(cs->fw.ip.invflags, XT_INV_PROTO);
-		add_l4proto(r, cs->fw.ip.proto, op);
+		add_l4proto(h, r, cs->fw.ip.proto, op);
 	}
 
 	if (cs->fw.ip.src.s_addr || cs->fw.ip.smsk.s_addr || cs->fw.ip.invflags & IPT_INV_SRCIP) {
 		op = nft_invflags2cmp(cs->fw.ip.invflags, IPT_INV_SRCIP);
-		add_addr(r, NFT_PAYLOAD_NETWORK_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_NETWORK_HEADER,
 			 offsetof(struct iphdr, saddr),
 			 &cs->fw.ip.src.s_addr, &cs->fw.ip.smsk.s_addr,
 			 sizeof(struct in_addr), op);
 	}
 	if (cs->fw.ip.dst.s_addr || cs->fw.ip.dmsk.s_addr || cs->fw.ip.invflags & IPT_INV_DSTIP) {
 		op = nft_invflags2cmp(cs->fw.ip.invflags, IPT_INV_DSTIP);
-		add_addr(r, NFT_PAYLOAD_NETWORK_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_NETWORK_HEADER,
 			 offsetof(struct iphdr, daddr),
 			 &cs->fw.ip.dst.s_addr, &cs->fw.ip.dmsk.s_addr,
 			 sizeof(struct in_addr), op);
 	}
 	if (cs->fw.ip.flags & IPT_F_FRAG) {
-		add_payload(r, offsetof(struct iphdr, frag_off), 2,
+		add_payload(h, r, offsetof(struct iphdr, frag_off), 2,
 			    NFT_PAYLOAD_NETWORK_HEADER);
 		/* get the 13 bits that contain the fragment offset */
 		add_bitwise_u16(r, htons(0x1fff), 0);
diff --git a/iptables/nft-ipv6.c b/iptables/nft-ipv6.c
index 5b767a4059e6..9a29d18bc215 100644
--- a/iptables/nft-ipv6.c
+++ b/iptables/nft-ipv6.c
@@ -34,24 +34,24 @@ static int nft_ipv6_add(struct nft_handle *h, struct nftnl_rule *r,
 
 	if (cs->fw6.ipv6.iniface[0] != '\0') {
 		op = nft_invflags2cmp(cs->fw6.ipv6.invflags, IPT_INV_VIA_IN);
-		add_iniface(r, cs->fw6.ipv6.iniface, op);
+		add_iniface(h, r, cs->fw6.ipv6.iniface, op);
 	}
 
 	if (cs->fw6.ipv6.outiface[0] != '\0') {
 		op = nft_invflags2cmp(cs->fw6.ipv6.invflags, IPT_INV_VIA_OUT);
-		add_outiface(r, cs->fw6.ipv6.outiface, op);
+		add_outiface(h, r, cs->fw6.ipv6.outiface, op);
 	}
 
 	if (cs->fw6.ipv6.proto != 0) {
 		op = nft_invflags2cmp(cs->fw6.ipv6.invflags, XT_INV_PROTO);
-		add_l4proto(r, cs->fw6.ipv6.proto, op);
+		add_l4proto(h, r, cs->fw6.ipv6.proto, op);
 	}
 
 	if (!IN6_IS_ADDR_UNSPECIFIED(&cs->fw6.ipv6.src) ||
 	    !IN6_IS_ADDR_UNSPECIFIED(&cs->fw6.ipv6.smsk) ||
 	    (cs->fw6.ipv6.invflags & IPT_INV_SRCIP)) {
 		op = nft_invflags2cmp(cs->fw6.ipv6.invflags, IPT_INV_SRCIP);
-		add_addr(r, NFT_PAYLOAD_NETWORK_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_NETWORK_HEADER,
 			 offsetof(struct ip6_hdr, ip6_src),
 			 &cs->fw6.ipv6.src, &cs->fw6.ipv6.smsk,
 			 sizeof(struct in6_addr), op);
@@ -60,7 +60,7 @@ static int nft_ipv6_add(struct nft_handle *h, struct nftnl_rule *r,
 	    !IN6_IS_ADDR_UNSPECIFIED(&cs->fw6.ipv6.dmsk) ||
 	    (cs->fw6.ipv6.invflags & IPT_INV_DSTIP)) {
 		op = nft_invflags2cmp(cs->fw6.ipv6.invflags, IPT_INV_DSTIP);
-		add_addr(r, NFT_PAYLOAD_NETWORK_HEADER,
+		add_addr(h, r, NFT_PAYLOAD_NETWORK_HEADER,
 			 offsetof(struct ip6_hdr, ip6_dst),
 			 &cs->fw6.ipv6.dst, &cs->fw6.ipv6.dmsk,
 			 sizeof(struct in6_addr), op);
diff --git a/iptables/nft-shared.c b/iptables/nft-shared.c
index 54a911801639..52821684445b 100644
--- a/iptables/nft-shared.c
+++ b/iptables/nft-shared.c
@@ -40,7 +40,7 @@ extern struct nft_family_ops nft_family_ops_ipv6;
 extern struct nft_family_ops nft_family_ops_arp;
 extern struct nft_family_ops nft_family_ops_bridge;
 
-void add_meta(struct nftnl_rule *r, uint32_t key)
+void add_meta(struct nft_handle *h, struct nftnl_rule *r, uint32_t key)
 {
 	struct nftnl_expr *expr;
 
@@ -54,7 +54,8 @@ void add_meta(struct nftnl_rule *r, uint32_t key)
 	nftnl_rule_add_expr(r, expr);
 }
 
-void add_payload(struct nftnl_rule *r, int offset, int len, uint32_t base)
+void add_payload(struct nft_handle *h, struct nftnl_rule *r,
+		 int offset, int len, uint32_t base)
 {
 	struct nftnl_expr *expr;
 
@@ -136,13 +137,14 @@ void add_cmp_u32(struct nftnl_rule *r, uint32_t val, uint32_t op)
 	add_cmp_ptr(r, op, &val, sizeof(val));
 }
 
-void add_iniface(struct nftnl_rule *r, char *iface, uint32_t op)
+void add_iniface(struct nft_handle *h, struct nftnl_rule *r,
+		 char *iface, uint32_t op)
 {
 	int iface_len;
 
 	iface_len = strlen(iface);
 
-	add_meta(r, NFT_META_IIFNAME);
+	add_meta(h, r, NFT_META_IIFNAME);
 	if (iface[iface_len - 1] == '+') {
 		if (iface_len > 1)
 			add_cmp_ptr(r, op, iface, iface_len - 1);
@@ -150,13 +152,14 @@ void add_iniface(struct nftnl_rule *r, char *iface, uint32_t op)
 		add_cmp_ptr(r, op, iface, iface_len + 1);
 }
 
-void add_outiface(struct nftnl_rule *r, char *iface, uint32_t op)
+void add_outiface(struct nft_handle *h, struct nftnl_rule *r,
+		  char *iface, uint32_t op)
 {
 	int iface_len;
 
 	iface_len = strlen(iface);
 
-	add_meta(r, NFT_META_OIFNAME);
+	add_meta(h, r, NFT_META_OIFNAME);
 	if (iface[iface_len - 1] == '+') {
 		if (iface_len > 1)
 			add_cmp_ptr(r, op, iface, iface_len - 1);
@@ -164,7 +167,8 @@ void add_outiface(struct nftnl_rule *r, char *iface, uint32_t op)
 		add_cmp_ptr(r, op, iface, iface_len + 1);
 }
 
-void add_addr(struct nftnl_rule *r, enum nft_payload_bases base, int offset,
+void add_addr(struct nft_handle *h, struct nftnl_rule *r,
+	      enum nft_payload_bases base, int offset,
 	      void *data, void *mask, size_t len, uint32_t op)
 {
 	const unsigned char *m = mask;
@@ -183,7 +187,7 @@ void add_addr(struct nftnl_rule *r, enum nft_payload_bases base, int offset,
 	if (!bitwise)
 		len = i;
 
-	add_payload(r, offset, len, base);
+	add_payload(h, r, offset, len, base);
 
 	if (bitwise)
 		add_bitwise(r, mask, len);
@@ -191,16 +195,17 @@ void add_addr(struct nftnl_rule *r, enum nft_payload_bases base, int offset,
 	add_cmp_ptr(r, op, data, len);
 }
 
-void add_proto(struct nftnl_rule *r, int offset, size_t len,
-	       uint8_t proto, uint32_t op)
+void add_proto(struct nft_handle *h, struct nftnl_rule *r,
+	       int offset, size_t len, uint8_t proto, uint32_t op)
 {
-	add_payload(r, offset, len, NFT_PAYLOAD_NETWORK_HEADER);
+	add_payload(h, r, offset, len, NFT_PAYLOAD_NETWORK_HEADER);
 	add_cmp_u8(r, proto, op);
 }
 
-void add_l4proto(struct nftnl_rule *r, uint8_t proto, uint32_t op)
+void add_l4proto(struct nft_handle *h, struct nftnl_rule *r,
+		 uint8_t proto, uint32_t op)
 {
-	add_meta(r, NFT_META_L4PROTO);
+	add_meta(h, r, NFT_META_L4PROTO);
 	add_cmp_u8(r, proto, op);
 }
 
diff --git a/iptables/nft-shared.h b/iptables/nft-shared.h
index 092958cd67fa..0bdb6848d853 100644
--- a/iptables/nft-shared.h
+++ b/iptables/nft-shared.h
@@ -132,21 +132,21 @@ struct nft_family_ops {
 			     int rulenum);
 };
 
-void add_meta(struct nftnl_rule *r, uint32_t key);
-void add_payload(struct nftnl_rule *r, int offset, int len, uint32_t base);
+void add_meta(struct nft_handle *h, struct nftnl_rule *r, uint32_t key);
+void add_payload(struct nft_handle *h, struct nftnl_rule *r, int offset, int len, uint32_t base);
 void add_bitwise(struct nftnl_rule *r, uint8_t *mask, size_t len);
 void add_bitwise_u16(struct nftnl_rule *r, uint16_t mask, uint16_t xor);
 void add_cmp_ptr(struct nftnl_rule *r, uint32_t op, void *data, size_t len);
 void add_cmp_u8(struct nftnl_rule *r, uint8_t val, uint32_t op);
 void add_cmp_u16(struct nftnl_rule *r, uint16_t val, uint32_t op);
 void add_cmp_u32(struct nftnl_rule *r, uint32_t val, uint32_t op);
-void add_iniface(struct nftnl_rule *r, char *iface, uint32_t op);
-void add_outiface(struct nftnl_rule *r, char *iface, uint32_t op);
-void add_addr(struct nftnl_rule *r, enum nft_payload_bases base, int offset,
+void add_iniface(struct nft_handle *h, struct nftnl_rule *r, char *iface, uint32_t op);
+void add_outiface(struct nft_handle *h, struct nftnl_rule *r, char *iface, uint32_t op);
+void add_addr(struct nft_handle *h, struct nftnl_rule *r, enum nft_payload_bases base, int offset,
 	      void *data, void *mask, size_t len, uint32_t op);
-void add_proto(struct nftnl_rule *r, int offset, size_t len,
+void add_proto(struct nft_handle *h, struct nftnl_rule *r, int offset, size_t len,
 	       uint8_t proto, uint32_t op);
-void add_l4proto(struct nftnl_rule *r, uint8_t proto, uint32_t op);
+void add_l4proto(struct nft_handle *h, struct nftnl_rule *r, uint8_t proto, uint32_t op);
 void add_compat(struct nftnl_rule *r, uint32_t proto, bool inv);
 
 bool is_same_interfaces(const char *a_iniface, const char *a_outiface,
diff --git a/iptables/nft.c b/iptables/nft.c
index a629aeff98b0..987b3c957b98 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -1216,7 +1216,7 @@ static int add_nft_among(struct nft_handle *h,
 	    (data->dst.cnt && data->dst.ip)) {
 		uint16_t eth_p_ip = htons(ETH_P_IP);
 
-		add_meta(r, NFT_META_PROTOCOL);
+		add_meta(h, r, NFT_META_PROTOCOL);
 		add_cmp_ptr(r, NFT_CMP_EQ, &eth_p_ip, 2);
 	}
 
@@ -1263,11 +1263,9 @@ static int expr_gen_range_cmp16(struct nftnl_rule *r,
 	return 0;
 }
 
-static int add_nft_tcpudp(struct nftnl_rule *r,
-			  uint16_t src[2],
-			  bool invert_src,
-			  uint16_t dst[2],
-			  bool invert_dst)
+static int add_nft_tcpudp(struct nft_handle *h,struct nftnl_rule *r,
+			  uint16_t src[2], bool invert_src,
+			  uint16_t dst[2], bool invert_dst)
 {
 	struct nftnl_expr *expr;
 	uint8_t op = NFT_CMP_EQ;
@@ -1332,7 +1330,8 @@ static bool udp_all_zero(const struct xt_udp *u)
 	return memcmp(u, &zero, sizeof(*u)) == 0;
 }
 
-static int add_nft_udp(struct nftnl_rule *r, struct xt_entry_match *m)
+static int add_nft_udp(struct nft_handle *h, struct nftnl_rule *r,
+		       struct xt_entry_match *m)
 {
 	struct xt_udp *udp = (void *)m->data;
 
@@ -1346,7 +1345,7 @@ static int add_nft_udp(struct nftnl_rule *r, struct xt_entry_match *m)
 		return ret;
 	}
 
-	return add_nft_tcpudp(r, udp->spts, udp->invflags & XT_UDP_INV_SRCPT,
+	return add_nft_tcpudp(h, r, udp->spts, udp->invflags & XT_UDP_INV_SRCPT,
 			      udp->dpts, udp->invflags & XT_UDP_INV_DSTPT);
 }
 
@@ -1380,7 +1379,8 @@ static bool tcp_all_zero(const struct xt_tcp *t)
 	return memcmp(t, &zero, sizeof(*t)) == 0;
 }
 
-static int add_nft_tcp(struct nftnl_rule *r, struct xt_entry_match *m)
+static int add_nft_tcp(struct nft_handle *h, struct nftnl_rule *r,
+		       struct xt_entry_match *m)
 {
 	static const uint8_t supported = XT_TCP_INV_SRCPT | XT_TCP_INV_DSTPT | XT_TCP_INV_FLAGS;
 	struct xt_tcp *tcp = (void *)m->data;
@@ -1403,7 +1403,7 @@ static int add_nft_tcp(struct nftnl_rule *r, struct xt_entry_match *m)
 			return ret;
 	}
 
-	return add_nft_tcpudp(r, tcp->spts, tcp->invflags & XT_TCP_INV_SRCPT,
+	return add_nft_tcpudp(h, r, tcp->spts, tcp->invflags & XT_TCP_INV_SRCPT,
 			      tcp->dpts, tcp->invflags & XT_TCP_INV_DSTPT);
 }
 
@@ -1413,7 +1413,7 @@ static int add_nft_mark(struct nft_handle *h, struct nftnl_rule *r,
 	struct xt_mark_mtinfo1 *mark = (void *)m->data;
 	int op;
 
-	add_meta(r, NFT_META_MARK);
+	add_meta(h, r, NFT_META_MARK);
 	if (mark->mask != 0xffffffff)
 		add_bitwise(r, (uint8_t *)&mark->mask, sizeof(uint32_t));
 
@@ -1438,9 +1438,9 @@ int add_match(struct nft_handle *h,
 	else if (!strcmp(m->u.user.name, "among"))
 		return add_nft_among(h, r, m);
 	else if (!strcmp(m->u.user.name, "udp"))
-		return add_nft_udp(r, m);
+		return add_nft_udp(h, r, m);
 	else if (!strcmp(m->u.user.name, "tcp"))
-		return add_nft_tcp(r, m);
+		return add_nft_tcp(h, r, m);
 	else if (!strcmp(m->u.user.name, "mark"))
 		return add_nft_mark(h, r, m);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH iptables 5/7] nft: prepare for dynamic register allocation
  2022-04-24 21:56 [PATCH iptables 0/7] support for dynamic register allocation Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2022-04-24 21:56 ` [PATCH iptables 4/7] nft: pass handle to helper functions to build netlink payload Pablo Neira Ayuso
@ 2022-04-24 21:56 ` Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 6/7] nft: split gen_payload() to allocate register and initialize expression Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 7/7] nft: support for dynamic register allocation Pablo Neira Ayuso
  6 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-24 21:56 UTC (permalink / raw)
  To: netfilter-devel

Store the register that has been allocated and pass it on to the next
expression. NFT_REG_1 is still used.

No functional changes are expected.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft-arp.c    | 18 +++++---
 iptables/nft-bridge.c | 20 +++++----
 iptables/nft-ipv4.c   |  8 ++--
 iptables/nft-shared.c | 99 ++++++++++++++++++++++++++-----------------
 iptables/nft-shared.h | 16 +++----
 iptables/nft.c        | 70 ++++++++++++++++--------------
 6 files changed, 137 insertions(+), 94 deletions(-)

diff --git a/iptables/nft-arp.c b/iptables/nft-arp.c
index 8c5ce3525dd5..65bd965eb69f 100644
--- a/iptables/nft-arp.c
+++ b/iptables/nft-arp.c
@@ -73,18 +73,22 @@ static int nft_arp_add(struct nft_handle *h, struct nftnl_rule *r,
 
 	if (fw->arp.arhrd != 0 ||
 	    fw->arp.invflags & IPT_INV_ARPHRD) {
+		uint8_t reg;
+
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_ARPHRD);
 		add_payload(h, r, offsetof(struct arphdr, ar_hrd), 2,
-			    NFT_PAYLOAD_NETWORK_HEADER);
-		add_cmp_u16(r, fw->arp.arhrd, op);
+			    NFT_PAYLOAD_NETWORK_HEADER, &reg);
+		add_cmp_u16(r, fw->arp.arhrd, op, reg);
 	}
 
 	if (fw->arp.arpro != 0 ||
 	    fw->arp.invflags & IPT_INV_PROTO) {
+		uint8_t reg;
+
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_PROTO);
 	        add_payload(h, r, offsetof(struct arphdr, ar_pro), 2,
-			    NFT_PAYLOAD_NETWORK_HEADER);
-		add_cmp_u16(r, fw->arp.arpro, op);
+			    NFT_PAYLOAD_NETWORK_HEADER, &reg);
+		add_cmp_u16(r, fw->arp.arpro, op, reg);
 	}
 
 	if (fw->arp.arhln != 0 ||
@@ -98,10 +102,12 @@ static int nft_arp_add(struct nft_handle *h, struct nftnl_rule *r,
 
 	if (fw->arp.arpop != 0 ||
 	    fw->arp.invflags & IPT_INV_ARPOP) {
+		uint8_t reg;
+
 		op = nft_invflags2cmp(fw->arp.invflags, IPT_INV_ARPOP);
 		add_payload(h, r, offsetof(struct arphdr, ar_op), 2,
-			    NFT_PAYLOAD_NETWORK_HEADER);
-		add_cmp_u16(r, fw->arp.arpop, op);
+			    NFT_PAYLOAD_NETWORK_HEADER, &reg);
+		add_cmp_u16(r, fw->arp.arpop, op, reg);
 	}
 
 	if (need_devaddr(&fw->arp.src_devaddr)) {
diff --git a/iptables/nft-bridge.c b/iptables/nft-bridge.c
index 888d4b6baa57..106bcc72889f 100644
--- a/iptables/nft-bridge.c
+++ b/iptables/nft-bridge.c
@@ -69,28 +69,30 @@ static void add_logical_iniface(struct nft_handle *h, struct nftnl_rule *r,
 				char *iface, uint32_t op)
 {
 	int iface_len;
+	uint8_t reg;
 
 	iface_len = strlen(iface);
 
-	add_meta(h, r, NFT_META_BRI_IIFNAME);
+	add_meta(h, r, NFT_META_BRI_IIFNAME, &reg);
 	if (iface[iface_len - 1] == '+')
-		add_cmp_ptr(r, op, iface, iface_len - 1);
+		add_cmp_ptr(r, op, iface, iface_len - 1, reg);
 	else
-		add_cmp_ptr(r, op, iface, iface_len + 1);
+		add_cmp_ptr(r, op, iface, iface_len + 1, reg);
 }
 
 static void add_logical_outiface(struct nft_handle *h, struct nftnl_rule *r,
 				 char *iface, uint32_t op)
 {
 	int iface_len;
+	uint8_t reg;
 
 	iface_len = strlen(iface);
 
-	add_meta(h, r, NFT_META_BRI_OIFNAME);
+	add_meta(h, r, NFT_META_BRI_OIFNAME, &reg);
 	if (iface[iface_len - 1] == '+')
-		add_cmp_ptr(r, op, iface, iface_len - 1);
+		add_cmp_ptr(r, op, iface, iface_len - 1, reg);
 	else
-		add_cmp_ptr(r, op, iface, iface_len + 1);
+		add_cmp_ptr(r, op, iface, iface_len + 1, reg);
 }
 
 static int _add_action(struct nftnl_rule *r, struct iptables_command_state *cs)
@@ -141,10 +143,12 @@ static int nft_bridge_add(struct nft_handle *h,
 	}
 
 	if ((fw->bitmask & EBT_NOPROTO) == 0) {
+		uint8_t reg;
+
 		op = nft_invflags2cmp(fw->invflags, EBT_IPROTO);
 		add_payload(h, r, offsetof(struct ethhdr, h_proto), 2,
-			    NFT_PAYLOAD_LL_HEADER);
-		add_cmp_u16(r, fw->ethproto, op);
+			    NFT_PAYLOAD_LL_HEADER, &reg);
+		add_cmp_u16(r, fw->ethproto, op, reg);
 	}
 
 	add_compat(r, fw->ethproto, fw->invflags & EBT_IPROTO);
diff --git a/iptables/nft-ipv4.c b/iptables/nft-ipv4.c
index 76a0e0de378c..59c4a41f1a05 100644
--- a/iptables/nft-ipv4.c
+++ b/iptables/nft-ipv4.c
@@ -63,17 +63,19 @@ static int nft_ipv4_add(struct nft_handle *h, struct nftnl_rule *r,
 			 sizeof(struct in_addr), op);
 	}
 	if (cs->fw.ip.flags & IPT_F_FRAG) {
+		uint8_t reg;
+
 		add_payload(h, r, offsetof(struct iphdr, frag_off), 2,
-			    NFT_PAYLOAD_NETWORK_HEADER);
+			    NFT_PAYLOAD_NETWORK_HEADER, &reg);
 		/* get the 13 bits that contain the fragment offset */
-		add_bitwise_u16(r, htons(0x1fff), 0);
+		add_bitwise_u16(h, r, htons(0x1fff), 0, reg, &reg);
 
 		/* if offset is non-zero, this is a fragment */
 		op = NFT_CMP_NEQ;
 		if (cs->fw.ip.invflags & IPT_INV_FRAG)
 			op = NFT_CMP_EQ;
 
-		add_cmp_u16(r, 0, op);
+		add_cmp_u16(r, 0, op, reg);
 	}
 
 	add_compat(r, cs->fw.ip.proto, cs->fw.ip.invflags & XT_INV_PROTO);
diff --git a/iptables/nft-shared.c b/iptables/nft-shared.c
index 52821684445b..27e95c1ae4f3 100644
--- a/iptables/nft-shared.c
+++ b/iptables/nft-shared.c
@@ -40,74 +40,89 @@ extern struct nft_family_ops nft_family_ops_ipv6;
 extern struct nft_family_ops nft_family_ops_arp;
 extern struct nft_family_ops nft_family_ops_bridge;
 
-void add_meta(struct nft_handle *h, struct nftnl_rule *r, uint32_t key)
+void add_meta(struct nft_handle *h, struct nftnl_rule *r, uint32_t key,
+	      uint8_t *dreg)
 {
 	struct nftnl_expr *expr;
+	uint8_t reg;
 
 	expr = nftnl_expr_alloc("meta");
 	if (expr == NULL)
 		return;
 
+	reg = NFT_REG_1;
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_META_KEY, key);
-	nftnl_expr_set_u32(expr, NFTNL_EXPR_META_DREG, NFT_REG_1);
-
+	nftnl_expr_set_u32(expr, NFTNL_EXPR_META_DREG, reg);
 	nftnl_rule_add_expr(r, expr);
+
+	*dreg = reg;
 }
 
 void add_payload(struct nft_handle *h, struct nftnl_rule *r,
-		 int offset, int len, uint32_t base)
+		 int offset, int len, uint32_t base, uint8_t *dreg)
 {
 	struct nftnl_expr *expr;
+	uint8_t reg;
 
 	expr = nftnl_expr_alloc("payload");
 	if (expr == NULL)
 		return;
 
+	reg = NFT_REG_1;
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_BASE, base);
-	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_DREG, NFT_REG_1);
+	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_DREG, reg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_OFFSET, offset);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_LEN, len);
-
 	nftnl_rule_add_expr(r, expr);
+
+	*dreg = reg;
 }
 
 /* bitwise operation is = sreg & mask ^ xor */
-void add_bitwise_u16(struct nftnl_rule *r, uint16_t mask, uint16_t xor)
+void add_bitwise_u16(struct nft_handle *h, struct nftnl_rule *r,
+		     uint16_t mask, uint16_t xor, uint8_t sreg, uint8_t *dreg)
 {
 	struct nftnl_expr *expr;
+	uint8_t reg;
 
 	expr = nftnl_expr_alloc("bitwise");
 	if (expr == NULL)
 		return;
 
-	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_SREG, NFT_REG_1);
-	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_DREG, NFT_REG_1);
+	reg = NFT_REG_1;
+	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_SREG, sreg);
+	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_DREG, reg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_LEN, sizeof(uint16_t));
 	nftnl_expr_set(expr, NFTNL_EXPR_BITWISE_MASK, &mask, sizeof(uint16_t));
 	nftnl_expr_set(expr, NFTNL_EXPR_BITWISE_XOR, &xor, sizeof(uint16_t));
-
 	nftnl_rule_add_expr(r, expr);
+
+	*dreg = reg;
 }
 
-void add_bitwise(struct nftnl_rule *r, uint8_t *mask, size_t len)
+void add_bitwise(struct nft_handle *h, struct nftnl_rule *r,
+		 uint8_t *mask, size_t len, uint8_t sreg, uint8_t *dreg)
 {
 	struct nftnl_expr *expr;
 	uint32_t xor[4] = { 0 };
+	uint8_t reg = *dreg;
 
 	expr = nftnl_expr_alloc("bitwise");
 	if (expr == NULL)
 		return;
 
-	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_SREG, NFT_REG_1);
-	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_DREG, NFT_REG_1);
+	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_SREG, sreg);
+	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_DREG, reg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_LEN, len);
 	nftnl_expr_set(expr, NFTNL_EXPR_BITWISE_MASK, mask, len);
 	nftnl_expr_set(expr, NFTNL_EXPR_BITWISE_XOR, &xor, len);
-
 	nftnl_rule_add_expr(r, expr);
+
+	*dreg = reg;
 }
 
-void add_cmp_ptr(struct nftnl_rule *r, uint32_t op, void *data, size_t len)
+void add_cmp_ptr(struct nftnl_rule *r, uint32_t op, void *data, size_t len,
+		 uint8_t sreg)
 {
 	struct nftnl_expr *expr;
 
@@ -115,56 +130,59 @@ void add_cmp_ptr(struct nftnl_rule *r, uint32_t op, void *data, size_t len)
 	if (expr == NULL)
 		return;
 
-	nftnl_expr_set_u32(expr, NFTNL_EXPR_CMP_SREG, NFT_REG_1);
+	nftnl_expr_set_u32(expr, NFTNL_EXPR_CMP_SREG, sreg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_CMP_OP, op);
 	nftnl_expr_set(expr, NFTNL_EXPR_CMP_DATA, data, len);
-
 	nftnl_rule_add_expr(r, expr);
 }
 
-void add_cmp_u8(struct nftnl_rule *r, uint8_t val, uint32_t op)
+void add_cmp_u8(struct nftnl_rule *r, uint8_t val, uint32_t op, uint8_t sreg)
 {
-	add_cmp_ptr(r, op, &val, sizeof(val));
+	add_cmp_ptr(r, op, &val, sizeof(val), sreg);
 }
 
-void add_cmp_u16(struct nftnl_rule *r, uint16_t val, uint32_t op)
+void add_cmp_u16(struct nftnl_rule *r, uint16_t val, uint32_t op, uint8_t sreg)
 {
-	add_cmp_ptr(r, op, &val, sizeof(val));
+	add_cmp_ptr(r, op, &val, sizeof(val), sreg);
 }
 
-void add_cmp_u32(struct nftnl_rule *r, uint32_t val, uint32_t op)
+void add_cmp_u32(struct nftnl_rule *r, uint32_t val, uint32_t op, uint8_t sreg)
 {
-	add_cmp_ptr(r, op, &val, sizeof(val));
+	add_cmp_ptr(r, op, &val, sizeof(val), sreg);
 }
 
 void add_iniface(struct nft_handle *h, struct nftnl_rule *r,
 		 char *iface, uint32_t op)
 {
 	int iface_len;
+	uint8_t reg;
 
 	iface_len = strlen(iface);
 
-	add_meta(h, r, NFT_META_IIFNAME);
+	add_meta(h, r, NFT_META_IIFNAME, &reg);
 	if (iface[iface_len - 1] == '+') {
 		if (iface_len > 1)
-			add_cmp_ptr(r, op, iface, iface_len - 1);
-	} else
-		add_cmp_ptr(r, op, iface, iface_len + 1);
+			add_cmp_ptr(r, op, iface, iface_len - 1, reg);
+	} else {
+		add_cmp_ptr(r, op, iface, iface_len + 1, reg);
+	}
 }
 
 void add_outiface(struct nft_handle *h, struct nftnl_rule *r,
 		  char *iface, uint32_t op)
 {
 	int iface_len;
+	uint8_t reg;
 
 	iface_len = strlen(iface);
 
-	add_meta(h, r, NFT_META_OIFNAME);
+	add_meta(h, r, NFT_META_OIFNAME, &reg);
 	if (iface[iface_len - 1] == '+') {
 		if (iface_len > 1)
-			add_cmp_ptr(r, op, iface, iface_len - 1);
-	} else
-		add_cmp_ptr(r, op, iface, iface_len + 1);
+			add_cmp_ptr(r, op, iface, iface_len - 1, reg);
+	} else {
+		add_cmp_ptr(r, op, iface, iface_len + 1, reg);
+	}
 }
 
 void add_addr(struct nft_handle *h, struct nftnl_rule *r,
@@ -173,6 +191,7 @@ void add_addr(struct nft_handle *h, struct nftnl_rule *r,
 {
 	const unsigned char *m = mask;
 	bool bitwise = false;
+	uint8_t reg;
 	int i, j;
 
 	for (i = 0; i < len; i++) {
@@ -187,26 +206,30 @@ void add_addr(struct nft_handle *h, struct nftnl_rule *r,
 	if (!bitwise)
 		len = i;
 
-	add_payload(h, r, offset, len, base);
+	add_payload(h, r, offset, len, base, &reg);
 
 	if (bitwise)
-		add_bitwise(r, mask, len);
+		add_bitwise(h, r, mask, len, reg, &reg);
 
-	add_cmp_ptr(r, op, data, len);
+	add_cmp_ptr(r, op, data, len, reg);
 }
 
 void add_proto(struct nft_handle *h, struct nftnl_rule *r,
 	       int offset, size_t len, uint8_t proto, uint32_t op)
 {
-	add_payload(h, r, offset, len, NFT_PAYLOAD_NETWORK_HEADER);
-	add_cmp_u8(r, proto, op);
+	uint8_t reg;
+
+	add_payload(h, r, offset, len, NFT_PAYLOAD_NETWORK_HEADER, &reg);
+	add_cmp_u8(r, proto, op, reg);
 }
 
 void add_l4proto(struct nft_handle *h, struct nftnl_rule *r,
 		 uint8_t proto, uint32_t op)
 {
-	add_meta(h, r, NFT_META_L4PROTO);
-	add_cmp_u8(r, proto, op);
+	uint8_t reg;
+
+	add_meta(h, r, NFT_META_L4PROTO, &reg);
+	add_cmp_u8(r, proto, op, reg);
 }
 
 bool is_same_interfaces(const char *a_iniface, const char *a_outiface,
diff --git a/iptables/nft-shared.h b/iptables/nft-shared.h
index 0bdb6848d853..b04049047116 100644
--- a/iptables/nft-shared.h
+++ b/iptables/nft-shared.h
@@ -132,14 +132,14 @@ struct nft_family_ops {
 			     int rulenum);
 };
 
-void add_meta(struct nft_handle *h, struct nftnl_rule *r, uint32_t key);
-void add_payload(struct nft_handle *h, struct nftnl_rule *r, int offset, int len, uint32_t base);
-void add_bitwise(struct nftnl_rule *r, uint8_t *mask, size_t len);
-void add_bitwise_u16(struct nftnl_rule *r, uint16_t mask, uint16_t xor);
-void add_cmp_ptr(struct nftnl_rule *r, uint32_t op, void *data, size_t len);
-void add_cmp_u8(struct nftnl_rule *r, uint8_t val, uint32_t op);
-void add_cmp_u16(struct nftnl_rule *r, uint16_t val, uint32_t op);
-void add_cmp_u32(struct nftnl_rule *r, uint32_t val, uint32_t op);
+void add_meta(struct nft_handle *h, struct nftnl_rule *r, uint32_t key, uint8_t *dreg);
+void add_payload(struct nft_handle *h, struct nftnl_rule *r, int offset, int len, uint32_t base, uint8_t *dreg);
+void add_bitwise(struct nft_handle *h, struct nftnl_rule *r, uint8_t *mask, size_t len, uint8_t sreg, uint8_t *dreg);
+void add_bitwise_u16(struct nft_handle *h, struct nftnl_rule *r, uint16_t mask, uint16_t xor, uint8_t sreg, uint8_t *dreg);
+void add_cmp_ptr(struct nftnl_rule *r, uint32_t op, void *data, size_t len, uint8_t sreg);
+void add_cmp_u8(struct nftnl_rule *r, uint8_t val, uint32_t op, uint8_t sreg);
+void add_cmp_u16(struct nftnl_rule *r, uint16_t val, uint32_t op, uint8_t sreg);
+void add_cmp_u32(struct nftnl_rule *r, uint32_t val, uint32_t op, uint8_t sreg);
 void add_iniface(struct nft_handle *h, struct nftnl_rule *r, char *iface, uint32_t op);
 void add_outiface(struct nft_handle *h, struct nftnl_rule *r, char *iface, uint32_t op);
 void add_addr(struct nft_handle *h, struct nftnl_rule *r, enum nft_payload_bases base, int offset,
diff --git a/iptables/nft.c b/iptables/nft.c
index 987b3c957b98..bdfef0244b38 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -1090,16 +1090,22 @@ static struct nftnl_set *add_anon_set(struct nft_handle *h, const char *table,
 }
 
 static struct nftnl_expr *
-gen_payload(uint32_t base, uint32_t offset, uint32_t len, uint32_t dreg)
+gen_payload(struct nft_handle *h, uint32_t base, uint32_t offset, uint32_t len,
+	    uint8_t *dreg)
 {
 	struct nftnl_expr *e = nftnl_expr_alloc("payload");
+	uint8_t reg;
 
 	if (!e)
 		return NULL;
+
+	reg = NFT_REG_1;
 	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_BASE, base);
 	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_OFFSET, offset);
 	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_LEN, len);
-	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_DREG, dreg);
+	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_DREG, reg);
+	*dreg = reg;
+
 	return e;
 }
 
@@ -1144,6 +1150,7 @@ static int __add_nft_among(struct nft_handle *h, const char *table,
 	struct nftnl_expr *e;
 	struct nftnl_set *s;
 	uint32_t flags = 0;
+	uint8_t reg;
 	int idx = 0;
 
 	if (ip) {
@@ -1184,21 +1191,22 @@ static int __add_nft_among(struct nft_handle *h, const char *table,
 		nftnl_set_elem_add(s, elem);
 	}
 
-	e = gen_payload(NFT_PAYLOAD_LL_HEADER,
-			eth_addr_off[dst], ETH_ALEN, NFT_REG_1);
+	e = gen_payload(h, NFT_PAYLOAD_LL_HEADER,
+			eth_addr_off[dst], ETH_ALEN, &reg);
 	if (!e)
 		return -ENOMEM;
 	nftnl_rule_add_expr(r, e);
 
 	if (ip) {
-		e = gen_payload(NFT_PAYLOAD_NETWORK_HEADER, ip_addr_off[dst],
-				sizeof(struct in_addr), NFT_REG32_02);
+		e = gen_payload(h, NFT_PAYLOAD_NETWORK_HEADER, ip_addr_off[dst],
+				sizeof(struct in_addr), &reg);
 		if (!e)
 			return -ENOMEM;
 		nftnl_rule_add_expr(r, e);
 	}
 
-	e = gen_lookup(NFT_REG_1, "__set%d", set_id, inv);
+	reg = NFT_REG_1;
+	e = gen_lookup(reg, "__set%d", set_id, inv);
 	if (!e)
 		return -ENOMEM;
 	nftnl_rule_add_expr(r, e);
@@ -1215,9 +1223,10 @@ static int add_nft_among(struct nft_handle *h,
 	if ((data->src.cnt && data->src.ip) ||
 	    (data->dst.cnt && data->dst.ip)) {
 		uint16_t eth_p_ip = htons(ETH_P_IP);
+		uint8_t reg;
 
-		add_meta(h, r, NFT_META_PROTOCOL);
-		add_cmp_ptr(r, NFT_CMP_EQ, &eth_p_ip, 2);
+		add_meta(h, r, NFT_META_PROTOCOL, &reg);
+		add_cmp_ptr(r, NFT_CMP_EQ, &eth_p_ip, 2, reg);
 	}
 
 	if (data->src.cnt)
@@ -1233,17 +1242,17 @@ static int add_nft_among(struct nft_handle *h,
 static int expr_gen_range_cmp16(struct nftnl_rule *r,
 				uint16_t lo,
 				uint16_t hi,
-				bool invert)
+				bool invert, uint8_t reg)
 {
 	struct nftnl_expr *e;
 
 	if (lo == hi) {
-		add_cmp_u16(r, htons(lo), invert ? NFT_CMP_NEQ : NFT_CMP_EQ);
+		add_cmp_u16(r, htons(lo), invert ? NFT_CMP_NEQ : NFT_CMP_EQ, reg);
 		return 0;
 	}
 
 	if (lo == 0 && hi < 0xffff) {
-		add_cmp_u16(r, htons(hi) , invert ? NFT_CMP_GT : NFT_CMP_LTE);
+		add_cmp_u16(r, htons(hi) , invert ? NFT_CMP_GT : NFT_CMP_LTE, reg);
 		return 0;
 	}
 
@@ -1251,7 +1260,7 @@ static int expr_gen_range_cmp16(struct nftnl_rule *r,
 	if (!e)
 		return -ENOMEM;
 
-	nftnl_expr_set_u32(e, NFTNL_EXPR_RANGE_SREG, NFT_REG_1);
+	nftnl_expr_set_u32(e, NFTNL_EXPR_RANGE_SREG, reg);
 	nftnl_expr_set_u32(e, NFTNL_EXPR_RANGE_OP, invert ? NFT_RANGE_NEQ : NFT_RANGE_EQ);
 
 	lo = htons(lo);
@@ -1269,6 +1278,7 @@ static int add_nft_tcpudp(struct nft_handle *h,struct nftnl_rule *r,
 {
 	struct nftnl_expr *expr;
 	uint8_t op = NFT_CMP_EQ;
+	uint8_t reg;
 	int ret;
 
 	if (src[0] && src[0] == src[1] &&
@@ -1279,36 +1289,33 @@ static int add_nft_tcpudp(struct nft_handle *h,struct nftnl_rule *r,
 		if (invert_src)
 			op = NFT_CMP_NEQ;
 
-		expr = gen_payload(NFT_PAYLOAD_TRANSPORT_HEADER, 0, 4,
-				   NFT_REG_1);
+		expr = gen_payload(h, NFT_PAYLOAD_TRANSPORT_HEADER, 0, 4, &reg);
 		if (!expr)
 			return -ENOMEM;
 
 		nftnl_rule_add_expr(r, expr);
-		add_cmp_u32(r, htonl(combined), op);
+		add_cmp_u32(r, htonl(combined), op, reg);
 		return 0;
 	}
 
 	if (src[0] || src[1] < 0xffff) {
-		expr = gen_payload(NFT_PAYLOAD_TRANSPORT_HEADER,
-				   0, 2, NFT_REG_1);
+		expr = gen_payload(h, NFT_PAYLOAD_TRANSPORT_HEADER, 0, 2, &reg);
 		if (!expr)
 			return -ENOMEM;
 
 		nftnl_rule_add_expr(r, expr);
-		ret = expr_gen_range_cmp16(r, src[0], src[1], invert_src);
+		ret = expr_gen_range_cmp16(r, src[0], src[1], invert_src, reg);
 		if (ret)
 			return ret;
 	}
 
 	if (dst[0] || dst[1] < 0xffff) {
-		expr = gen_payload(NFT_PAYLOAD_TRANSPORT_HEADER,
-				   2, 2, NFT_REG_1);
+		expr = gen_payload(h, NFT_PAYLOAD_TRANSPORT_HEADER, 2, 2, &reg);
 		if (!expr)
 			return -ENOMEM;
 
 		nftnl_rule_add_expr(r, expr);
-		ret = expr_gen_range_cmp16(r, dst[0], dst[1], invert_dst);
+		ret = expr_gen_range_cmp16(r, dst[0], dst[1], invert_dst, reg);
 		if (ret)
 			return ret;
 	}
@@ -1349,22 +1356,22 @@ static int add_nft_udp(struct nft_handle *h, struct nftnl_rule *r,
 			      udp->dpts, udp->invflags & XT_UDP_INV_DSTPT);
 }
 
-static int add_nft_tcpflags(struct nftnl_rule *r,
+static int add_nft_tcpflags(struct nft_handle *h, struct nftnl_rule *r,
 			    uint8_t cmp, uint8_t mask,
 			    bool invert)
 {
 	struct nftnl_expr *e;
+	uint8_t reg;
 
-	e = gen_payload(NFT_PAYLOAD_TRANSPORT_HEADER,
-			13, 1, NFT_REG_1);
+	e = gen_payload(h, NFT_PAYLOAD_TRANSPORT_HEADER, 13, 1, &reg);
 
 	if (!e)
 		return -ENOMEM;
 
 	nftnl_rule_add_expr(r, e);
 
-	add_bitwise(r, &mask, 1);
-	add_cmp_u8(r, cmp, invert ? NFT_CMP_NEQ : NFT_CMP_EQ);
+	add_bitwise(h, r, &mask, 1, reg, &reg);
+	add_cmp_u8(r, cmp, invert ? NFT_CMP_NEQ : NFT_CMP_EQ, reg);
 
 	return 0;
 }
@@ -1396,7 +1403,7 @@ static int add_nft_tcp(struct nft_handle *h, struct nftnl_rule *r,
 	}
 
 	if (tcp->flg_mask) {
-		int ret = add_nft_tcpflags(r, tcp->flg_cmp, tcp->flg_mask,
+		int ret = add_nft_tcpflags(h, r, tcp->flg_cmp, tcp->flg_mask,
 					   tcp->invflags & XT_TCP_INV_FLAGS);
 
 		if (ret < 0)
@@ -1411,18 +1418,19 @@ static int add_nft_mark(struct nft_handle *h, struct nftnl_rule *r,
 			struct xt_entry_match *m)
 {
 	struct xt_mark_mtinfo1 *mark = (void *)m->data;
+	uint8_t reg;
 	int op;
 
-	add_meta(h, r, NFT_META_MARK);
+	add_meta(h, r, NFT_META_MARK, &reg);
 	if (mark->mask != 0xffffffff)
-		add_bitwise(r, (uint8_t *)&mark->mask, sizeof(uint32_t));
+		add_bitwise(h, r, (uint8_t *)&mark->mask, sizeof(uint32_t), reg, &reg);
 
 	if (mark->invert)
 		op = NFT_CMP_NEQ;
 	else
 		op = NFT_CMP_EQ;
 
-	add_cmp_u32(r, mark->mark, op);
+	add_cmp_u32(r, mark->mark, op, reg);
 
 	return 0;
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH iptables 6/7] nft: split gen_payload() to allocate register and initialize expression
  2022-04-24 21:56 [PATCH iptables 0/7] support for dynamic register allocation Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2022-04-24 21:56 ` [PATCH iptables 5/7] nft: prepare for dynamic register allocation Pablo Neira Ayuso
@ 2022-04-24 21:56 ` Pablo Neira Ayuso
  2022-04-24 21:56 ` [PATCH iptables 7/7] nft: support for dynamic register allocation Pablo Neira Ayuso
  6 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-24 21:56 UTC (permalink / raw)
  To: netfilter-devel

Add __gen_payload(), in preparation for the dynamic register allocation.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/iptables/nft.c b/iptables/nft.c
index bdfef0244b38..07653ee1a3d6 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -1090,20 +1090,30 @@ static struct nftnl_set *add_anon_set(struct nft_handle *h, const char *table,
 }
 
 static struct nftnl_expr *
-gen_payload(struct nft_handle *h, uint32_t base, uint32_t offset, uint32_t len,
-	    uint8_t *dreg)
+__gen_payload(uint32_t base, uint32_t offset, uint32_t len, uint8_t reg)
 {
 	struct nftnl_expr *e = nftnl_expr_alloc("payload");
-	uint8_t reg;
 
 	if (!e)
 		return NULL;
 
-	reg = NFT_REG_1;
 	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_BASE, base);
 	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_OFFSET, offset);
 	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_LEN, len);
 	nftnl_expr_set_u32(e, NFTNL_EXPR_PAYLOAD_DREG, reg);
+
+	return e;
+}
+
+static struct nftnl_expr *
+gen_payload(struct nft_handle *h, uint32_t base, uint32_t offset, uint32_t len,
+	    uint8_t *dreg)
+{
+	struct nftnl_expr *e;
+	uint8_t reg;
+
+	reg = NFT_REG_1;
+	e = __gen_payload(base, offset, len, reg);
 	*dreg = reg;
 
 	return e;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH iptables 7/7] nft: support for dynamic register allocation
  2022-04-24 21:56 [PATCH iptables 0/7] support for dynamic register allocation Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2022-04-24 21:56 ` [PATCH iptables 6/7] nft: split gen_payload() to allocate register and initialize expression Pablo Neira Ayuso
@ 2022-04-24 21:56 ` Pablo Neira Ayuso
  2022-04-26 16:06   ` Phil Sutter
  6 siblings, 1 reply; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-24 21:56 UTC (permalink / raw)
  To: netfilter-devel

Starting Linux kernel 5.18-rc, operations on registers that already
contain the expected data are turned into noop.

Track operation on registers to use the same register through
payload_get_register() and meta_get_register(). This patch introduces an
LRU eviction strategy when all the registers are in used.

get_register() is used to allocate a register as scratchpad area: no
tracking is performed in this case. This is used for concatenations,
eg. ebt_among.

Using samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh

Benchmark #1: match on IPv6 address list

 *raw
 :PREROUTING DROP [9:2781]
 :OUTPUT ACCEPT [0:0]
 -A PREROUTING -d aaaa::bbbb -j DROP
 [... 98 times same rule above to trigger mismatch ...]
 -A PREROUTING -d fd00::1 -j DROP			# matching rule

 iptables-legacy	798Mb
 iptables-nft		801Mb (+0.37)

Benchmark #2: match on layer 4 protocol + port list

 *raw
 :PREROUTING DROP [9:2781]
 :OUTPUT ACCEPT [0:0]
 -A PREROUTING -p tcp --dport 23 -j DROP
 [... 98 times same rule above to trigger mismatch ...]
 -A PREROUTING -p udp --dport 9 -j DROP 		# matching rule

 iptables-legacy	648Mb
 iptables-nft		892Mb (+37.65%)

Benchmark #3: match on mark

 *raw
 :PREROUTING DROP [9:2781]
 :OUTPUT ACCEPT [0:0]
 -A PREROUTING -m mark --mark 100 -j DROP
 [... 98 times same rule above to trigger mismatch ...]
 -A PREROUTING -d 198.18.0.42/32 -j DROP		# matching rule

 iptables-legacy	255Mb
 iptables-nft		865Mb (+239.21%)

In these cases, iptables-nft generates netlink bytecode which uses the
native expressions, ie. payload + cmp and meta + cmp.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/Makefile.am                          |   2 +-
 iptables/nft-regs.c                           | 191 ++++++++++++++++++
 iptables/nft-regs.h                           |   9 +
 iptables/nft-shared.c                         |  10 +-
 iptables/nft.c                                |  20 +-
 iptables/nft.h                                |  25 +++
 .../nft-only/0009-needless-bitwise_0          | 180 ++++++++---------
 7 files changed, 335 insertions(+), 102 deletions(-)
 create mode 100644 iptables/nft-regs.c
 create mode 100644 iptables/nft-regs.h

diff --git a/iptables/Makefile.am b/iptables/Makefile.am
index 0258264c4c70..2f8fe107cb05 100644
--- a/iptables/Makefile.am
+++ b/iptables/Makefile.am
@@ -35,7 +35,7 @@ endif
 xtables_nft_multi_CFLAGS  += -DENABLE_NFTABLES -DENABLE_IPV4 -DENABLE_IPV6
 xtables_nft_multi_SOURCES += xtables-save.c xtables-restore.c \
 				xtables-standalone.c xtables.c nft.c \
-				nft-shared.c nft-ipv4.c nft-ipv6.c nft-arp.c \
+				nft-shared.c nft-regs.c nft-ipv4.c nft-ipv6.c nft-arp.c \
 				xtables-monitor.c nft-cache.c \
 				xtables-arp.c \
 				nft-bridge.c nft-cmd.c nft-chain.c \
diff --git a/iptables/nft-regs.c b/iptables/nft-regs.c
new file mode 100644
index 000000000000..bfc762e4186f
--- /dev/null
+++ b/iptables/nft-regs.c
@@ -0,0 +1,191 @@
+/*
+ * (C) 2012-2022 by Pablo Neira Ayuso <pablo@netfilter.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+/* Funded through the NGI0 PET Fund established by NLnet (https://nlnet.nl)
+ * with support from the European Commission's Next Generation Internet
+ * programme.
+ */
+
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <errno.h>
+#include <assert.h>
+
+#include "nft.h"
+#include "nft-regs.h"
+
+static uint64_t reg_genid(struct nft_handle *h)
+{
+	return ++h->reg_genid;
+}
+
+struct nft_reg_ctx {
+	uint64_t	genid;
+	int		reg;
+	int		evict;
+};
+
+static int reg_space(int i)
+{
+	return sizeof(uint32_t) * 16 - sizeof(uint32_t) * i;
+}
+
+static void register_track(const struct nft_handle *h,
+			   struct nft_reg_ctx *ctx, int i, int len)
+{
+	if (ctx->reg >= 0 || h->regs[i].word || reg_space(i) < len)
+		return;
+
+	if (h->regs[i].type == NFT_REG_UNSPEC) {
+		ctx->genid = h->reg_genid;
+		ctx->reg = i;
+	} else if (h->regs[i].genid < ctx->genid) {
+		ctx->genid = h->regs[i].genid;
+		ctx->evict = i;
+	} else if (h->regs[i].len == len) {
+		ctx->evict = i;
+		ctx->genid = 0;
+	}
+}
+
+static void register_evict(struct nft_reg_ctx *ctx, int i)
+{
+	if (ctx->reg < 0) {
+		assert(ctx->evict >= 0);
+		ctx->reg = ctx->evict;
+	}
+}
+
+static void __register_update(struct nft_handle *h, uint8_t reg,
+			      int type, uint32_t len, uint8_t word,
+			      uint64_t genid)
+{
+	h->regs[reg].type = type;
+	h->regs[reg].genid = genid;
+	h->regs[reg].len = len;
+	h->regs[reg].word = word;
+}
+
+static void register_cancel(struct nft_handle *h, struct nft_reg_ctx *ctx)
+{
+	int plen = h->regs[ctx->reg].len, i;
+
+	for (i = ctx->reg; plen > 0; i++, plen -= sizeof(uint32_t)) {
+		h->regs[i].type = NFT_REG_UNSPEC;
+		h->regs[i].word = 0;
+	}
+
+	while (h->regs[i].word != 0) {
+		h->regs[i].type = NFT_REG_UNSPEC;
+		h->regs[i].word = 0;
+		i++;
+	}
+}
+
+static void register_update(struct nft_handle *h, struct nft_reg_ctx *ctx,
+			    int type, uint32_t len, uint64_t genid)
+{
+	register_cancel(h, ctx);
+	__register_update(h, ctx->reg, type, len, 0, genid);
+}
+
+uint8_t meta_get_register(struct nft_handle *h, enum nft_meta_keys key)
+{
+	struct nft_reg_ctx ctx = {
+		.reg	= -1,
+		.evict	= -1,
+		.genid	= UINT64_MAX,
+	};
+	uint64_t genid;
+	int i;
+
+	for (i = 0; i < 16; i++) {
+		register_track(h, &ctx, i, sizeof(uint32_t));
+
+		if (h->regs[i].type != NFT_REG_META)
+			continue;
+
+		if (h->regs[i].meta.key == key &&
+		    h->regs[i].len == sizeof(uint32_t)) {
+			h->regs[i].genid = reg_genid(h);
+			return i + NFT_REG32_00;
+		}
+	}
+
+	register_evict(&ctx, i);
+
+	genid = reg_genid(h);
+	register_update(h, &ctx, NFT_REG_META, sizeof(uint32_t), genid);
+	h->regs[ctx.reg].meta.key = key;
+
+	return ctx.reg + NFT_REG32_00;
+}
+
+uint8_t payload_get_register(struct nft_handle *h, enum nft_payload_bases base,
+			    uint32_t offset, uint32_t len)
+{
+	struct nft_reg_ctx ctx = {
+		.reg	= -1,
+		.evict	= -1,
+		.genid	= UINT64_MAX,
+	};
+	int i, j, plen = len;
+	uint64_t genid;
+
+	for (i = 0; i < 16; i++) {
+		register_track(h, &ctx, i, len);
+
+		if (h->regs[i].type != NFT_REG_PAYLOAD)
+			continue;
+
+		if (h->regs[i].payload.base == base &&
+		    h->regs[i].payload.offset == offset &&
+		    h->regs[i].len >= plen) {
+			h->regs[i].genid = reg_genid(h);
+			return i + NFT_REG32_00;
+		}
+	}
+
+	register_evict(&ctx, i);
+
+	genid = reg_genid(h);
+	register_update(h, &ctx, NFT_REG_PAYLOAD, len, genid);
+	h->regs[ctx.reg].payload.base = base;
+	h->regs[ctx.reg].payload.offset = offset;
+
+	plen -= sizeof(uint32_t);
+	j = 1;
+	for (i = ctx.reg + 1; plen > 0; i++, plen -= sizeof(uint32_t)) {
+		__register_update(h, i, NFT_REG_PAYLOAD, len, j++, genid);
+		h->regs[i].payload.base = base;
+		h->regs[i].payload.offset = offset;
+	}
+
+	return ctx.reg + NFT_REG32_00;
+}
+
+uint8_t get_register(struct nft_handle *h, uint8_t len)
+{
+	struct nft_reg_ctx ctx = {
+		.reg	= -1,
+		.evict	= -1,
+		.genid	= UINT64_MAX,
+	};
+	int i;
+
+	for (i = 0; i < 16; i++)
+		register_track(h, &ctx, i, len);
+
+	register_evict(&ctx, i);
+	register_cancel(h, &ctx);
+
+	return ctx.reg + NFT_REG32_00;
+}
diff --git a/iptables/nft-regs.h b/iptables/nft-regs.h
new file mode 100644
index 000000000000..3953eae9f217
--- /dev/null
+++ b/iptables/nft-regs.h
@@ -0,0 +1,9 @@
+#ifndef _NFT_REGS_H_
+#define _NFT_REGS_H_
+
+uint8_t payload_get_register(struct nft_handle *h, enum nft_payload_bases base,
+			     uint32_t offset, uint32_t len);
+uint8_t meta_get_register(struct nft_handle *h, enum nft_meta_keys key);
+uint8_t get_register(struct nft_handle *h, uint8_t len);
+
+#endif /* _NFT_REGS_H_ */
diff --git a/iptables/nft-shared.c b/iptables/nft-shared.c
index 27e95c1ae4f3..ad5361942093 100644
--- a/iptables/nft-shared.c
+++ b/iptables/nft-shared.c
@@ -32,6 +32,7 @@
 
 #include "nft-shared.h"
 #include "nft-bridge.h"
+#include "nft-regs.h"
 #include "xshared.h"
 #include "nft.h"
 
@@ -50,7 +51,7 @@ void add_meta(struct nft_handle *h, struct nftnl_rule *r, uint32_t key,
 	if (expr == NULL)
 		return;
 
-	reg = NFT_REG_1;
+	reg = meta_get_register(h, key);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_META_KEY, key);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_META_DREG, reg);
 	nftnl_rule_add_expr(r, expr);
@@ -68,7 +69,7 @@ void add_payload(struct nft_handle *h, struct nftnl_rule *r,
 	if (expr == NULL)
 		return;
 
-	reg = NFT_REG_1;
+	reg = payload_get_register(h, base, offset, len);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_BASE, base);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_DREG, reg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_OFFSET, offset);
@@ -89,7 +90,7 @@ void add_bitwise_u16(struct nft_handle *h, struct nftnl_rule *r,
 	if (expr == NULL)
 		return;
 
-	reg = NFT_REG_1;
+	reg = get_register(h, sizeof(uint32_t));
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_SREG, sreg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_DREG, reg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_LEN, sizeof(uint16_t));
@@ -105,12 +106,13 @@ void add_bitwise(struct nft_handle *h, struct nftnl_rule *r,
 {
 	struct nftnl_expr *expr;
 	uint32_t xor[4] = { 0 };
-	uint8_t reg = *dreg;
+	uint8_t reg;
 
 	expr = nftnl_expr_alloc("bitwise");
 	if (expr == NULL)
 		return;
 
+	reg = get_register(h, len);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_SREG, sreg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_DREG, reg);
 	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_LEN, len);
diff --git a/iptables/nft.c b/iptables/nft.c
index 07653ee1a3d6..48330f285703 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -56,6 +56,7 @@
 #include <arpa/inet.h>
 
 #include "nft.h"
+#include "nft-regs.h"
 #include "xshared.h" /* proto_to_name */
 #include "nft-cache.h"
 #include "nft-shared.h"
@@ -1112,7 +1113,7 @@ gen_payload(struct nft_handle *h, uint32_t base, uint32_t offset, uint32_t len,
 	struct nftnl_expr *e;
 	uint8_t reg;
 
-	reg = NFT_REG_1;
+	reg = payload_get_register(h, base, offset, len);
 	e = __gen_payload(base, offset, len, reg);
 	*dreg = reg;
 
@@ -1157,10 +1158,10 @@ static int __add_nft_among(struct nft_handle *h, const char *table,
 		offsetof(struct iphdr, saddr),
 		offsetof(struct iphdr, daddr)
 	};
+	uint8_t reg, concat_len;
 	struct nftnl_expr *e;
 	struct nftnl_set *s;
 	uint32_t flags = 0;
-	uint8_t reg;
 	int idx = 0;
 
 	if (ip) {
@@ -1201,21 +1202,26 @@ static int __add_nft_among(struct nft_handle *h, const char *table,
 		nftnl_set_elem_add(s, elem);
 	}
 
-	e = gen_payload(h, NFT_PAYLOAD_LL_HEADER,
-			eth_addr_off[dst], ETH_ALEN, &reg);
+	concat_len = ETH_ALEN;
+	if (ip)
+		concat_len += sizeof(struct in_addr);
+
+	reg = get_register(h, concat_len);
+	e = __gen_payload(NFT_PAYLOAD_LL_HEADER,
+			  eth_addr_off[dst], ETH_ALEN, reg);
 	if (!e)
 		return -ENOMEM;
 	nftnl_rule_add_expr(r, e);
 
 	if (ip) {
-		e = gen_payload(h, NFT_PAYLOAD_NETWORK_HEADER, ip_addr_off[dst],
-				sizeof(struct in_addr), &reg);
+		e = __gen_payload(NFT_PAYLOAD_NETWORK_HEADER, ip_addr_off[dst],
+				  sizeof(struct in_addr), reg + 2);
 		if (!e)
 			return -ENOMEM;
 		nftnl_rule_add_expr(r, e);
 	}
 
-	reg = NFT_REG_1;
+	reg = get_register(h, sizeof(uint32_t));
 	e = gen_lookup(reg, "__set%d", set_id, inv);
 	if (!e)
 		return -ENOMEM;
diff --git a/iptables/nft.h b/iptables/nft.h
index 68b0910c8e18..3dc907b188ce 100644
--- a/iptables/nft.h
+++ b/iptables/nft.h
@@ -85,6 +85,28 @@ struct nft_cache_req {
 	struct list_head	chain_list;
 };
 
+enum nft_reg_type {
+	NFT_REG_UNSPEC	= 0,
+	NFT_REG_PAYLOAD,
+	NFT_REG_META,
+};
+
+struct nft_regs {
+	enum nft_reg_type			type;
+	uint32_t				len;
+	uint64_t				genid;
+	uint8_t					word;
+	union {
+		struct {
+			enum nft_meta_keys	key;
+		} meta;
+		struct {
+			enum nft_payload_bases	base;
+			uint32_t		offset;
+		} payload;
+	};
+};
+
 struct nft_handle {
 	int			family;
 	struct mnl_socket	*nl;
@@ -111,6 +133,9 @@ struct nft_handle {
 	bool			cache_init;
 	int			verbose;
 
+	struct nft_regs		regs[20];
+	uint64_t		reg_genid;
+
 	/* meta data, for error reporting */
 	struct {
 		unsigned int	lineno;
diff --git a/iptables/tests/shell/testcases/nft-only/0009-needless-bitwise_0 b/iptables/tests/shell/testcases/nft-only/0009-needless-bitwise_0
index 41588a10863e..34987b239d35 100755
--- a/iptables/tests/shell/testcases/nft-only/0009-needless-bitwise_0
+++ b/iptables/tests/shell/testcases/nft-only/0009-needless-bitwise_0
@@ -64,8 +64,8 @@ ip filter OUTPUT 5 4
 
 ip filter OUTPUT 6 5
   [ payload load 4b @ network header + 16 => reg 1 ]
-  [ bitwise reg 1 = ( reg 1 & 0xfcffffff ) ^ 0x00000000 ]
-  [ cmp eq reg 1 0x0002010a ]
+  [ bitwise reg 9 = ( reg 1 & 0xfcffffff ) ^ 0x00000000 ]
+  [ cmp eq reg 9 0x0002010a ]
   [ counter pkts 0 bytes 0 ]
 
 ip filter OUTPUT 7 6
@@ -98,8 +98,8 @@ ip6 filter OUTPUT 5 4
 
 ip6 filter OUTPUT 6 5
   [ payload load 16b @ network header + 24 => reg 1 ]
-  [ bitwise reg 1 = ( reg 1 & 0xffffffff 0xffffffff 0xffffffff 0xf0ffffff ) ^ 0x00000000 0x00000000 0x00000000 0x00000000 ]
-  [ cmp eq reg 1 0xffc0edfe 0x020100ee 0x06050403 0x00090807 ]
+  [ bitwise reg 2 = ( reg 1 & 0xffffffff 0xffffffff 0xffffffff 0xf0ffffff ) ^ 0x00000000 0x00000000 0x00000000 0x00000000 ]
+  [ cmp eq reg 2 0xffc0edfe 0x020100ee 0x06050403 0x00090807 ]
   [ counter pkts 0 bytes 0 ]
 
 ip6 filter OUTPUT 7 6
@@ -148,155 +148,155 @@ ip6 filter OUTPUT 15 14
 arp filter OUTPUT 3
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 4b @ network header + 24 => reg 1 ]
-  [ cmp eq reg 1 0x0302010a ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 4b @ network header + 24 => reg 11 ]
+  [ cmp eq reg 11 0x0302010a ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 4 3
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 4b @ network header + 24 => reg 1 ]
-  [ cmp eq reg 1 0x0302010a ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 4b @ network header + 24 => reg 11 ]
+  [ cmp eq reg 11 0x0302010a ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 5 4
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 4b @ network header + 24 => reg 1 ]
-  [ bitwise reg 1 = ( reg 1 & 0xfcffffff ) ^ 0x00000000 ]
-  [ cmp eq reg 1 0x0002010a ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 4b @ network header + 24 => reg 11 ]
+  [ bitwise reg 2 = ( reg 11 & 0xfcffffff ) ^ 0x00000000 ]
+  [ cmp eq reg 2 0x0002010a ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 6 5
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 3b @ network header + 24 => reg 1 ]
-  [ cmp eq reg 1 0x0002010a ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 3b @ network header + 24 => reg 11 ]
+  [ cmp eq reg 11 0x0002010a ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 7 6
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 2b @ network header + 24 => reg 1 ]
-  [ cmp eq reg 1 0x0000010a ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 2b @ network header + 24 => reg 11 ]
+  [ cmp eq reg 11 0x0000010a ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 8 7
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 1b @ network header + 24 => reg 1 ]
-  [ cmp eq reg 1 0x0000000a ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 1b @ network header + 24 => reg 11 ]
+  [ cmp eq reg 11 0x0000000a ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 9 8
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 10 9
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 6b @ network header + 18 => reg 1 ]
-  [ cmp eq reg 1 0xc000edfe 0x0000eeff ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 6b @ network header + 18 => reg 2 ]
+  [ cmp eq reg 2 0xc000edfe 0x0000eeff ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 11 10
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 6b @ network header + 18 => reg 1 ]
-  [ bitwise reg 1 = ( reg 1 & 0xffffffff 0x0000f0ff ) ^ 0x00000000 0x00000000 ]
-  [ cmp eq reg 1 0xc000edfe 0x0000e0ff ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 6b @ network header + 18 => reg 2 ]
+  [ bitwise reg 14 = ( reg 2 & 0xffffffff 0x0000f0ff ) ^ 0x00000000 0x00000000 ]
+  [ cmp eq reg 14 0xc000edfe 0x0000e0ff ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 12 11
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 5b @ network header + 18 => reg 1 ]
-  [ cmp eq reg 1 0xc000edfe 0x000000ff ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 5b @ network header + 18 => reg 2 ]
+  [ cmp eq reg 2 0xc000edfe 0x000000ff ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 13 12
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 4b @ network header + 18 => reg 1 ]
-  [ cmp eq reg 1 0xc000edfe ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 4b @ network header + 18 => reg 2 ]
+  [ cmp eq reg 2 0xc000edfe ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 14 13
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 3b @ network header + 18 => reg 1 ]
-  [ cmp eq reg 1 0x0000edfe ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 3b @ network header + 18 => reg 2 ]
+  [ cmp eq reg 2 0x0000edfe ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 15 14
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 2b @ network header + 18 => reg 1 ]
-  [ cmp eq reg 1 0x0000edfe ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 2b @ network header + 18 => reg 2 ]
+  [ cmp eq reg 2 0x0000edfe ]
   [ counter pkts 0 bytes 0 ]
 
 arp filter OUTPUT 16 15
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ cmp eq reg 1 0x00000100 ]
-  [ payload load 1b @ network header + 4 => reg 1 ]
-  [ cmp eq reg 1 0x00000006 ]
-  [ payload load 1b @ network header + 5 => reg 1 ]
-  [ cmp eq reg 1 0x00000004 ]
-  [ payload load 1b @ network header + 18 => reg 1 ]
-  [ cmp eq reg 1 0x000000fe ]
+  [ payload load 1b @ network header + 4 => reg 9 ]
+  [ cmp eq reg 9 0x00000006 ]
+  [ payload load 1b @ network header + 5 => reg 10 ]
+  [ cmp eq reg 10 0x00000004 ]
+  [ payload load 1b @ network header + 18 => reg 2 ]
+  [ cmp eq reg 2 0x000000fe ]
   [ counter pkts 0 bytes 0 ]
 
 bridge filter OUTPUT 4
@@ -306,8 +306,8 @@ bridge filter OUTPUT 4
 
 bridge filter OUTPUT 5 4
   [ payload load 6b @ link header + 0 => reg 1 ]
-  [ bitwise reg 1 = ( reg 1 & 0xffffffff 0x0000f0ff ) ^ 0x00000000 0x00000000 ]
-  [ cmp eq reg 1 0xc000edfe 0x0000e0ff ]
+  [ bitwise reg 10 = ( reg 1 & 0xffffffff 0x0000f0ff ) ^ 0x00000000 0x00000000 ]
+  [ cmp eq reg 10 0xc000edfe 0x0000e0ff ]
   [ counter pkts 0 bytes 0 ]
 
 bridge filter OUTPUT 6 5
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH iptables 7/7] nft: support for dynamic register allocation
  2022-04-24 21:56 ` [PATCH iptables 7/7] nft: support for dynamic register allocation Pablo Neira Ayuso
@ 2022-04-26 16:06   ` Phil Sutter
  2022-04-26 19:32     ` Pablo Neira Ayuso
  2022-04-27 16:23     ` Pablo Neira Ayuso
  0 siblings, 2 replies; 12+ messages in thread
From: Phil Sutter @ 2022-04-26 16:06 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

Hi Pablo,

On Sun, Apr 24, 2022 at 11:56:13PM +0200, Pablo Neira Ayuso wrote:
> Starting Linux kernel 5.18-rc, operations on registers that already
> contain the expected data are turned into noop.
> 
> Track operation on registers to use the same register through
> payload_get_register() and meta_get_register(). This patch introduces an
> LRU eviction strategy when all the registers are in used.
> 
> get_register() is used to allocate a register as scratchpad area: no
> tracking is performed in this case. This is used for concatenations,
> eg. ebt_among.
> 
> Using samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
> 
> Benchmark #1: match on IPv6 address list
> 
>  *raw
>  :PREROUTING DROP [9:2781]
>  :OUTPUT ACCEPT [0:0]
>  -A PREROUTING -d aaaa::bbbb -j DROP
>  [... 98 times same rule above to trigger mismatch ...]
>  -A PREROUTING -d fd00::1 -j DROP			# matching rule
> 
>  iptables-legacy	798Mb
>  iptables-nft		801Mb (+0.37)
> 
> Benchmark #2: match on layer 4 protocol + port list
> 
>  *raw
>  :PREROUTING DROP [9:2781]
>  :OUTPUT ACCEPT [0:0]
>  -A PREROUTING -p tcp --dport 23 -j DROP
>  [... 98 times same rule above to trigger mismatch ...]
>  -A PREROUTING -p udp --dport 9 -j DROP 		# matching rule
> 
>  iptables-legacy	648Mb
>  iptables-nft		892Mb (+37.65%)
> 
> Benchmark #3: match on mark
> 
>  *raw
>  :PREROUTING DROP [9:2781]
>  :OUTPUT ACCEPT [0:0]
>  -A PREROUTING -m mark --mark 100 -j DROP
>  [... 98 times same rule above to trigger mismatch ...]
>  -A PREROUTING -d 198.18.0.42/32 -j DROP		# matching rule
> 
>  iptables-legacy	255Mb
>  iptables-nft		865Mb (+239.21%)

Great results, but obviously biased test cases. Did you measure a more
"realistic" ruleset?

> In these cases, iptables-nft generates netlink bytecode which uses the
> native expressions, ie. payload + cmp and meta + cmp.

Sounds like a real point for further conversion into native nftables
expressions where possible.

[...]
> diff --git a/iptables/nft-regs.c b/iptables/nft-regs.c
> new file mode 100644
> index 000000000000..bfc762e4186f
> --- /dev/null
> +++ b/iptables/nft-regs.c
> @@ -0,0 +1,191 @@
> +/*
> + * (C) 2012-2022 by Pablo Neira Ayuso <pablo@netfilter.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +/* Funded through the NGI0 PET Fund established by NLnet (https://nlnet.nl)
> + * with support from the European Commission's Next Generation Internet
> + * programme.
> + */
> +
> +#include <string.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdbool.h>
> +#include <errno.h>
> +#include <assert.h>
> +
> +#include "nft.h"
> +#include "nft-regs.h"
> +
> +static uint64_t reg_genid(struct nft_handle *h)
> +{
> +	return ++h->reg_genid;
> +}
> +
> +struct nft_reg_ctx {
> +	uint64_t	genid;
> +	int		reg;
> +	int		evict;
> +};

Move this struct declaration above the first function?

> +
> +static int reg_space(int i)
> +{
> +	return sizeof(uint32_t) * 16 - sizeof(uint32_t) * i;
> +}
> +
> +static void register_track(const struct nft_handle *h,
> +			   struct nft_reg_ctx *ctx, int i, int len)
> +{
> +	if (ctx->reg >= 0 || h->regs[i].word || reg_space(i) < len)
> +		return;

Since ctx->reg is not reset in callers' loops and reg_space(i) is
monotonic, maybe make those loop exit conditions by returning false from
register_track() in those cases?

> +	if (h->regs[i].type == NFT_REG_UNSPEC) {
> +		ctx->genid = h->reg_genid;

Is ctx->genid used in this case?

> +		ctx->reg = i;
> +	} else if (h->regs[i].genid < ctx->genid) {
> +		ctx->genid = h->regs[i].genid;
> +		ctx->evict = i;

What if the oldest reg is too small?

> +	} else if (h->regs[i].len == len) {
> +		ctx->evict = i;
> +		ctx->genid = 0;

Why prefer regs of same size over older ones?

> +	}
> +}
> +
> +static void register_evict(struct nft_reg_ctx *ctx, int i)

Unused parameter i.

> +{
> +	if (ctx->reg < 0) {
> +		assert(ctx->evict >= 0);
> +		ctx->reg = ctx->evict;
> +	}
> +}
> +
> +static void __register_update(struct nft_handle *h, uint8_t reg,
> +			      int type, uint32_t len, uint8_t word,
> +			      uint64_t genid)
> +{
> +	h->regs[reg].type = type;
> +	h->regs[reg].genid = genid;
> +	h->regs[reg].len = len;
> +	h->regs[reg].word = word;
> +}
> +
> +static void register_cancel(struct nft_handle *h, struct nft_reg_ctx *ctx)
> +{
> +	int plen = h->regs[ctx->reg].len, i;
> +
> +	for (i = ctx->reg; plen > 0; i++, plen -= sizeof(uint32_t)) {
> +		h->regs[i].type = NFT_REG_UNSPEC;
> +		h->regs[i].word = 0;
> +	}
> +
> +	while (h->regs[i].word != 0) {
> +		h->regs[i].type = NFT_REG_UNSPEC;
> +		h->regs[i].word = 0;
> +		i++;
> +	}
> +}
> +
> +static void register_update(struct nft_handle *h, struct nft_reg_ctx *ctx,
> +			    int type, uint32_t len, uint64_t genid)
> +{
> +	register_cancel(h, ctx);
> +	__register_update(h, ctx->reg, type, len, 0, genid);
> +}
> +
> +uint8_t meta_get_register(struct nft_handle *h, enum nft_meta_keys key)

Accept a uint32_t len parameter and replace all the sizeof(uint32_t)
below with it? Not needed but consistent with payload_get_register() and
less hard-coded values.

> +{
> +	struct nft_reg_ctx ctx = {
> +		.reg	= -1,
> +		.evict	= -1,
> +		.genid	= UINT64_MAX,
> +	};
> +	uint64_t genid;
> +	int i;
> +
> +	for (i = 0; i < 16; i++) {
> +		register_track(h, &ctx, i, sizeof(uint32_t));
> +
> +		if (h->regs[i].type != NFT_REG_META)
> +			continue;
> +
> +		if (h->regs[i].meta.key == key &&
> +		    h->regs[i].len == sizeof(uint32_t)) {
> +			h->regs[i].genid = reg_genid(h);
> +			return i + NFT_REG32_00;
> +		}
> +	}
> +
> +	register_evict(&ctx, i);
> +
> +	genid = reg_genid(h);
> +	register_update(h, &ctx, NFT_REG_META, sizeof(uint32_t), genid);
> +	h->regs[ctx.reg].meta.key = key;
> +
> +	return ctx.reg + NFT_REG32_00;
> +}
> +
> +uint8_t payload_get_register(struct nft_handle *h, enum nft_payload_bases base,
> +			    uint32_t offset, uint32_t len)
> +{
> +	struct nft_reg_ctx ctx = {
> +		.reg	= -1,
> +		.evict	= -1,
> +		.genid	= UINT64_MAX,
> +	};
> +	int i, j, plen = len;
> +	uint64_t genid;
> +
> +	for (i = 0; i < 16; i++) {
> +		register_track(h, &ctx, i, len);
> +
> +		if (h->regs[i].type != NFT_REG_PAYLOAD)
> +			continue;
> +
> +		if (h->regs[i].payload.base == base &&
> +		    h->regs[i].payload.offset == offset &&
> +		    h->regs[i].len >= plen) {
> +			h->regs[i].genid = reg_genid(h);
> +			return i + NFT_REG32_00;
> +		}
> +	}
> +
> +	register_evict(&ctx, i);
> +
> +	genid = reg_genid(h);
> +	register_update(h, &ctx, NFT_REG_PAYLOAD, len, genid);
> +	h->regs[ctx.reg].payload.base = base;
> +	h->regs[ctx.reg].payload.offset = offset;
> +
> +	plen -= sizeof(uint32_t);
> +	j = 1;
> +	for (i = ctx.reg + 1; plen > 0; i++, plen -= sizeof(uint32_t)) {
> +		__register_update(h, i, NFT_REG_PAYLOAD, len, j++, genid);
> +		h->regs[i].payload.base = base;
> +		h->regs[i].payload.offset = offset;
> +	}
> +
> +	return ctx.reg + NFT_REG32_00;
> +}
> +
> +uint8_t get_register(struct nft_handle *h, uint8_t len)
> +{
> +	struct nft_reg_ctx ctx = {
> +		.reg	= -1,
> +		.evict	= -1,
> +		.genid	= UINT64_MAX,
> +	};
> +	int i;
> +
> +	for (i = 0; i < 16; i++)
> +		register_track(h, &ctx, i, len);
> +
> +	register_evict(&ctx, i);
> +	register_cancel(h, &ctx);
> +
> +	return ctx.reg + NFT_REG32_00;
> +}
> diff --git a/iptables/nft-regs.h b/iptables/nft-regs.h
> new file mode 100644
> index 000000000000..3953eae9f217
> --- /dev/null
> +++ b/iptables/nft-regs.h
> @@ -0,0 +1,9 @@
> +#ifndef _NFT_REGS_H_
> +#define _NFT_REGS_H_
> +
> +uint8_t payload_get_register(struct nft_handle *h, enum nft_payload_bases base,
> +			     uint32_t offset, uint32_t len);
> +uint8_t meta_get_register(struct nft_handle *h, enum nft_meta_keys key);
> +uint8_t get_register(struct nft_handle *h, uint8_t len);
> +
> +#endif /* _NFT_REGS_H_ */
> diff --git a/iptables/nft-shared.c b/iptables/nft-shared.c
> index 27e95c1ae4f3..ad5361942093 100644
> --- a/iptables/nft-shared.c
> +++ b/iptables/nft-shared.c
> @@ -32,6 +32,7 @@
>  
>  #include "nft-shared.h"
>  #include "nft-bridge.h"
> +#include "nft-regs.h"
>  #include "xshared.h"
>  #include "nft.h"
>  
> @@ -50,7 +51,7 @@ void add_meta(struct nft_handle *h, struct nftnl_rule *r, uint32_t key,
>  	if (expr == NULL)
>  		return;
>  
> -	reg = NFT_REG_1;
> +	reg = meta_get_register(h, key);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_META_KEY, key);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_META_DREG, reg);
>  	nftnl_rule_add_expr(r, expr);
> @@ -68,7 +69,7 @@ void add_payload(struct nft_handle *h, struct nftnl_rule *r,
>  	if (expr == NULL)
>  		return;
>  
> -	reg = NFT_REG_1;
> +	reg = payload_get_register(h, base, offset, len);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_BASE, base);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_DREG, reg);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_OFFSET, offset);
> @@ -89,7 +90,7 @@ void add_bitwise_u16(struct nft_handle *h, struct nftnl_rule *r,
>  	if (expr == NULL)
>  		return;
>  
> -	reg = NFT_REG_1;
> +	reg = get_register(h, sizeof(uint32_t));
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_SREG, sreg);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_DREG, reg);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_LEN, sizeof(uint16_t));
> @@ -105,12 +106,13 @@ void add_bitwise(struct nft_handle *h, struct nftnl_rule *r,
>  {
>  	struct nftnl_expr *expr;
>  	uint32_t xor[4] = { 0 };
> -	uint8_t reg = *dreg;
> +	uint8_t reg;
>  
>  	expr = nftnl_expr_alloc("bitwise");
>  	if (expr == NULL)
>  		return;
>  
> +	reg = get_register(h, len);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_SREG, sreg);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_DREG, reg);
>  	nftnl_expr_set_u32(expr, NFTNL_EXPR_BITWISE_LEN, len);
> diff --git a/iptables/nft.c b/iptables/nft.c
> index 07653ee1a3d6..48330f285703 100644
> --- a/iptables/nft.c
> +++ b/iptables/nft.c
> @@ -56,6 +56,7 @@
>  #include <arpa/inet.h>
>  
>  #include "nft.h"
> +#include "nft-regs.h"
>  #include "xshared.h" /* proto_to_name */
>  #include "nft-cache.h"
>  #include "nft-shared.h"
> @@ -1112,7 +1113,7 @@ gen_payload(struct nft_handle *h, uint32_t base, uint32_t offset, uint32_t len,
>  	struct nftnl_expr *e;
>  	uint8_t reg;
>  
> -	reg = NFT_REG_1;
> +	reg = payload_get_register(h, base, offset, len);
>  	e = __gen_payload(base, offset, len, reg);
>  	*dreg = reg;
>  
> @@ -1157,10 +1158,10 @@ static int __add_nft_among(struct nft_handle *h, const char *table,
>  		offsetof(struct iphdr, saddr),
>  		offsetof(struct iphdr, daddr)
>  	};
> +	uint8_t reg, concat_len;
>  	struct nftnl_expr *e;
>  	struct nftnl_set *s;
>  	uint32_t flags = 0;
> -	uint8_t reg;
>  	int idx = 0;
>  
>  	if (ip) {
> @@ -1201,21 +1202,26 @@ static int __add_nft_among(struct nft_handle *h, const char *table,
>  		nftnl_set_elem_add(s, elem);
>  	}
>  
> -	e = gen_payload(h, NFT_PAYLOAD_LL_HEADER,
> -			eth_addr_off[dst], ETH_ALEN, &reg);
> +	concat_len = ETH_ALEN;
> +	if (ip)
> +		concat_len += sizeof(struct in_addr);
> +
> +	reg = get_register(h, concat_len);
> +	e = __gen_payload(NFT_PAYLOAD_LL_HEADER,
> +			  eth_addr_off[dst], ETH_ALEN, reg);
>  	if (!e)
>  		return -ENOMEM;
>  	nftnl_rule_add_expr(r, e);
>  
>  	if (ip) {
> -		e = gen_payload(h, NFT_PAYLOAD_NETWORK_HEADER, ip_addr_off[dst],
> -				sizeof(struct in_addr), &reg);
> +		e = __gen_payload(NFT_PAYLOAD_NETWORK_HEADER, ip_addr_off[dst],
> +				  sizeof(struct in_addr), reg + 2);

With a respective macro, this could be 'reg + REG_ALIGN(ETH_ALEN)'.

>  		if (!e)
>  			return -ENOMEM;
>  		nftnl_rule_add_expr(r, e);
>  	}
>  
> -	reg = NFT_REG_1;
> +	reg = get_register(h, sizeof(uint32_t));
>  	e = gen_lookup(reg, "__set%d", set_id, inv);
>  	if (!e)
>  		return -ENOMEM;
> diff --git a/iptables/nft.h b/iptables/nft.h
> index 68b0910c8e18..3dc907b188ce 100644
> --- a/iptables/nft.h
> +++ b/iptables/nft.h
> @@ -85,6 +85,28 @@ struct nft_cache_req {
>  	struct list_head	chain_list;
>  };
>  
> +enum nft_reg_type {
> +	NFT_REG_UNSPEC	= 0,
> +	NFT_REG_PAYLOAD,
> +	NFT_REG_META,
> +};
> +
> +struct nft_regs {
> +	enum nft_reg_type			type;
> +	uint32_t				len;
> +	uint64_t				genid;
> +	uint8_t					word;
> +	union {
> +		struct {
> +			enum nft_meta_keys	key;
> +		} meta;
> +		struct {
> +			enum nft_payload_bases	base;
> +			uint32_t		offset;
> +		} payload;
> +	};
> +};
> +
>  struct nft_handle {
>  	int			family;
>  	struct mnl_socket	*nl;
> @@ -111,6 +133,9 @@ struct nft_handle {
>  	bool			cache_init;
>  	int			verbose;
>  
> +	struct nft_regs		regs[20];

Why 20? Ain't 16 enough?

> +	uint64_t		reg_genid;
> +
>  	/* meta data, for error reporting */
>  	struct {
>  		unsigned int	lineno;

Thanks, Phil

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH iptables 7/7] nft: support for dynamic register allocation
  2022-04-26 16:06   ` Phil Sutter
@ 2022-04-26 19:32     ` Pablo Neira Ayuso
  2022-04-27 16:23     ` Pablo Neira Ayuso
  1 sibling, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-26 19:32 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel

On Tue, Apr 26, 2022 at 06:06:41PM +0200, Phil Sutter wrote:
[...]
> > Benchmark #3: match on mark
> > 
> >  *raw
> >  :PREROUTING DROP [9:2781]
> >  :OUTPUT ACCEPT [0:0]
> >  -A PREROUTING -m mark --mark 100 -j DROP
> >  [... 98 times same rule above to trigger mismatch ...]
> >  -A PREROUTING -d 198.18.0.42/32 -j DROP		# matching rule
> > 
> >  iptables-legacy	255Mb
> >  iptables-nft		865Mb (+239.21%)
> 
> Great results, but obviously biased test cases. Did you measure a more
> "realistic" ruleset?

The goal of the benchmark is to show that iptables-legacy is optimized
for five-tuple matching, while iptables-nft with dynamic register
allocation is generically optimized for any selector through native
nftables bytecode.

> > In these cases, iptables-nft generates netlink bytecode which uses the
> > native expressions, ie. payload + cmp and meta + cmp.
> 
> Sounds like a real point for further conversion into native nftables
> expressions where possible.

Exactly.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH iptables 7/7] nft: support for dynamic register allocation
  2022-04-26 16:06   ` Phil Sutter
  2022-04-26 19:32     ` Pablo Neira Ayuso
@ 2022-04-27 16:23     ` Pablo Neira Ayuso
  2022-04-28 10:07       ` Phil Sutter
  1 sibling, 1 reply; 12+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-27 16:23 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel

On Tue, Apr 26, 2022 at 06:06:41PM +0200, Phil Sutter wrote:
> On Sun, Apr 24, 2022 at 11:56:13PM +0200, Pablo Neira Ayuso wrote:
[...]
> > +
> > +static int reg_space(int i)
> > +{
> > +	return sizeof(uint32_t) * 16 - sizeof(uint32_t) * i;
> > +}
> > +
> > +static void register_track(const struct nft_handle *h,
> > +			   struct nft_reg_ctx *ctx, int i, int len)
> > +{
> > +	if (ctx->reg >= 0 || h->regs[i].word || reg_space(i) < len)
> > +		return;
> 
> Since ctx->reg is not reset in callers' loops and reg_space(i) is
> monotonic, maybe make those loop exit conditions by returning false from
> register_track() in those cases?

you mean:

        if (!register_track(...))
                continue;

?

That defeats the check for a matching register already storing the
data I need, ie.

        if (h->regs[i].type != NFT_REG_META)
                continue;
        ...

> > +	if (h->regs[i].type == NFT_REG_UNSPEC) {
> > +		ctx->genid = h->reg_genid;
> 
> Is ctx->genid used in this case?

It used to shortcircuit the logic to evict a register (no eviction
needed case), but that is not needed anymore since ctx->reg >= 0
already prevents this.

> > +		ctx->reg = i;
> > +	} else if (h->regs[i].genid < ctx->genid) {
> > +		ctx->genid = h->regs[i].genid;
> > +		ctx->evict = i;
> 
> What if the oldest reg is too small?

The reg_space(i) < len check prevents this?

> > +	} else if (h->regs[i].len == len) {
> > +		ctx->evict = i;
> > +		ctx->genid = 0;
> 
> Why prefer regs of same size over older ones?

this was an early optimization. An Ipv6 address might evict up four
registers, if n stores old data, then n+1, n+2, n+3 store recent data,
n+1, n+2, n+3 would be unfairly evicted.

I can remove this case: it is probably an early optimization. This is
the initial version of the dynamic register allocation infra. It
should be possible to catch for more suboptimal situations with real
rulesets, by incrementally reviewing generated bytecode.

[...]
> > +uint8_t meta_get_register(struct nft_handle *h, enum nft_meta_keys key)
> 
> Accept a uint32_t len parameter and replace all the sizeof(uint32_t)
> below with it? Not needed but consistent with payload_get_register() and
> less hard-coded values.

Actually NFT_META_TIME_NS uses 64 bits, assuming 32-bits for meta is
indeed not correct.

[...]
> > @@ -1201,21 +1202,26 @@ static int __add_nft_among(struct nft_handle *h, const char *table,
> >  		nftnl_set_elem_add(s, elem);
> >  	}
> >  
> > -	e = gen_payload(h, NFT_PAYLOAD_LL_HEADER,
> > -			eth_addr_off[dst], ETH_ALEN, &reg);
> > +	concat_len = ETH_ALEN;
> > +	if (ip)
> > +		concat_len += sizeof(struct in_addr);
> > +
> > +	reg = get_register(h, concat_len);
> > +	e = __gen_payload(NFT_PAYLOAD_LL_HEADER,
> > +			  eth_addr_off[dst], ETH_ALEN, reg);
> >  	if (!e)
> >  		return -ENOMEM;
> >  	nftnl_rule_add_expr(r, e);
> >  
> >  	if (ip) {
> > -		e = gen_payload(h, NFT_PAYLOAD_NETWORK_HEADER, ip_addr_off[dst],
> > -				sizeof(struct in_addr), &reg);
> > +		e = __gen_payload(NFT_PAYLOAD_NETWORK_HEADER, ip_addr_off[dst],
> > +				  sizeof(struct in_addr), reg + 2);
> 
> With a respective macro, this could be 'reg + REG_ALIGN(ETH_ALEN)'.

that's feasible.

> >  struct nft_handle {
> >  	int			family;
> >  	struct mnl_socket	*nl;
> > @@ -111,6 +133,9 @@ struct nft_handle {
> >  	bool			cache_init;
> >  	int			verbose;
> >  
> > +	struct nft_regs		regs[20];
> 
> Why 20? Ain't 16 enough?

Yes, this should be 16.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH iptables 7/7] nft: support for dynamic register allocation
  2022-04-27 16:23     ` Pablo Neira Ayuso
@ 2022-04-28 10:07       ` Phil Sutter
  0 siblings, 0 replies; 12+ messages in thread
From: Phil Sutter @ 2022-04-28 10:07 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

On Wed, Apr 27, 2022 at 06:23:05PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Apr 26, 2022 at 06:06:41PM +0200, Phil Sutter wrote:
> > On Sun, Apr 24, 2022 at 11:56:13PM +0200, Pablo Neira Ayuso wrote:
> [...]
> > > +
> > > +static int reg_space(int i)
> > > +{
> > > +	return sizeof(uint32_t) * 16 - sizeof(uint32_t) * i;
> > > +}
> > > +
> > > +static void register_track(const struct nft_handle *h,
> > > +			   struct nft_reg_ctx *ctx, int i, int len)
> > > +{
> > > +	if (ctx->reg >= 0 || h->regs[i].word || reg_space(i) < len)
> > > +		return;
> > 
> > Since ctx->reg is not reset in callers' loops and reg_space(i) is
> > monotonic, maybe make those loop exit conditions by returning false from
> > register_track() in those cases?
> 
> you mean:
> 
>         if (!register_track(...))
>                 continue;
> 
> ?
> 
> That defeats the check for a matching register already storing the
> data I need, ie.
> 
>         if (h->regs[i].type != NFT_REG_META)
>                 continue;
>         ...

Oh, indeed! It still holds for the 'reg_space(i) < len' case but it's
a corner case and not worth it.

> > > +	if (h->regs[i].type == NFT_REG_UNSPEC) {
> > > +		ctx->genid = h->reg_genid;
> > 
> > Is ctx->genid used in this case?
> 
> It used to shortcircuit the logic to evict a register (no eviction
> needed case), but that is not needed anymore since ctx->reg >= 0
> already prevents this.
> 
> > > +		ctx->reg = i;
> > > +	} else if (h->regs[i].genid < ctx->genid) {
> > > +		ctx->genid = h->regs[i].genid;
> > > +		ctx->evict = i;
> > 
> > What if the oldest reg is too small?
> 
> The reg_space(i) < len check prevents this?

Ah, I missed the fact that consecutive regs are simply overwritten in
the evict case. What I had in mind was something like this:

reg#	genid	size
--------------------
0	6	4
1	5	16
2	4	4
3	3	4
4	2	4
5	1	4

If that's all the space there is and we want to store 16 bytes, the
algorithm would evict reg2 and all the following while it could just
evict reg1 and preserve the others. I just realize this is the
optimization you're talking about here:

> > > +	} else if (h->regs[i].len == len) {
> > > +		ctx->evict = i;
> > > +		ctx->genid = 0;
> > 
> > Why prefer regs of same size over older ones?
> 
> this was an early optimization. An Ipv6 address might evict up four
> registers, if n stores old data, then n+1, n+2, n+3 store recent data,
> n+1, n+2, n+3 would be unfairly evicted.
> 
> I can remove this case: it is probably an early optimization. This is
> the initial version of the dynamic register allocation infra. It
> should be possible to catch for more suboptimal situations with real
> rulesets, by incrementally reviewing generated bytecode.

Sounds reasonable. I think there are more pitfalls regarding suboptimal
register choice in NFT_REG_UNSPEC case as it doesn't consider available
space either.

Anyway, the size consideration being done only for regs with same genid
as the last one being chosen for eviction seems wrong. Preferring a same
sized reg over older ones does not seem ideal to me, either. What do you
think about this:

| static void register_track(const struct nft_handle *h,
|                            struct nft_reg_ctx *ctx, int i, int len)
| {
|         if (ctx->reg >= 0 || h->regs[i].word || reg_space(i) < len)
|                 return;
| 
|         if (h->regs[i].type == NFT_REG_UNSPEC) {
|                 ctx->reg = i;
|         } else if (ctx->evict < 0 ||
|                    h->regs[ctx->evict].len < len ||
|                    (h->regs[i].len >= len &&
|                     h->regs[i].genid < ctx->genid)) {
|                 ctx->genid = h->regs[i].genid;
|                 ctx->evict = i;
|         }
| }

- If given register is free, use it and be done
- If there's no candidate for eviction, mark it as such
- If eviction candidate size is too small, prefer the current one
  (regardless of size)
- if both eviction candidate and current sizes are sufficient, prefer
  the older one

Cheers, Phil

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-04-28 10:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-24 21:56 [PATCH iptables 0/7] support for dynamic register allocation Pablo Neira Ayuso
2022-04-24 21:56 ` [PATCH iptables 1/7] nft-shared: update context register for bitwise expression Pablo Neira Ayuso
2022-04-24 21:56 ` [PATCH iptables 2/7] nft: pass struct nft_xt_ctx to parse_meta() Pablo Neira Ayuso
2022-04-24 21:56 ` [PATCH iptables 3/7] nft: native mark matching support Pablo Neira Ayuso
2022-04-24 21:56 ` [PATCH iptables 4/7] nft: pass handle to helper functions to build netlink payload Pablo Neira Ayuso
2022-04-24 21:56 ` [PATCH iptables 5/7] nft: prepare for dynamic register allocation Pablo Neira Ayuso
2022-04-24 21:56 ` [PATCH iptables 6/7] nft: split gen_payload() to allocate register and initialize expression Pablo Neira Ayuso
2022-04-24 21:56 ` [PATCH iptables 7/7] nft: support for dynamic register allocation Pablo Neira Ayuso
2022-04-26 16:06   ` Phil Sutter
2022-04-26 19:32     ` Pablo Neira Ayuso
2022-04-27 16:23     ` Pablo Neira Ayuso
2022-04-28 10:07       ` Phil Sutter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.