All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc
@ 2022-11-22 17:32 Xin Long
  2022-11-22 17:32 ` [PATCHv2 net-next 1/5] openvswitch: delete the unncessary skb_pull_rcsum call in ovs_ct_nat_execute Xin Long
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Xin Long @ 2022-11-22 17:32 UTC (permalink / raw)
  To: network dev, dev, ovs-dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Pravin B Shelar,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso,
	Florian Westphal, Marcelo Ricardo Leitner, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

The changes in the patchset:

  "net: add helper support in tc act_ct for ovs offloading"

had moved some common ct code used by both OVS and TC into netfilter.

There are still some big functions pretty similar defined and used in
each of OVS and TC. It is not good to maintain such big function in 2
places. This patchset is to extract the functions for NAT processing
from OVS and TC to netfilter.

To make this change clear and safe, this patchset gets the common code
out of OVS and TC step by step: The patch 1-4 make some minor changes
in OVS and TC to make the NAT code of them completely the same, then
the patch 5 moves the common code to the netfilter and exports one
function called by each of OVS and TC.

v1->v2:
  - Create nf_nat_ovs.c to include the nat functions, as Pablo suggested.

Xin Long (5):
  openvswitch: delete the unncessary skb_pull_rcsum call in
    ovs_ct_nat_execute
  openvswitch: return NF_ACCEPT when OVS_CT_NAT is net set in info nat
  net: sched: return NF_ACCEPT when fails to add nat ext in
    tcf_ct_act_nat
  net: sched: update the nat flag for icmp error packets in
    ct_nat_execute
  net: move the nat function to nf_nat_ovs for ovs and tc

 include/net/netfilter/nf_nat.h |   4 +
 net/netfilter/Makefile         |   2 +-
 net/netfilter/nf_nat_ovs.c     | 135 ++++++++++++++++++++++++++++++
 net/openvswitch/conntrack.c    | 146 +++------------------------------
 net/sched/act_ct.c             | 136 +++---------------------------
 5 files changed, 164 insertions(+), 259 deletions(-)
 create mode 100644 net/netfilter/nf_nat_ovs.c

-- 
2.31.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCHv2 net-next 1/5] openvswitch: delete the unncessary skb_pull_rcsum call in ovs_ct_nat_execute
  2022-11-22 17:32 [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc Xin Long
@ 2022-11-22 17:32 ` Xin Long
  2022-11-22 17:32 ` [PATCHv2 net-next 2/5] openvswitch: return NF_ACCEPT when OVS_CT_NAT is net set in info nat Xin Long
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: Xin Long @ 2022-11-22 17:32 UTC (permalink / raw)
  To: network dev, dev, ovs-dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Pravin B Shelar,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso,
	Florian Westphal, Marcelo Ricardo Leitner, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

The calls to ovs_ct_nat_execute() are as below:

  ovs_ct_execute()
    ovs_ct_lookup()
      __ovs_ct_lookup()
        ovs_ct_nat()
          ovs_ct_nat_execute()
    ovs_ct_commit()
      __ovs_ct_lookup()
        ovs_ct_nat()
          ovs_ct_nat_execute()

and since skb_pull_rcsum() and skb_push_rcsum() are already
called in ovs_ct_execute(), there's no need to do it again
in ovs_ct_nat_execute().

Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/openvswitch/conntrack.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 4348321856af..4c5e5a6475af 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -735,10 +735,7 @@ static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
 			      const struct nf_nat_range2 *range,
 			      enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
 {
-	int hooknum, nh_off, err = NF_ACCEPT;
-
-	nh_off = skb_network_offset(skb);
-	skb_pull_rcsum(skb, nh_off);
+	int hooknum, err = NF_ACCEPT;
 
 	/* See HOOK2MANIP(). */
 	if (maniptype == NF_NAT_MANIP_SRC)
@@ -755,7 +752,7 @@ static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
 			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
 							   hooknum))
 				err = NF_DROP;
-			goto push;
+			goto out;
 		} else if (IS_ENABLED(CONFIG_IPV6) &&
 			   skb->protocol == htons(ETH_P_IPV6)) {
 			__be16 frag_off;
@@ -770,7 +767,7 @@ static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
 								     hooknum,
 								     hdrlen))
 					err = NF_DROP;
-				goto push;
+				goto out;
 			}
 		}
 		/* Non-ICMP, fall thru to initialize if needed. */
@@ -788,7 +785,7 @@ static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
 				? nf_nat_setup_info(ct, range, maniptype)
 				: nf_nat_alloc_null_binding(ct, hooknum);
 			if (err != NF_ACCEPT)
-				goto push;
+				goto out;
 		}
 		break;
 
@@ -798,13 +795,11 @@ static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
 
 	default:
 		err = NF_DROP;
-		goto push;
+		goto out;
 	}
 
 	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
-push:
-	skb_push_rcsum(skb, nh_off);
-
+out:
 	/* Update the flow key if NAT successful. */
 	if (err == NF_ACCEPT)
 		ovs_nat_update_key(key, skb, maniptype);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCHv2 net-next 2/5] openvswitch: return NF_ACCEPT when OVS_CT_NAT is net set in info nat
  2022-11-22 17:32 [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc Xin Long
  2022-11-22 17:32 ` [PATCHv2 net-next 1/5] openvswitch: delete the unncessary skb_pull_rcsum call in ovs_ct_nat_execute Xin Long
@ 2022-11-22 17:32 ` Xin Long
  2022-11-23 14:24   ` Marcelo Ricardo Leitner
  2022-11-22 17:32 ` [PATCHv2 net-next 3/5] net: sched: return NF_ACCEPT when fails to add nat ext in tcf_ct_act_nat Xin Long
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Xin Long @ 2022-11-22 17:32 UTC (permalink / raw)
  To: network dev, dev, ovs-dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Pravin B Shelar,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso,
	Florian Westphal, Marcelo Ricardo Leitner, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

Either OVS_CT_SRC_NAT or OVS_CT_DST_NAT is set, OVS_CT_NAT must be
set in info->nat. Thus, if OVS_CT_NAT is not set in info->nat, it
will definitely not do NAT but returns NF_ACCEPT in ovs_ct_nat().

This patch changes nothing funcational but only makes this return
earlier in ovs_ct_nat() to keep consistent with TC's processing
in tcf_ct_act_nat().

Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/openvswitch/conntrack.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 4c5e5a6475af..cc643a556ea1 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -816,6 +816,9 @@ static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
 	enum nf_nat_manip_type maniptype;
 	int err;
 
+	if (!(info->nat & OVS_CT_NAT))
+		return NF_ACCEPT;
+
 	/* Add NAT extension if not confirmed yet. */
 	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
 		return NF_ACCEPT;   /* Can't NAT. */
@@ -825,8 +828,7 @@ static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
 	 * Make sure new expected connections (IP_CT_RELATED) are NATted only
 	 * when committing.
 	 */
-	if (info->nat & OVS_CT_NAT && ctinfo != IP_CT_NEW &&
-	    ct->status & IPS_NAT_MASK &&
+	if (ctinfo != IP_CT_NEW && ct->status & IPS_NAT_MASK &&
 	    (ctinfo != IP_CT_RELATED || info->commit)) {
 		/* NAT an established or related connection like before. */
 		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCHv2 net-next 3/5] net: sched: return NF_ACCEPT when fails to add nat ext in tcf_ct_act_nat
  2022-11-22 17:32 [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc Xin Long
  2022-11-22 17:32 ` [PATCHv2 net-next 1/5] openvswitch: delete the unncessary skb_pull_rcsum call in ovs_ct_nat_execute Xin Long
  2022-11-22 17:32 ` [PATCHv2 net-next 2/5] openvswitch: return NF_ACCEPT when OVS_CT_NAT is net set in info nat Xin Long
@ 2022-11-22 17:32 ` Xin Long
  2022-11-23 14:23   ` Marcelo Ricardo Leitner
  2022-11-22 17:32 ` [PATCHv2 net-next 4/5] net: sched: update the nat flag for icmp error packets in ct_nat_execute Xin Long
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Xin Long @ 2022-11-22 17:32 UTC (permalink / raw)
  To: network dev, dev, ovs-dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Pravin B Shelar,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso,
	Florian Westphal, Marcelo Ricardo Leitner, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

This patch changes to return NF_ACCEPT when fails to add nat
ext before doing NAT in tcf_ct_act_nat(), to keep consistent
with OVS' processing in ovs_ct_nat().

Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sched/act_ct.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index da0b7f665277..8869b3ef6642 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -994,7 +994,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
 
 	/* Add NAT extension if not confirmed yet. */
 	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
-		return NF_DROP;   /* Can't NAT. */
+		return NF_ACCEPT;   /* Can't NAT. */
 
 	if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
 	    (ctinfo != IP_CT_RELATED || commit)) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCHv2 net-next 4/5] net: sched: update the nat flag for icmp error packets in ct_nat_execute
  2022-11-22 17:32 [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc Xin Long
                   ` (2 preceding siblings ...)
  2022-11-22 17:32 ` [PATCHv2 net-next 3/5] net: sched: return NF_ACCEPT when fails to add nat ext in tcf_ct_act_nat Xin Long
@ 2022-11-22 17:32 ` Xin Long
  2022-11-22 17:32 ` [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc Xin Long
  2022-11-23 12:39 ` [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of " Marcelo Ricardo Leitner
  5 siblings, 0 replies; 23+ messages in thread
From: Xin Long @ 2022-11-22 17:32 UTC (permalink / raw)
  To: network dev, dev, ovs-dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Pravin B Shelar,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso,
	Florian Westphal, Marcelo Ricardo Leitner, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

In ovs_ct_nat_execute(), the packet flow key nat flags are updated
when it processes ICMP(v6) error packets translation successfully.

In ct_nat_execute() when processing ICMP(v6) error packets translation
successfully, it should have done the same in ct_nat_execute() to set
post_ct_s/dnat flag, which will be used to update flow key nat flags
in OVS module later.

Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sched/act_ct.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 8869b3ef6642..c7782c9a6ab6 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -936,13 +936,13 @@ static int ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
 	}
 
 	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
+out:
 	if (err == NF_ACCEPT) {
 		if (maniptype == NF_NAT_MANIP_SRC)
 			tc_skb_cb(skb)->post_ct_snat = 1;
 		if (maniptype == NF_NAT_MANIP_DST)
 			tc_skb_cb(skb)->post_ct_dnat = 1;
 	}
-out:
 	return err;
 }
 #endif /* CONFIG_NF_NAT */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-22 17:32 [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc Xin Long
                   ` (3 preceding siblings ...)
  2022-11-22 17:32 ` [PATCHv2 net-next 4/5] net: sched: update the nat flag for icmp error packets in ct_nat_execute Xin Long
@ 2022-11-22 17:32 ` Xin Long
  2022-11-23 15:09   ` Marcelo Ricardo Leitner
  2022-11-23 18:52   ` Marcelo Ricardo Leitner
  2022-11-23 12:39 ` [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of " Marcelo Ricardo Leitner
  5 siblings, 2 replies; 23+ messages in thread
From: Xin Long @ 2022-11-22 17:32 UTC (permalink / raw)
  To: network dev, dev, ovs-dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Pravin B Shelar,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso,
	Florian Westphal, Marcelo Ricardo Leitner, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

There are two nat functions are nearly the same in both OVS and
TC code, (ovs_)ct_nat_execute() and ovs_ct_nat/tcf_ct_act_nat().

This patch creates nf_nat_ovs.c under netfilter and moves them
there then exports nf_ct_nat() so that it can be shared by both
OVS and TC, and keeps the nat (type) check and nat flag update
in OVS and TC's own place, as these parts are different between
OVS and TC.

Note that in OVS nat function it was using skb->protocol to get
the proto as it already skips vlans in key_extract(), while it
doesn't in TC, and TC has to call skb_protocol() to get proto.
So in nf_ct_nat_execute(), we keep using skb_protocol() which
works for both OVS and TC contrack.

Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 include/net/netfilter/nf_nat.h |   4 +
 net/netfilter/Makefile         |   2 +-
 net/netfilter/nf_nat_ovs.c     | 135 ++++++++++++++++++++++++++++++++
 net/openvswitch/conntrack.c    | 137 +++------------------------------
 net/sched/act_ct.c             | 136 +++-----------------------------
 5 files changed, 161 insertions(+), 253 deletions(-)
 create mode 100644 net/netfilter/nf_nat_ovs.c

diff --git a/include/net/netfilter/nf_nat.h b/include/net/netfilter/nf_nat.h
index e9eb01e99d2f..9877f064548a 100644
--- a/include/net/netfilter/nf_nat.h
+++ b/include/net/netfilter/nf_nat.h
@@ -104,6 +104,10 @@ unsigned int
 nf_nat_inet_fn(void *priv, struct sk_buff *skb,
 	       const struct nf_hook_state *state);
 
+int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
+	      enum ip_conntrack_info ctinfo, int *action,
+	      const struct nf_nat_range2 *range, bool commit);
+
 static inline int nf_nat_initialized(const struct nf_conn *ct,
 				     enum nf_nat_manip_type manip)
 {
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 1d4db1943936..4fa50d2842ec 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -52,7 +52,7 @@ obj-$(CONFIG_NF_CONNTRACK_SANE) += nf_conntrack_sane.o
 obj-$(CONFIG_NF_CONNTRACK_SIP) += nf_conntrack_sip.o
 obj-$(CONFIG_NF_CONNTRACK_TFTP) += nf_conntrack_tftp.o
 
-nf_nat-y	:= nf_nat_core.o nf_nat_proto.o nf_nat_helper.o
+nf_nat-y	:= nf_nat_core.o nf_nat_proto.o nf_nat_helper.o nf_nat_ovs.o
 
 obj-$(CONFIG_NF_LOG_SYSLOG) += nf_log_syslog.o
 
diff --git a/net/netfilter/nf_nat_ovs.c b/net/netfilter/nf_nat_ovs.c
new file mode 100644
index 000000000000..daff80e7a43a
--- /dev/null
+++ b/net/netfilter/nf_nat_ovs.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Support nat functions for openvswitch and used by OVS and TC conntrack. */
+
+#include <net/netfilter/nf_nat.h>
+
+/* Modelled after nf_nat_ipv[46]_fn().
+ * range is only used for new, uninitialized NAT state.
+ * Returns either NF_ACCEPT or NF_DROP.
+ */
+static int nf_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
+			     enum ip_conntrack_info ctinfo, int *action,
+			     const struct nf_nat_range2 *range,
+			     enum nf_nat_manip_type maniptype)
+{
+	__be16 proto = skb_protocol(skb, true);
+	int hooknum, err = NF_ACCEPT;
+
+	/* See HOOK2MANIP(). */
+	if (maniptype == NF_NAT_MANIP_SRC)
+		hooknum = NF_INET_LOCAL_IN; /* Source NAT */
+	else
+		hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
+
+	switch (ctinfo) {
+	case IP_CT_RELATED:
+	case IP_CT_RELATED_REPLY:
+		if (proto == htons(ETH_P_IP) &&
+		    ip_hdr(skb)->protocol == IPPROTO_ICMP) {
+			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
+							   hooknum))
+				err = NF_DROP;
+			goto out;
+		} else if (IS_ENABLED(CONFIG_IPV6) && proto == htons(ETH_P_IPV6)) {
+			__be16 frag_off;
+			u8 nexthdr = ipv6_hdr(skb)->nexthdr;
+			int hdrlen = ipv6_skip_exthdr(skb,
+						      sizeof(struct ipv6hdr),
+						      &nexthdr, &frag_off);
+
+			if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
+				if (!nf_nat_icmpv6_reply_translation(skb, ct,
+								     ctinfo,
+								     hooknum,
+								     hdrlen))
+					err = NF_DROP;
+				goto out;
+			}
+		}
+		/* Non-ICMP, fall thru to initialize if needed. */
+		fallthrough;
+	case IP_CT_NEW:
+		/* Seen it before?  This can happen for loopback, retrans,
+		 * or local packets.
+		 */
+		if (!nf_nat_initialized(ct, maniptype)) {
+			/* Initialize according to the NAT action. */
+			err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
+				/* Action is set up to establish a new
+				 * mapping.
+				 */
+				? nf_nat_setup_info(ct, range, maniptype)
+				: nf_nat_alloc_null_binding(ct, hooknum);
+			if (err != NF_ACCEPT)
+				goto out;
+		}
+		break;
+
+	case IP_CT_ESTABLISHED:
+	case IP_CT_ESTABLISHED_REPLY:
+		break;
+
+	default:
+		err = NF_DROP;
+		goto out;
+	}
+
+	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
+	if (err == NF_ACCEPT)
+		*action |= (1 << maniptype);
+out:
+	return err;
+}
+
+int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
+	      enum ip_conntrack_info ctinfo, int *action,
+	      const struct nf_nat_range2 *range, bool commit)
+{
+	enum nf_nat_manip_type maniptype;
+	int err, ct_action = *action;
+
+	*action = 0;
+
+	/* Add NAT extension if not confirmed yet. */
+	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
+		return NF_ACCEPT;   /* Can't NAT. */
+
+	if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
+	    (ctinfo != IP_CT_RELATED || commit)) {
+		/* NAT an established or related connection like before. */
+		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
+			/* This is the REPLY direction for a connection
+			 * for which NAT was applied in the forward
+			 * direction.  Do the reverse NAT.
+			 */
+			maniptype = ct->status & IPS_SRC_NAT
+				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
+		else
+			maniptype = ct->status & IPS_SRC_NAT
+				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
+	} else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
+		maniptype = NF_NAT_MANIP_SRC;
+	} else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
+		maniptype = NF_NAT_MANIP_DST;
+	} else {
+		return NF_ACCEPT;
+	}
+
+	err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
+	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
+		if (ct->status & IPS_SRC_NAT) {
+			if (maniptype == NF_NAT_MANIP_SRC)
+				maniptype = NF_NAT_MANIP_DST;
+			else
+				maniptype = NF_NAT_MANIP_SRC;
+
+			err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
+						maniptype);
+		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
+			err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
+						NF_NAT_MANIP_SRC);
+		}
+	}
+	return err;
+}
+EXPORT_SYMBOL_GPL(nf_ct_nat);
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index cc643a556ea1..d03c75165663 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
 	}
 }
 
-/* Modelled after nf_nat_ipv[46]_fn().
- * range is only used for new, uninitialized NAT state.
- * Returns either NF_ACCEPT or NF_DROP.
- */
-static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
-			      enum ip_conntrack_info ctinfo,
-			      const struct nf_nat_range2 *range,
-			      enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
-{
-	int hooknum, err = NF_ACCEPT;
-
-	/* See HOOK2MANIP(). */
-	if (maniptype == NF_NAT_MANIP_SRC)
-		hooknum = NF_INET_LOCAL_IN; /* Source NAT */
-	else
-		hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
-
-	switch (ctinfo) {
-	case IP_CT_RELATED:
-	case IP_CT_RELATED_REPLY:
-		if (IS_ENABLED(CONFIG_NF_NAT) &&
-		    skb->protocol == htons(ETH_P_IP) &&
-		    ip_hdr(skb)->protocol == IPPROTO_ICMP) {
-			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
-							   hooknum))
-				err = NF_DROP;
-			goto out;
-		} else if (IS_ENABLED(CONFIG_IPV6) &&
-			   skb->protocol == htons(ETH_P_IPV6)) {
-			__be16 frag_off;
-			u8 nexthdr = ipv6_hdr(skb)->nexthdr;
-			int hdrlen = ipv6_skip_exthdr(skb,
-						      sizeof(struct ipv6hdr),
-						      &nexthdr, &frag_off);
-
-			if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
-				if (!nf_nat_icmpv6_reply_translation(skb, ct,
-								     ctinfo,
-								     hooknum,
-								     hdrlen))
-					err = NF_DROP;
-				goto out;
-			}
-		}
-		/* Non-ICMP, fall thru to initialize if needed. */
-		fallthrough;
-	case IP_CT_NEW:
-		/* Seen it before?  This can happen for loopback, retrans,
-		 * or local packets.
-		 */
-		if (!nf_nat_initialized(ct, maniptype)) {
-			/* Initialize according to the NAT action. */
-			err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
-				/* Action is set up to establish a new
-				 * mapping.
-				 */
-				? nf_nat_setup_info(ct, range, maniptype)
-				: nf_nat_alloc_null_binding(ct, hooknum);
-			if (err != NF_ACCEPT)
-				goto out;
-		}
-		break;
-
-	case IP_CT_ESTABLISHED:
-	case IP_CT_ESTABLISHED_REPLY:
-		break;
-
-	default:
-		err = NF_DROP;
-		goto out;
-	}
-
-	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
-out:
-	/* Update the flow key if NAT successful. */
-	if (err == NF_ACCEPT)
-		ovs_nat_update_key(key, skb, maniptype);
-
-	return err;
-}
-
 /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
 static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
 		      const struct ovs_conntrack_info *info,
 		      struct sk_buff *skb, struct nf_conn *ct,
 		      enum ip_conntrack_info ctinfo)
 {
-	enum nf_nat_manip_type maniptype;
-	int err;
+	int err, action = 0;
 
 	if (!(info->nat & OVS_CT_NAT))
 		return NF_ACCEPT;
+	if (info->nat & OVS_CT_SRC_NAT)
+		action |= (1 << NF_NAT_MANIP_SRC);
+	if (info->nat & OVS_CT_DST_NAT)
+		action |= (1 << NF_NAT_MANIP_DST);
 
-	/* Add NAT extension if not confirmed yet. */
-	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
-		return NF_ACCEPT;   /* Can't NAT. */
+	err = nf_ct_nat(skb, ct, ctinfo, &action, &info->range, info->commit);
 
-	/* Determine NAT type.
-	 * Check if the NAT type can be deduced from the tracked connection.
-	 * Make sure new expected connections (IP_CT_RELATED) are NATted only
-	 * when committing.
-	 */
-	if (ctinfo != IP_CT_NEW && ct->status & IPS_NAT_MASK &&
-	    (ctinfo != IP_CT_RELATED || info->commit)) {
-		/* NAT an established or related connection like before. */
-		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
-			/* This is the REPLY direction for a connection
-			 * for which NAT was applied in the forward
-			 * direction.  Do the reverse NAT.
-			 */
-			maniptype = ct->status & IPS_SRC_NAT
-				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
-		else
-			maniptype = ct->status & IPS_SRC_NAT
-				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
-	} else if (info->nat & OVS_CT_SRC_NAT) {
-		maniptype = NF_NAT_MANIP_SRC;
-	} else if (info->nat & OVS_CT_DST_NAT) {
-		maniptype = NF_NAT_MANIP_DST;
-	} else {
-		return NF_ACCEPT; /* Connection is not NATed. */
-	}
-	err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range, maniptype, key);
-
-	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
-		if (ct->status & IPS_SRC_NAT) {
-			if (maniptype == NF_NAT_MANIP_SRC)
-				maniptype = NF_NAT_MANIP_DST;
-			else
-				maniptype = NF_NAT_MANIP_SRC;
-
-			err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range,
-						 maniptype, key);
-		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
-			err = ovs_ct_nat_execute(skb, ct, ctinfo, NULL,
-						 NF_NAT_MANIP_SRC, key);
-		}
-	}
+	if (action & (1 << NF_NAT_MANIP_SRC))
+		ovs_nat_update_key(key, skb, NF_NAT_MANIP_SRC);
+	if (action & (1 << NF_NAT_MANIP_DST))
+		ovs_nat_update_key(key, skb, NF_NAT_MANIP_DST);
 
 	return err;
 }
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index c7782c9a6ab6..0c410220239f 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -863,90 +863,6 @@ static void tcf_ct_params_free_rcu(struct rcu_head *head)
 	tcf_ct_params_free(params);
 }
 
-#if IS_ENABLED(CONFIG_NF_NAT)
-/* Modelled after nf_nat_ipv[46]_fn().
- * range is only used for new, uninitialized NAT state.
- * Returns either NF_ACCEPT or NF_DROP.
- */
-static int ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
-			  enum ip_conntrack_info ctinfo,
-			  const struct nf_nat_range2 *range,
-			  enum nf_nat_manip_type maniptype)
-{
-	__be16 proto = skb_protocol(skb, true);
-	int hooknum, err = NF_ACCEPT;
-
-	/* See HOOK2MANIP(). */
-	if (maniptype == NF_NAT_MANIP_SRC)
-		hooknum = NF_INET_LOCAL_IN; /* Source NAT */
-	else
-		hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
-
-	switch (ctinfo) {
-	case IP_CT_RELATED:
-	case IP_CT_RELATED_REPLY:
-		if (proto == htons(ETH_P_IP) &&
-		    ip_hdr(skb)->protocol == IPPROTO_ICMP) {
-			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
-							   hooknum))
-				err = NF_DROP;
-			goto out;
-		} else if (IS_ENABLED(CONFIG_IPV6) && proto == htons(ETH_P_IPV6)) {
-			__be16 frag_off;
-			u8 nexthdr = ipv6_hdr(skb)->nexthdr;
-			int hdrlen = ipv6_skip_exthdr(skb,
-						      sizeof(struct ipv6hdr),
-						      &nexthdr, &frag_off);
-
-			if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
-				if (!nf_nat_icmpv6_reply_translation(skb, ct,
-								     ctinfo,
-								     hooknum,
-								     hdrlen))
-					err = NF_DROP;
-				goto out;
-			}
-		}
-		/* Non-ICMP, fall thru to initialize if needed. */
-		fallthrough;
-	case IP_CT_NEW:
-		/* Seen it before?  This can happen for loopback, retrans,
-		 * or local packets.
-		 */
-		if (!nf_nat_initialized(ct, maniptype)) {
-			/* Initialize according to the NAT action. */
-			err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
-				/* Action is set up to establish a new
-				 * mapping.
-				 */
-				? nf_nat_setup_info(ct, range, maniptype)
-				: nf_nat_alloc_null_binding(ct, hooknum);
-			if (err != NF_ACCEPT)
-				goto out;
-		}
-		break;
-
-	case IP_CT_ESTABLISHED:
-	case IP_CT_ESTABLISHED_REPLY:
-		break;
-
-	default:
-		err = NF_DROP;
-		goto out;
-	}
-
-	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
-out:
-	if (err == NF_ACCEPT) {
-		if (maniptype == NF_NAT_MANIP_SRC)
-			tc_skb_cb(skb)->post_ct_snat = 1;
-		if (maniptype == NF_NAT_MANIP_DST)
-			tc_skb_cb(skb)->post_ct_dnat = 1;
-	}
-	return err;
-}
-#endif /* CONFIG_NF_NAT */
-
 static void tcf_ct_act_set_mark(struct nf_conn *ct, u32 mark, u32 mask)
 {
 #if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
@@ -986,52 +902,22 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
 			  bool commit)
 {
 #if IS_ENABLED(CONFIG_NF_NAT)
-	int err;
-	enum nf_nat_manip_type maniptype;
+	int err, action = 0;
 
 	if (!(ct_action & TCA_CT_ACT_NAT))
 		return NF_ACCEPT;
+	if (ct_action & TCA_CT_ACT_NAT_SRC)
+		action |= (1 << NF_NAT_MANIP_SRC);
+	if (ct_action & TCA_CT_ACT_NAT_DST)
+		action |= (1 << NF_NAT_MANIP_DST);
 
-	/* Add NAT extension if not confirmed yet. */
-	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
-		return NF_ACCEPT;   /* Can't NAT. */
-
-	if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
-	    (ctinfo != IP_CT_RELATED || commit)) {
-		/* NAT an established or related connection like before. */
-		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
-			/* This is the REPLY direction for a connection
-			 * for which NAT was applied in the forward
-			 * direction.  Do the reverse NAT.
-			 */
-			maniptype = ct->status & IPS_SRC_NAT
-				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
-		else
-			maniptype = ct->status & IPS_SRC_NAT
-				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
-	} else if (ct_action & TCA_CT_ACT_NAT_SRC) {
-		maniptype = NF_NAT_MANIP_SRC;
-	} else if (ct_action & TCA_CT_ACT_NAT_DST) {
-		maniptype = NF_NAT_MANIP_DST;
-	} else {
-		return NF_ACCEPT;
-	}
+	err = nf_ct_nat(skb, ct, ctinfo, &action, range, commit);
+
+	if (action & (1 << NF_NAT_MANIP_SRC))
+		tc_skb_cb(skb)->post_ct_snat = 1;
+	if (action & (1 << NF_NAT_MANIP_DST))
+		tc_skb_cb(skb)->post_ct_dnat = 1;
 
-	err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
-	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
-		if (ct->status & IPS_SRC_NAT) {
-			if (maniptype == NF_NAT_MANIP_SRC)
-				maniptype = NF_NAT_MANIP_DST;
-			else
-				maniptype = NF_NAT_MANIP_SRC;
-
-			err = ct_nat_execute(skb, ct, ctinfo, range,
-					     maniptype);
-		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
-			err = ct_nat_execute(skb, ct, ctinfo, NULL,
-					     NF_NAT_MANIP_SRC);
-		}
-	}
 	return err;
 #else
 	return NF_ACCEPT;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc
  2022-11-22 17:32 [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc Xin Long
                   ` (4 preceding siblings ...)
  2022-11-22 17:32 ` [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc Xin Long
@ 2022-11-23 12:39 ` Marcelo Ricardo Leitner
  5 siblings, 0 replies; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 12:39 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, dev, ovs-dev, davem, kuba, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Pablo Neira Ayuso, Florian Westphal, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

On Tue, Nov 22, 2022 at 12:32:16PM -0500, Xin Long wrote:
> The changes in the patchset:
> 
>   "net: add helper support in tc act_ct for ovs offloading"
> 
> had moved some common ct code used by both OVS and TC into netfilter.

Please give me today to review this patchset.

Thanks,
Marcelo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2 net-next 3/5] net: sched: return NF_ACCEPT when fails to add nat ext in tcf_ct_act_nat
  2022-11-22 17:32 ` [PATCHv2 net-next 3/5] net: sched: return NF_ACCEPT when fails to add nat ext in tcf_ct_act_nat Xin Long
@ 2022-11-23 14:23   ` Marcelo Ricardo Leitner
  2022-12-01 16:53     ` Xin Long
  0 siblings, 1 reply; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 14:23 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, dev, ovs-dev, davem, kuba, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Pablo Neira Ayuso, Florian Westphal, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

On Tue, Nov 22, 2022 at 12:32:19PM -0500, Xin Long wrote:
> This patch changes to return NF_ACCEPT when fails to add nat
> ext before doing NAT in tcf_ct_act_nat(), to keep consistent
> with OVS' processing in ovs_ct_nat().
> 
> Reviewed-by: Saeed Mahameed <saeed@kernel.org>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
>  net/sched/act_ct.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> index da0b7f665277..8869b3ef6642 100644
> --- a/net/sched/act_ct.c
> +++ b/net/sched/act_ct.c
> @@ -994,7 +994,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>  
>  	/* Add NAT extension if not confirmed yet. */
>  	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> -		return NF_DROP;   /* Can't NAT. */
> +		return NF_ACCEPT;   /* Can't NAT. */

I'm wondering if the fix should actually be in OVS, to make it drop
the packet? Aaron, Eelco?

If the user asked for NAT, and it can't NAT, it doesn't seem right to
forward the packet while not performing the asked action.

If we follow the code here, it may even commit the entry without the
NAT extension, rendering the connection useless/broken per the first
if condition above. It just won't try again.

>  
>  	if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
>  	    (ctinfo != IP_CT_RELATED || commit)) {
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2 net-next 2/5] openvswitch: return NF_ACCEPT when OVS_CT_NAT is net set in info nat
  2022-11-22 17:32 ` [PATCHv2 net-next 2/5] openvswitch: return NF_ACCEPT when OVS_CT_NAT is net set in info nat Xin Long
@ 2022-11-23 14:24   ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 14:24 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, dev, ovs-dev, davem, kuba, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Pablo Neira Ayuso, Florian Westphal, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

There's a typo in the subject here, s/is net/is not/ .


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-22 17:32 ` [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc Xin Long
@ 2022-11-23 15:09   ` Marcelo Ricardo Leitner
  2022-11-23 15:13     ` [ovs-dev] " Marcelo Ricardo Leitner
  2022-11-23 18:52   ` Marcelo Ricardo Leitner
  1 sibling, 1 reply; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 15:09 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, dev, ovs-dev, davem, kuba, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Pablo Neira Ayuso, Florian Westphal, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> +	      enum ip_conntrack_info ctinfo, int *action,
> +	      const struct nf_nat_range2 *range, bool commit)
> +{
> +	enum nf_nat_manip_type maniptype;
> +	int err, ct_action = *action;
> +
> +	*action = 0;
> +
> +	/* Add NAT extension if not confirmed yet. */
> +	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> +		return NF_ACCEPT;   /* Can't NAT. */
> +
> +	if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> +	    (ctinfo != IP_CT_RELATED || commit)) {
> +		/* NAT an established or related connection like before. */
> +		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> +			/* This is the REPLY direction for a connection
> +			 * for which NAT was applied in the forward
> +			 * direction.  Do the reverse NAT.
> +			 */
> +			maniptype = ct->status & IPS_SRC_NAT
> +				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> +		else
> +			maniptype = ct->status & IPS_SRC_NAT
> +				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> +	} else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> +		maniptype = NF_NAT_MANIP_SRC;
> +	} else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> +		maniptype = NF_NAT_MANIP_DST;
> +	} else {
> +		return NF_ACCEPT;
> +	}
> +
> +	err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> +	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> +		if (ct->status & IPS_SRC_NAT) {
> +			if (maniptype == NF_NAT_MANIP_SRC)
> +				maniptype = NF_NAT_MANIP_DST;
> +			else
> +				maniptype = NF_NAT_MANIP_SRC;
> +
> +			err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> +						maniptype);
> +		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> +			err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> +						NF_NAT_MANIP_SRC);
> +		}
> +	}
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(nf_ct_nat);
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index cc643a556ea1..d03c75165663 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
>  	}
>  }
>  
> -/* Modelled after nf_nat_ipv[46]_fn().
> - * range is only used for new, uninitialized NAT state.
> - * Returns either NF_ACCEPT or NF_DROP.
> - */
> -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> -			      enum ip_conntrack_info ctinfo,
> -			      const struct nf_nat_range2 *range,
> -			      enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> -{
> -	int hooknum, err = NF_ACCEPT;
> -
> -	/* See HOOK2MANIP(). */
> -	if (maniptype == NF_NAT_MANIP_SRC)
> -		hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> -	else
> -		hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> -
> -	switch (ctinfo) {
> -	case IP_CT_RELATED:
> -	case IP_CT_RELATED_REPLY:
> -		if (IS_ENABLED(CONFIG_NF_NAT) &&
> -		    skb->protocol == htons(ETH_P_IP) &&
> -		    ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> -			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> -							   hooknum))
> -				err = NF_DROP;
> -			goto out;
> -		} else if (IS_ENABLED(CONFIG_IPV6) &&
> -			   skb->protocol == htons(ETH_P_IPV6)) {
> -			__be16 frag_off;
> -			u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> -			int hdrlen = ipv6_skip_exthdr(skb,
> -						      sizeof(struct ipv6hdr),
> -						      &nexthdr, &frag_off);
> -
> -			if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> -				if (!nf_nat_icmpv6_reply_translation(skb, ct,
> -								     ctinfo,
> -								     hooknum,
> -								     hdrlen))
> -					err = NF_DROP;
> -				goto out;
> -			}
> -		}
> -		/* Non-ICMP, fall thru to initialize if needed. */
> -		fallthrough;
> -	case IP_CT_NEW:
> -		/* Seen it before?  This can happen for loopback, retrans,
> -		 * or local packets.
> -		 */
> -		if (!nf_nat_initialized(ct, maniptype)) {
> -			/* Initialize according to the NAT action. */
> -			err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> -				/* Action is set up to establish a new
> -				 * mapping.
> -				 */
> -				? nf_nat_setup_info(ct, range, maniptype)
> -				: nf_nat_alloc_null_binding(ct, hooknum);
> -			if (err != NF_ACCEPT)
> -				goto out;
> -		}
> -		break;
> -
> -	case IP_CT_ESTABLISHED:
> -	case IP_CT_ESTABLISHED_REPLY:
> -		break;
> -
> -	default:
> -		err = NF_DROP;
> -		goto out;
> -	}
> -
> -	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> -out:
> -	/* Update the flow key if NAT successful. */
> -	if (err == NF_ACCEPT)
> -		ovs_nat_update_key(key, skb, maniptype);
> -
> -	return err;
> -}
> -
>  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
>  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
>  		      const struct ovs_conntrack_info *info,
>  		      struct sk_buff *skb, struct nf_conn *ct,
>  		      enum ip_conntrack_info ctinfo)
>  {
> -	enum nf_nat_manip_type maniptype;
> -	int err;
> +	int err, action = 0;
>  
>  	if (!(info->nat & OVS_CT_NAT))
>  		return NF_ACCEPT;
> +	if (info->nat & OVS_CT_SRC_NAT)
> +		action |= (1 << NF_NAT_MANIP_SRC);
> +	if (info->nat & OVS_CT_DST_NAT)
> +		action |= (1 << NF_NAT_MANIP_DST);

I'm wondering why this dance at this level with supporting multiple
MANIPs while actually only one can be used at a time.

act_ct will reject an action using both:
        if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
            (p->ct_action & TCA_CT_ACT_NAT_DST)) {
                NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
                return -EOPNOTSUPP;

I couldn't find this kind of check in ovs code right now (didn't look much, I
confess :)), but even the code here was already doing:

-	} else if (info->nat & OVS_CT_SRC_NAT) {
-		maniptype = NF_NAT_MANIP_SRC;
-	} else if (info->nat & OVS_CT_DST_NAT) {
-		maniptype = NF_NAT_MANIP_DST;

And in case of tuple conflict, maniptype will be forcibly updated in
[*] below.

Anyhow, if really needed, it would be nice to use BIT(NF_NAT_MANIP_..)
instead.

>  
> -	/* Add NAT extension if not confirmed yet. */
> -	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> -		return NF_ACCEPT;   /* Can't NAT. */
> +	err = nf_ct_nat(skb, ct, ctinfo, &action, &info->range, info->commit);
>  
> -	/* Determine NAT type.
> -	 * Check if the NAT type can be deduced from the tracked connection.
> -	 * Make sure new expected connections (IP_CT_RELATED) are NATted only
> -	 * when committing.
> -	 */
> -	if (ctinfo != IP_CT_NEW && ct->status & IPS_NAT_MASK &&
> -	    (ctinfo != IP_CT_RELATED || info->commit)) {
> -		/* NAT an established or related connection like before. */
> -		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> -			/* This is the REPLY direction for a connection
> -			 * for which NAT was applied in the forward
> -			 * direction.  Do the reverse NAT.
> -			 */
> -			maniptype = ct->status & IPS_SRC_NAT
> -				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> -		else
> -			maniptype = ct->status & IPS_SRC_NAT
> -				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> -	} else if (info->nat & OVS_CT_SRC_NAT) {
> -		maniptype = NF_NAT_MANIP_SRC;
> -	} else if (info->nat & OVS_CT_DST_NAT) {
> -		maniptype = NF_NAT_MANIP_DST;
> -	} else {
> -		return NF_ACCEPT; /* Connection is not NATed. */
> -	}
> -	err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range, maniptype, key);
> -
> -	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> -		if (ct->status & IPS_SRC_NAT) {
> -			if (maniptype == NF_NAT_MANIP_SRC)
> -				maniptype = NF_NAT_MANIP_DST;
> -			else
> -				maniptype = NF_NAT_MANIP_SRC;

[*]

> -
> -			err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range,
> -						 maniptype, key);
> -		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> -			err = ovs_ct_nat_execute(skb, ct, ctinfo, NULL,
> -						 NF_NAT_MANIP_SRC, key);
> -		}
> -	}
> +	if (action & (1 << NF_NAT_MANIP_SRC))
> +		ovs_nat_update_key(key, skb, NF_NAT_MANIP_SRC);
> +	if (action & (1 << NF_NAT_MANIP_DST))
> +		ovs_nat_update_key(key, skb, NF_NAT_MANIP_DST);
>  
>  	return err;
>  }
> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> index c7782c9a6ab6..0c410220239f 100644
> --- a/net/sched/act_ct.c
> +++ b/net/sched/act_ct.c
> @@ -863,90 +863,6 @@ static void tcf_ct_params_free_rcu(struct rcu_head *head)
>  	tcf_ct_params_free(params);
>  }
>  
> -#if IS_ENABLED(CONFIG_NF_NAT)
> -/* Modelled after nf_nat_ipv[46]_fn().
> - * range is only used for new, uninitialized NAT state.
> - * Returns either NF_ACCEPT or NF_DROP.
> - */
> -static int ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> -			  enum ip_conntrack_info ctinfo,
> -			  const struct nf_nat_range2 *range,
> -			  enum nf_nat_manip_type maniptype)
> -{
> -	__be16 proto = skb_protocol(skb, true);
> -	int hooknum, err = NF_ACCEPT;
> -
> -	/* See HOOK2MANIP(). */
> -	if (maniptype == NF_NAT_MANIP_SRC)
> -		hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> -	else
> -		hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> -
> -	switch (ctinfo) {
> -	case IP_CT_RELATED:
> -	case IP_CT_RELATED_REPLY:
> -		if (proto == htons(ETH_P_IP) &&
> -		    ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> -			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> -							   hooknum))
> -				err = NF_DROP;
> -			goto out;
> -		} else if (IS_ENABLED(CONFIG_IPV6) && proto == htons(ETH_P_IPV6)) {
> -			__be16 frag_off;
> -			u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> -			int hdrlen = ipv6_skip_exthdr(skb,
> -						      sizeof(struct ipv6hdr),
> -						      &nexthdr, &frag_off);
> -
> -			if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> -				if (!nf_nat_icmpv6_reply_translation(skb, ct,
> -								     ctinfo,
> -								     hooknum,
> -								     hdrlen))
> -					err = NF_DROP;
> -				goto out;
> -			}
> -		}
> -		/* Non-ICMP, fall thru to initialize if needed. */
> -		fallthrough;
> -	case IP_CT_NEW:
> -		/* Seen it before?  This can happen for loopback, retrans,
> -		 * or local packets.
> -		 */
> -		if (!nf_nat_initialized(ct, maniptype)) {
> -			/* Initialize according to the NAT action. */
> -			err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> -				/* Action is set up to establish a new
> -				 * mapping.
> -				 */
> -				? nf_nat_setup_info(ct, range, maniptype)
> -				: nf_nat_alloc_null_binding(ct, hooknum);
> -			if (err != NF_ACCEPT)
> -				goto out;
> -		}
> -		break;
> -
> -	case IP_CT_ESTABLISHED:
> -	case IP_CT_ESTABLISHED_REPLY:
> -		break;
> -
> -	default:
> -		err = NF_DROP;
> -		goto out;
> -	}
> -
> -	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> -out:
> -	if (err == NF_ACCEPT) {
> -		if (maniptype == NF_NAT_MANIP_SRC)
> -			tc_skb_cb(skb)->post_ct_snat = 1;
> -		if (maniptype == NF_NAT_MANIP_DST)
> -			tc_skb_cb(skb)->post_ct_dnat = 1;
> -	}
> -	return err;
> -}
> -#endif /* CONFIG_NF_NAT */
> -
>  static void tcf_ct_act_set_mark(struct nf_conn *ct, u32 mark, u32 mask)
>  {
>  #if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
> @@ -986,52 +902,22 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>  			  bool commit)
>  {
>  #if IS_ENABLED(CONFIG_NF_NAT)
> -	int err;
> -	enum nf_nat_manip_type maniptype;
> +	int err, action = 0;
>  
>  	if (!(ct_action & TCA_CT_ACT_NAT))
>  		return NF_ACCEPT;
> +	if (ct_action & TCA_CT_ACT_NAT_SRC)
> +		action |= (1 << NF_NAT_MANIP_SRC);
> +	if (ct_action & TCA_CT_ACT_NAT_DST)
> +		action |= (1 << NF_NAT_MANIP_DST);
>  
> -	/* Add NAT extension if not confirmed yet. */
> -	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> -		return NF_ACCEPT;   /* Can't NAT. */
> -
> -	if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> -	    (ctinfo != IP_CT_RELATED || commit)) {
> -		/* NAT an established or related connection like before. */
> -		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> -			/* This is the REPLY direction for a connection
> -			 * for which NAT was applied in the forward
> -			 * direction.  Do the reverse NAT.
> -			 */
> -			maniptype = ct->status & IPS_SRC_NAT
> -				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> -		else
> -			maniptype = ct->status & IPS_SRC_NAT
> -				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> -	} else if (ct_action & TCA_CT_ACT_NAT_SRC) {
> -		maniptype = NF_NAT_MANIP_SRC;
> -	} else if (ct_action & TCA_CT_ACT_NAT_DST) {
> -		maniptype = NF_NAT_MANIP_DST;
> -	} else {
> -		return NF_ACCEPT;
> -	}
> +	err = nf_ct_nat(skb, ct, ctinfo, &action, range, commit);
> +
> +	if (action & (1 << NF_NAT_MANIP_SRC))
> +		tc_skb_cb(skb)->post_ct_snat = 1;
> +	if (action & (1 << NF_NAT_MANIP_DST))
> +		tc_skb_cb(skb)->post_ct_dnat = 1;
>  
> -	err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> -	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> -		if (ct->status & IPS_SRC_NAT) {
> -			if (maniptype == NF_NAT_MANIP_SRC)
> -				maniptype = NF_NAT_MANIP_DST;
> -			else
> -				maniptype = NF_NAT_MANIP_SRC;
> -
> -			err = ct_nat_execute(skb, ct, ctinfo, range,
> -					     maniptype);
> -		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> -			err = ct_nat_execute(skb, ct, ctinfo, NULL,
> -					     NF_NAT_MANIP_SRC);
> -		}
> -	}
>  	return err;
>  #else
>  	return NF_ACCEPT;
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 15:09   ` Marcelo Ricardo Leitner
@ 2022-11-23 15:13     ` Marcelo Ricardo Leitner
  2022-11-23 17:31       ` Xin Long
  0 siblings, 1 reply; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 15:13 UTC (permalink / raw)
  To: Xin Long
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > +	      enum ip_conntrack_info ctinfo, int *action,
> > +	      const struct nf_nat_range2 *range, bool commit)
> > +{
> > +	enum nf_nat_manip_type maniptype;
> > +	int err, ct_action = *action;
> > +
> > +	*action = 0;
> > +
> > +	/* Add NAT extension if not confirmed yet. */
> > +	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > +		return NF_ACCEPT;   /* Can't NAT. */
> > +
> > +	if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > +	    (ctinfo != IP_CT_RELATED || commit)) {
> > +		/* NAT an established or related connection like before. */
> > +		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > +			/* This is the REPLY direction for a connection
> > +			 * for which NAT was applied in the forward
> > +			 * direction.  Do the reverse NAT.
> > +			 */
> > +			maniptype = ct->status & IPS_SRC_NAT
> > +				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > +		else
> > +			maniptype = ct->status & IPS_SRC_NAT
> > +				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > +	} else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > +		maniptype = NF_NAT_MANIP_SRC;
> > +	} else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > +		maniptype = NF_NAT_MANIP_DST;
> > +	} else {
> > +		return NF_ACCEPT;
> > +	}
> > +
> > +	err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > +	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > +		if (ct->status & IPS_SRC_NAT) {
> > +			if (maniptype == NF_NAT_MANIP_SRC)
> > +				maniptype = NF_NAT_MANIP_DST;
> > +			else
> > +				maniptype = NF_NAT_MANIP_SRC;
> > +
> > +			err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > +						maniptype);
> > +		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > +			err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > +						NF_NAT_MANIP_SRC);
> > +		}
> > +	}
> > +	return err;
> > +}
> > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > index cc643a556ea1..d03c75165663 100644
> > --- a/net/openvswitch/conntrack.c
> > +++ b/net/openvswitch/conntrack.c
> > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> >  	}
> >  }
> >  
> > -/* Modelled after nf_nat_ipv[46]_fn().
> > - * range is only used for new, uninitialized NAT state.
> > - * Returns either NF_ACCEPT or NF_DROP.
> > - */
> > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > -			      enum ip_conntrack_info ctinfo,
> > -			      const struct nf_nat_range2 *range,
> > -			      enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > -{
> > -	int hooknum, err = NF_ACCEPT;
> > -
> > -	/* See HOOK2MANIP(). */
> > -	if (maniptype == NF_NAT_MANIP_SRC)
> > -		hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > -	else
> > -		hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > -
> > -	switch (ctinfo) {
> > -	case IP_CT_RELATED:
> > -	case IP_CT_RELATED_REPLY:
> > -		if (IS_ENABLED(CONFIG_NF_NAT) &&
> > -		    skb->protocol == htons(ETH_P_IP) &&
> > -		    ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > -			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > -							   hooknum))
> > -				err = NF_DROP;
> > -			goto out;
> > -		} else if (IS_ENABLED(CONFIG_IPV6) &&
> > -			   skb->protocol == htons(ETH_P_IPV6)) {
> > -			__be16 frag_off;
> > -			u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > -			int hdrlen = ipv6_skip_exthdr(skb,
> > -						      sizeof(struct ipv6hdr),
> > -						      &nexthdr, &frag_off);
> > -
> > -			if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > -				if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > -								     ctinfo,
> > -								     hooknum,
> > -								     hdrlen))
> > -					err = NF_DROP;
> > -				goto out;
> > -			}
> > -		}
> > -		/* Non-ICMP, fall thru to initialize if needed. */
> > -		fallthrough;
> > -	case IP_CT_NEW:
> > -		/* Seen it before?  This can happen for loopback, retrans,
> > -		 * or local packets.
> > -		 */
> > -		if (!nf_nat_initialized(ct, maniptype)) {
> > -			/* Initialize according to the NAT action. */
> > -			err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > -				/* Action is set up to establish a new
> > -				 * mapping.
> > -				 */
> > -				? nf_nat_setup_info(ct, range, maniptype)
> > -				: nf_nat_alloc_null_binding(ct, hooknum);
> > -			if (err != NF_ACCEPT)
> > -				goto out;
> > -		}
> > -		break;
> > -
> > -	case IP_CT_ESTABLISHED:
> > -	case IP_CT_ESTABLISHED_REPLY:
> > -		break;
> > -
> > -	default:
> > -		err = NF_DROP;
> > -		goto out;
> > -	}
> > -
> > -	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > -out:
> > -	/* Update the flow key if NAT successful. */
> > -	if (err == NF_ACCEPT)
> > -		ovs_nat_update_key(key, skb, maniptype);
> > -
> > -	return err;
> > -}
> > -
> >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> >  		      const struct ovs_conntrack_info *info,
> >  		      struct sk_buff *skb, struct nf_conn *ct,
> >  		      enum ip_conntrack_info ctinfo)
> >  {
> > -	enum nf_nat_manip_type maniptype;
> > -	int err;
> > +	int err, action = 0;
> >  
> >  	if (!(info->nat & OVS_CT_NAT))
> >  		return NF_ACCEPT;
> > +	if (info->nat & OVS_CT_SRC_NAT)
> > +		action |= (1 << NF_NAT_MANIP_SRC);
> > +	if (info->nat & OVS_CT_DST_NAT)
> > +		action |= (1 << NF_NAT_MANIP_DST);
> 
> I'm wondering why this dance at this level with supporting multiple
> MANIPs while actually only one can be used at a time.
> 
> act_ct will reject an action using both:
>         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
>             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
>                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
>                 return -EOPNOTSUPP;
> 
> I couldn't find this kind of check in ovs code right now (didn't look much, I
> confess :)), but even the code here was already doing:
> 
> -	} else if (info->nat & OVS_CT_SRC_NAT) {
> -		maniptype = NF_NAT_MANIP_SRC;
> -	} else if (info->nat & OVS_CT_DST_NAT) {
> -		maniptype = NF_NAT_MANIP_DST;
> 
> And in case of tuple conflict, maniptype will be forcibly updated in
> [*] below.

Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
set both.

> 
> Anyhow, if really needed, it would be nice to use BIT(NF_NAT_MANIP_..)
> instead.

But still this.

> 
> >  
> > -	/* Add NAT extension if not confirmed yet. */
> > -	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > -		return NF_ACCEPT;   /* Can't NAT. */
> > +	err = nf_ct_nat(skb, ct, ctinfo, &action, &info->range, info->commit);
> >  
> > -	/* Determine NAT type.
> > -	 * Check if the NAT type can be deduced from the tracked connection.
> > -	 * Make sure new expected connections (IP_CT_RELATED) are NATted only
> > -	 * when committing.
> > -	 */
> > -	if (ctinfo != IP_CT_NEW && ct->status & IPS_NAT_MASK &&
> > -	    (ctinfo != IP_CT_RELATED || info->commit)) {
> > -		/* NAT an established or related connection like before. */
> > -		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > -			/* This is the REPLY direction for a connection
> > -			 * for which NAT was applied in the forward
> > -			 * direction.  Do the reverse NAT.
> > -			 */
> > -			maniptype = ct->status & IPS_SRC_NAT
> > -				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > -		else
> > -			maniptype = ct->status & IPS_SRC_NAT
> > -				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > -	} else if (info->nat & OVS_CT_SRC_NAT) {
> > -		maniptype = NF_NAT_MANIP_SRC;
> > -	} else if (info->nat & OVS_CT_DST_NAT) {
> > -		maniptype = NF_NAT_MANIP_DST;
> > -	} else {
> > -		return NF_ACCEPT; /* Connection is not NATed. */
> > -	}
> > -	err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range, maniptype, key);
> > -
> > -	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > -		if (ct->status & IPS_SRC_NAT) {
> > -			if (maniptype == NF_NAT_MANIP_SRC)
> > -				maniptype = NF_NAT_MANIP_DST;
> > -			else
> > -				maniptype = NF_NAT_MANIP_SRC;
> 
> [*]
> 
> > -
> > -			err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range,
> > -						 maniptype, key);
> > -		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > -			err = ovs_ct_nat_execute(skb, ct, ctinfo, NULL,
> > -						 NF_NAT_MANIP_SRC, key);
> > -		}
> > -	}
> > +	if (action & (1 << NF_NAT_MANIP_SRC))
> > +		ovs_nat_update_key(key, skb, NF_NAT_MANIP_SRC);
> > +	if (action & (1 << NF_NAT_MANIP_DST))
> > +		ovs_nat_update_key(key, skb, NF_NAT_MANIP_DST);
> >  
> >  	return err;
> >  }
> > diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> > index c7782c9a6ab6..0c410220239f 100644
> > --- a/net/sched/act_ct.c
> > +++ b/net/sched/act_ct.c
> > @@ -863,90 +863,6 @@ static void tcf_ct_params_free_rcu(struct rcu_head *head)
> >  	tcf_ct_params_free(params);
> >  }
> >  
> > -#if IS_ENABLED(CONFIG_NF_NAT)
> > -/* Modelled after nf_nat_ipv[46]_fn().
> > - * range is only used for new, uninitialized NAT state.
> > - * Returns either NF_ACCEPT or NF_DROP.
> > - */
> > -static int ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > -			  enum ip_conntrack_info ctinfo,
> > -			  const struct nf_nat_range2 *range,
> > -			  enum nf_nat_manip_type maniptype)
> > -{
> > -	__be16 proto = skb_protocol(skb, true);
> > -	int hooknum, err = NF_ACCEPT;
> > -
> > -	/* See HOOK2MANIP(). */
> > -	if (maniptype == NF_NAT_MANIP_SRC)
> > -		hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > -	else
> > -		hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > -
> > -	switch (ctinfo) {
> > -	case IP_CT_RELATED:
> > -	case IP_CT_RELATED_REPLY:
> > -		if (proto == htons(ETH_P_IP) &&
> > -		    ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > -			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > -							   hooknum))
> > -				err = NF_DROP;
> > -			goto out;
> > -		} else if (IS_ENABLED(CONFIG_IPV6) && proto == htons(ETH_P_IPV6)) {
> > -			__be16 frag_off;
> > -			u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > -			int hdrlen = ipv6_skip_exthdr(skb,
> > -						      sizeof(struct ipv6hdr),
> > -						      &nexthdr, &frag_off);
> > -
> > -			if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > -				if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > -								     ctinfo,
> > -								     hooknum,
> > -								     hdrlen))
> > -					err = NF_DROP;
> > -				goto out;
> > -			}
> > -		}
> > -		/* Non-ICMP, fall thru to initialize if needed. */
> > -		fallthrough;
> > -	case IP_CT_NEW:
> > -		/* Seen it before?  This can happen for loopback, retrans,
> > -		 * or local packets.
> > -		 */
> > -		if (!nf_nat_initialized(ct, maniptype)) {
> > -			/* Initialize according to the NAT action. */
> > -			err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > -				/* Action is set up to establish a new
> > -				 * mapping.
> > -				 */
> > -				? nf_nat_setup_info(ct, range, maniptype)
> > -				: nf_nat_alloc_null_binding(ct, hooknum);
> > -			if (err != NF_ACCEPT)
> > -				goto out;
> > -		}
> > -		break;
> > -
> > -	case IP_CT_ESTABLISHED:
> > -	case IP_CT_ESTABLISHED_REPLY:
> > -		break;
> > -
> > -	default:
> > -		err = NF_DROP;
> > -		goto out;
> > -	}
> > -
> > -	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > -out:
> > -	if (err == NF_ACCEPT) {
> > -		if (maniptype == NF_NAT_MANIP_SRC)
> > -			tc_skb_cb(skb)->post_ct_snat = 1;
> > -		if (maniptype == NF_NAT_MANIP_DST)
> > -			tc_skb_cb(skb)->post_ct_dnat = 1;
> > -	}
> > -	return err;
> > -}
> > -#endif /* CONFIG_NF_NAT */
> > -
> >  static void tcf_ct_act_set_mark(struct nf_conn *ct, u32 mark, u32 mask)
> >  {
> >  #if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
> > @@ -986,52 +902,22 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
> >  			  bool commit)
> >  {
> >  #if IS_ENABLED(CONFIG_NF_NAT)
> > -	int err;
> > -	enum nf_nat_manip_type maniptype;
> > +	int err, action = 0;
> >  
> >  	if (!(ct_action & TCA_CT_ACT_NAT))
> >  		return NF_ACCEPT;
> > +	if (ct_action & TCA_CT_ACT_NAT_SRC)
> > +		action |= (1 << NF_NAT_MANIP_SRC);
> > +	if (ct_action & TCA_CT_ACT_NAT_DST)
> > +		action |= (1 << NF_NAT_MANIP_DST);
> >  
> > -	/* Add NAT extension if not confirmed yet. */
> > -	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > -		return NF_ACCEPT;   /* Can't NAT. */
> > -
> > -	if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > -	    (ctinfo != IP_CT_RELATED || commit)) {
> > -		/* NAT an established or related connection like before. */
> > -		if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > -			/* This is the REPLY direction for a connection
> > -			 * for which NAT was applied in the forward
> > -			 * direction.  Do the reverse NAT.
> > -			 */
> > -			maniptype = ct->status & IPS_SRC_NAT
> > -				? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > -		else
> > -			maniptype = ct->status & IPS_SRC_NAT
> > -				? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > -	} else if (ct_action & TCA_CT_ACT_NAT_SRC) {
> > -		maniptype = NF_NAT_MANIP_SRC;
> > -	} else if (ct_action & TCA_CT_ACT_NAT_DST) {
> > -		maniptype = NF_NAT_MANIP_DST;
> > -	} else {
> > -		return NF_ACCEPT;
> > -	}
> > +	err = nf_ct_nat(skb, ct, ctinfo, &action, range, commit);
> > +
> > +	if (action & (1 << NF_NAT_MANIP_SRC))
> > +		tc_skb_cb(skb)->post_ct_snat = 1;
> > +	if (action & (1 << NF_NAT_MANIP_DST))
> > +		tc_skb_cb(skb)->post_ct_dnat = 1;
> >  
> > -	err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> > -	if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > -		if (ct->status & IPS_SRC_NAT) {
> > -			if (maniptype == NF_NAT_MANIP_SRC)
> > -				maniptype = NF_NAT_MANIP_DST;
> > -			else
> > -				maniptype = NF_NAT_MANIP_SRC;
> > -
> > -			err = ct_nat_execute(skb, ct, ctinfo, range,
> > -					     maniptype);
> > -		} else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > -			err = ct_nat_execute(skb, ct, ctinfo, NULL,
> > -					     NF_NAT_MANIP_SRC);
> > -		}
> > -	}
> >  	return err;
> >  #else
> >  	return NF_ACCEPT;
> > -- 
> > 2.31.1
> > 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 15:13     ` [ovs-dev] " Marcelo Ricardo Leitner
@ 2022-11-23 17:31       ` Xin Long
  2022-11-23 18:48         ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 23+ messages in thread
From: Xin Long @ 2022-11-23 17:31 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 10:13 AM Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> > On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > > +         enum ip_conntrack_info ctinfo, int *action,
> > > +         const struct nf_nat_range2 *range, bool commit)
> > > +{
> > > +   enum nf_nat_manip_type maniptype;
> > > +   int err, ct_action = *action;
> > > +
> > > +   *action = 0;
> > > +
> > > +   /* Add NAT extension if not confirmed yet. */
> > > +   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > +           return NF_ACCEPT;   /* Can't NAT. */
> > > +
> > > +   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > +       (ctinfo != IP_CT_RELATED || commit)) {
> > > +           /* NAT an established or related connection like before. */
> > > +           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > +                   /* This is the REPLY direction for a connection
> > > +                    * for which NAT was applied in the forward
> > > +                    * direction.  Do the reverse NAT.
> > > +                    */
> > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > +                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > +           else
> > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > +                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > +   } else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > > +           maniptype = NF_NAT_MANIP_SRC;
> > > +   } else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > > +           maniptype = NF_NAT_MANIP_DST;
> > > +   } else {
> > > +           return NF_ACCEPT;
> > > +   }
> > > +
> > > +   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > > +   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > +           if (ct->status & IPS_SRC_NAT) {
> > > +                   if (maniptype == NF_NAT_MANIP_SRC)
> > > +                           maniptype = NF_NAT_MANIP_DST;
> > > +                   else
> > > +                           maniptype = NF_NAT_MANIP_SRC;
> > > +
> > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > > +                                           maniptype);
> > > +           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > > +                                           NF_NAT_MANIP_SRC);
> > > +           }
> > > +   }
> > > +   return err;
> > > +}
> > > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > > index cc643a556ea1..d03c75165663 100644
> > > --- a/net/openvswitch/conntrack.c
> > > +++ b/net/openvswitch/conntrack.c
> > > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> > >     }
> > >  }
> > >
> > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > - * range is only used for new, uninitialized NAT state.
> > > - * Returns either NF_ACCEPT or NF_DROP.
> > > - */
> > > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > -                         enum ip_conntrack_info ctinfo,
> > > -                         const struct nf_nat_range2 *range,
> > > -                         enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > > -{
> > > -   int hooknum, err = NF_ACCEPT;
> > > -
> > > -   /* See HOOK2MANIP(). */
> > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > -   else
> > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > -
> > > -   switch (ctinfo) {
> > > -   case IP_CT_RELATED:
> > > -   case IP_CT_RELATED_REPLY:
> > > -           if (IS_ENABLED(CONFIG_NF_NAT) &&
> > > -               skb->protocol == htons(ETH_P_IP) &&
> > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > -                                                      hooknum))
> > > -                           err = NF_DROP;
> > > -                   goto out;
> > > -           } else if (IS_ENABLED(CONFIG_IPV6) &&
> > > -                      skb->protocol == htons(ETH_P_IPV6)) {
> > > -                   __be16 frag_off;
> > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > -                                                 sizeof(struct ipv6hdr),
> > > -                                                 &nexthdr, &frag_off);
> > > -
> > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > -                                                                ctinfo,
> > > -                                                                hooknum,
> > > -                                                                hdrlen))
> > > -                                   err = NF_DROP;
> > > -                           goto out;
> > > -                   }
> > > -           }
> > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > -           fallthrough;
> > > -   case IP_CT_NEW:
> > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > -            * or local packets.
> > > -            */
> > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > -                   /* Initialize according to the NAT action. */
> > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > -                           /* Action is set up to establish a new
> > > -                            * mapping.
> > > -                            */
> > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > -                   if (err != NF_ACCEPT)
> > > -                           goto out;
> > > -           }
> > > -           break;
> > > -
> > > -   case IP_CT_ESTABLISHED:
> > > -   case IP_CT_ESTABLISHED_REPLY:
> > > -           break;
> > > -
> > > -   default:
> > > -           err = NF_DROP;
> > > -           goto out;
> > > -   }
> > > -
> > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > -out:
> > > -   /* Update the flow key if NAT successful. */
> > > -   if (err == NF_ACCEPT)
> > > -           ovs_nat_update_key(key, skb, maniptype);
> > > -
> > > -   return err;
> > > -}
> > > -
> > >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> > >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> > >                   const struct ovs_conntrack_info *info,
> > >                   struct sk_buff *skb, struct nf_conn *ct,
> > >                   enum ip_conntrack_info ctinfo)
> > >  {
> > > -   enum nf_nat_manip_type maniptype;
> > > -   int err;
> > > +   int err, action = 0;
> > >
> > >     if (!(info->nat & OVS_CT_NAT))
> > >             return NF_ACCEPT;
> > > +   if (info->nat & OVS_CT_SRC_NAT)
> > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > +   if (info->nat & OVS_CT_DST_NAT)
> > > +           action |= (1 << NF_NAT_MANIP_DST);
> >
> > I'm wondering why this dance at this level with supporting multiple
> > MANIPs while actually only one can be used at a time.
> >
> > act_ct will reject an action using both:
> >         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
> >             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
> >                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
> >                 return -EOPNOTSUPP;
> >
> > I couldn't find this kind of check in ovs code right now (didn't look much, I
> > confess :)), but even the code here was already doing:
> >
> > -     } else if (info->nat & OVS_CT_SRC_NAT) {
> > -             maniptype = NF_NAT_MANIP_SRC;
> > -     } else if (info->nat & OVS_CT_DST_NAT) {
> > -             maniptype = NF_NAT_MANIP_DST;
> >
> > And in case of tuple conflict, maniptype will be forcibly updated in
> > [*] below.
>
> Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
> set both.
Right.
BTW. The tuple conflict processing actually has provided the
code to do full nat (snat and dnat at the same time), which I
think is more efficient than doing full nat in two zones. All
we have to do is adjust a few lines of code for that.

>
> >
> > Anyhow, if really needed, it would be nice to use BIT(NF_NAT_MANIP_..)
> > instead.
BIT() looks a good one, will use it.
Thanks.

>
> But still this.
>
> >
> > >
> > > -   /* Add NAT extension if not confirmed yet. */
> > > -   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > -           return NF_ACCEPT;   /* Can't NAT. */
> > > +   err = nf_ct_nat(skb, ct, ctinfo, &action, &info->range, info->commit);
> > >
> > > -   /* Determine NAT type.
> > > -    * Check if the NAT type can be deduced from the tracked connection.
> > > -    * Make sure new expected connections (IP_CT_RELATED) are NATted only
> > > -    * when committing.
> > > -    */
> > > -   if (ctinfo != IP_CT_NEW && ct->status & IPS_NAT_MASK &&
> > > -       (ctinfo != IP_CT_RELATED || info->commit)) {
> > > -           /* NAT an established or related connection like before. */
> > > -           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > -                   /* This is the REPLY direction for a connection
> > > -                    * for which NAT was applied in the forward
> > > -                    * direction.  Do the reverse NAT.
> > > -                    */
> > > -                   maniptype = ct->status & IPS_SRC_NAT
> > > -                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > -           else
> > > -                   maniptype = ct->status & IPS_SRC_NAT
> > > -                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > -   } else if (info->nat & OVS_CT_SRC_NAT) {
> > > -           maniptype = NF_NAT_MANIP_SRC;
> > > -   } else if (info->nat & OVS_CT_DST_NAT) {
> > > -           maniptype = NF_NAT_MANIP_DST;
> > > -   } else {
> > > -           return NF_ACCEPT; /* Connection is not NATed. */
> > > -   }
> > > -   err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range, maniptype, key);
> > > -
> > > -   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > -           if (ct->status & IPS_SRC_NAT) {
> > > -                   if (maniptype == NF_NAT_MANIP_SRC)
> > > -                           maniptype = NF_NAT_MANIP_DST;
> > > -                   else
> > > -                           maniptype = NF_NAT_MANIP_SRC;
> >
> > [*]
> >
> > > -
> > > -                   err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range,
> > > -                                            maniptype, key);
> > > -           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > -                   err = ovs_ct_nat_execute(skb, ct, ctinfo, NULL,
> > > -                                            NF_NAT_MANIP_SRC, key);
> > > -           }
> > > -   }
> > > +   if (action & (1 << NF_NAT_MANIP_SRC))
> > > +           ovs_nat_update_key(key, skb, NF_NAT_MANIP_SRC);
> > > +   if (action & (1 << NF_NAT_MANIP_DST))
> > > +           ovs_nat_update_key(key, skb, NF_NAT_MANIP_DST);
> > >
> > >     return err;
> > >  }
> > > diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> > > index c7782c9a6ab6..0c410220239f 100644
> > > --- a/net/sched/act_ct.c
> > > +++ b/net/sched/act_ct.c
> > > @@ -863,90 +863,6 @@ static void tcf_ct_params_free_rcu(struct rcu_head *head)
> > >     tcf_ct_params_free(params);
> > >  }
> > >
> > > -#if IS_ENABLED(CONFIG_NF_NAT)
> > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > - * range is only used for new, uninitialized NAT state.
> > > - * Returns either NF_ACCEPT or NF_DROP.
> > > - */
> > > -static int ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > -                     enum ip_conntrack_info ctinfo,
> > > -                     const struct nf_nat_range2 *range,
> > > -                     enum nf_nat_manip_type maniptype)
> > > -{
> > > -   __be16 proto = skb_protocol(skb, true);
> > > -   int hooknum, err = NF_ACCEPT;
> > > -
> > > -   /* See HOOK2MANIP(). */
> > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > -   else
> > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > -
> > > -   switch (ctinfo) {
> > > -   case IP_CT_RELATED:
> > > -   case IP_CT_RELATED_REPLY:
> > > -           if (proto == htons(ETH_P_IP) &&
> > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > -                                                      hooknum))
> > > -                           err = NF_DROP;
> > > -                   goto out;
> > > -           } else if (IS_ENABLED(CONFIG_IPV6) && proto == htons(ETH_P_IPV6)) {
> > > -                   __be16 frag_off;
> > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > -                                                 sizeof(struct ipv6hdr),
> > > -                                                 &nexthdr, &frag_off);
> > > -
> > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > -                                                                ctinfo,
> > > -                                                                hooknum,
> > > -                                                                hdrlen))
> > > -                                   err = NF_DROP;
> > > -                           goto out;
> > > -                   }
> > > -           }
> > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > -           fallthrough;
> > > -   case IP_CT_NEW:
> > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > -            * or local packets.
> > > -            */
> > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > -                   /* Initialize according to the NAT action. */
> > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > -                           /* Action is set up to establish a new
> > > -                            * mapping.
> > > -                            */
> > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > -                   if (err != NF_ACCEPT)
> > > -                           goto out;
> > > -           }
> > > -           break;
> > > -
> > > -   case IP_CT_ESTABLISHED:
> > > -   case IP_CT_ESTABLISHED_REPLY:
> > > -           break;
> > > -
> > > -   default:
> > > -           err = NF_DROP;
> > > -           goto out;
> > > -   }
> > > -
> > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > -out:
> > > -   if (err == NF_ACCEPT) {
> > > -           if (maniptype == NF_NAT_MANIP_SRC)
> > > -                   tc_skb_cb(skb)->post_ct_snat = 1;
> > > -           if (maniptype == NF_NAT_MANIP_DST)
> > > -                   tc_skb_cb(skb)->post_ct_dnat = 1;
> > > -   }
> > > -   return err;
> > > -}
> > > -#endif /* CONFIG_NF_NAT */
> > > -
> > >  static void tcf_ct_act_set_mark(struct nf_conn *ct, u32 mark, u32 mask)
> > >  {
> > >  #if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
> > > @@ -986,52 +902,22 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
> > >                       bool commit)
> > >  {
> > >  #if IS_ENABLED(CONFIG_NF_NAT)
> > > -   int err;
> > > -   enum nf_nat_manip_type maniptype;
> > > +   int err, action = 0;
> > >
> > >     if (!(ct_action & TCA_CT_ACT_NAT))
> > >             return NF_ACCEPT;
> > > +   if (ct_action & TCA_CT_ACT_NAT_SRC)
> > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > +   if (ct_action & TCA_CT_ACT_NAT_DST)
> > > +           action |= (1 << NF_NAT_MANIP_DST);
> > >
> > > -   /* Add NAT extension if not confirmed yet. */
> > > -   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > -           return NF_ACCEPT;   /* Can't NAT. */
> > > -
> > > -   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > -       (ctinfo != IP_CT_RELATED || commit)) {
> > > -           /* NAT an established or related connection like before. */
> > > -           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > -                   /* This is the REPLY direction for a connection
> > > -                    * for which NAT was applied in the forward
> > > -                    * direction.  Do the reverse NAT.
> > > -                    */
> > > -                   maniptype = ct->status & IPS_SRC_NAT
> > > -                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > -           else
> > > -                   maniptype = ct->status & IPS_SRC_NAT
> > > -                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > -   } else if (ct_action & TCA_CT_ACT_NAT_SRC) {
> > > -           maniptype = NF_NAT_MANIP_SRC;
> > > -   } else if (ct_action & TCA_CT_ACT_NAT_DST) {
> > > -           maniptype = NF_NAT_MANIP_DST;
> > > -   } else {
> > > -           return NF_ACCEPT;
> > > -   }
> > > +   err = nf_ct_nat(skb, ct, ctinfo, &action, range, commit);
> > > +
> > > +   if (action & (1 << NF_NAT_MANIP_SRC))
> > > +           tc_skb_cb(skb)->post_ct_snat = 1;
> > > +   if (action & (1 << NF_NAT_MANIP_DST))
> > > +           tc_skb_cb(skb)->post_ct_dnat = 1;
> > >
> > > -   err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> > > -   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > -           if (ct->status & IPS_SRC_NAT) {
> > > -                   if (maniptype == NF_NAT_MANIP_SRC)
> > > -                           maniptype = NF_NAT_MANIP_DST;
> > > -                   else
> > > -                           maniptype = NF_NAT_MANIP_SRC;
> > > -
> > > -                   err = ct_nat_execute(skb, ct, ctinfo, range,
> > > -                                        maniptype);
> > > -           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > -                   err = ct_nat_execute(skb, ct, ctinfo, NULL,
> > > -                                        NF_NAT_MANIP_SRC);
> > > -           }
> > > -   }
> > >     return err;
> > >  #else
> > >     return NF_ACCEPT;
> > > --
> > > 2.31.1
> > >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 17:31       ` Xin Long
@ 2022-11-23 18:48         ` Marcelo Ricardo Leitner
  2022-11-23 18:54           ` Xin Long
  0 siblings, 1 reply; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 18:48 UTC (permalink / raw)
  To: Xin Long
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 12:31:38PM -0500, Xin Long wrote:
> On Wed, Nov 23, 2022 at 10:13 AM Marcelo Ricardo Leitner
> <marcelo.leitner@gmail.com> wrote:
> >
> > On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> > > On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > > > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > > > +         enum ip_conntrack_info ctinfo, int *action,
> > > > +         const struct nf_nat_range2 *range, bool commit)
> > > > +{
> > > > +   enum nf_nat_manip_type maniptype;
> > > > +   int err, ct_action = *action;
> > > > +
> > > > +   *action = 0;
> > > > +
> > > > +   /* Add NAT extension if not confirmed yet. */
> > > > +   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > > +           return NF_ACCEPT;   /* Can't NAT. */
> > > > +
> > > > +   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > > +       (ctinfo != IP_CT_RELATED || commit)) {
> > > > +           /* NAT an established or related connection like before. */
> > > > +           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > > +                   /* This is the REPLY direction for a connection
> > > > +                    * for which NAT was applied in the forward
> > > > +                    * direction.  Do the reverse NAT.
> > > > +                    */
> > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > +                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > > +           else
> > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > +                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > > > +           maniptype = NF_NAT_MANIP_SRC;
> > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > > > +           maniptype = NF_NAT_MANIP_DST;
> > > > +   } else {
> > > > +           return NF_ACCEPT;
> > > > +   }
> > > > +
> > > > +   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > > > +   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > > +           if (ct->status & IPS_SRC_NAT) {
> > > > +                   if (maniptype == NF_NAT_MANIP_SRC)
> > > > +                           maniptype = NF_NAT_MANIP_DST;
> > > > +                   else
> > > > +                           maniptype = NF_NAT_MANIP_SRC;
> > > > +
> > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > > > +                                           maniptype);
> > > > +           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > > > +                                           NF_NAT_MANIP_SRC);
> > > > +           }
> > > > +   }
> > > > +   return err;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > > > index cc643a556ea1..d03c75165663 100644
> > > > --- a/net/openvswitch/conntrack.c
> > > > +++ b/net/openvswitch/conntrack.c
> > > > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> > > >     }
> > > >  }
> > > >
> > > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > > - * range is only used for new, uninitialized NAT state.
> > > > - * Returns either NF_ACCEPT or NF_DROP.
> > > > - */
> > > > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > > -                         enum ip_conntrack_info ctinfo,
> > > > -                         const struct nf_nat_range2 *range,
> > > > -                         enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > > > -{
> > > > -   int hooknum, err = NF_ACCEPT;
> > > > -
> > > > -   /* See HOOK2MANIP(). */
> > > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > > -   else
> > > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > > -
> > > > -   switch (ctinfo) {
> > > > -   case IP_CT_RELATED:
> > > > -   case IP_CT_RELATED_REPLY:
> > > > -           if (IS_ENABLED(CONFIG_NF_NAT) &&
> > > > -               skb->protocol == htons(ETH_P_IP) &&
> > > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > > -                                                      hooknum))
> > > > -                           err = NF_DROP;
> > > > -                   goto out;
> > > > -           } else if (IS_ENABLED(CONFIG_IPV6) &&
> > > > -                      skb->protocol == htons(ETH_P_IPV6)) {
> > > > -                   __be16 frag_off;
> > > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > > -                                                 sizeof(struct ipv6hdr),
> > > > -                                                 &nexthdr, &frag_off);
> > > > -
> > > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > > -                                                                ctinfo,
> > > > -                                                                hooknum,
> > > > -                                                                hdrlen))
> > > > -                                   err = NF_DROP;
> > > > -                           goto out;
> > > > -                   }
> > > > -           }
> > > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > > -           fallthrough;
> > > > -   case IP_CT_NEW:
> > > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > > -            * or local packets.
> > > > -            */
> > > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > > -                   /* Initialize according to the NAT action. */
> > > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > > -                           /* Action is set up to establish a new
> > > > -                            * mapping.
> > > > -                            */
> > > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > > -                   if (err != NF_ACCEPT)
> > > > -                           goto out;
> > > > -           }
> > > > -           break;
> > > > -
> > > > -   case IP_CT_ESTABLISHED:
> > > > -   case IP_CT_ESTABLISHED_REPLY:
> > > > -           break;
> > > > -
> > > > -   default:
> > > > -           err = NF_DROP;
> > > > -           goto out;
> > > > -   }
> > > > -
> > > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > > -out:
> > > > -   /* Update the flow key if NAT successful. */
> > > > -   if (err == NF_ACCEPT)
> > > > -           ovs_nat_update_key(key, skb, maniptype);
> > > > -
> > > > -   return err;
> > > > -}
> > > > -
> > > >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> > > >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> > > >                   const struct ovs_conntrack_info *info,
> > > >                   struct sk_buff *skb, struct nf_conn *ct,
> > > >                   enum ip_conntrack_info ctinfo)
> > > >  {
> > > > -   enum nf_nat_manip_type maniptype;
> > > > -   int err;
> > > > +   int err, action = 0;
> > > >
> > > >     if (!(info->nat & OVS_CT_NAT))
> > > >             return NF_ACCEPT;
> > > > +   if (info->nat & OVS_CT_SRC_NAT)
> > > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > > +   if (info->nat & OVS_CT_DST_NAT)
> > > > +           action |= (1 << NF_NAT_MANIP_DST);
> > >
> > > I'm wondering why this dance at this level with supporting multiple
> > > MANIPs while actually only one can be used at a time.
> > >
> > > act_ct will reject an action using both:
> > >         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
> > >             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
> > >                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
> > >                 return -EOPNOTSUPP;
> > >
> > > I couldn't find this kind of check in ovs code right now (didn't look much, I
> > > confess :)), but even the code here was already doing:
> > >
> > > -     } else if (info->nat & OVS_CT_SRC_NAT) {
> > > -             maniptype = NF_NAT_MANIP_SRC;
> > > -     } else if (info->nat & OVS_CT_DST_NAT) {
> > > -             maniptype = NF_NAT_MANIP_DST;
> > >
> > > And in case of tuple conflict, maniptype will be forcibly updated in
> > > [*] below.
> >
> > Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
> > set both.
> Right.
> BTW. The tuple conflict processing actually has provided the
> code to do full nat (snat and dnat at the same time), which I
> think is more efficient than doing full nat in two zones. All
> we have to do is adjust a few lines of code for that.

In this part, yes. But it needs some surgery all around for full
support. The code in ovs kernel for using only one type starts here:

static int parse_nat(const struct nlattr *attr,
                     struct ovs_conntrack_info *info, bool log)
{
...
                switch (type) {
                case OVS_NAT_ATTR_SRC:
                case OVS_NAT_ATTR_DST:
                        if (info->nat) {
                                OVS_NLERR(log, "Only one type of NAT may be specified");
                                return -ERANGE;
                        }
...
}

So vswitchd also doesn't support it. And then tc, iproute and drivers
that offload it as well. Not sure if it has impacts on openflow too.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-22 17:32 ` [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc Xin Long
  2022-11-23 15:09   ` Marcelo Ricardo Leitner
@ 2022-11-23 18:52   ` Marcelo Ricardo Leitner
  2022-12-01 16:26     ` Xin Long
  1 sibling, 1 reply; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 18:52 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, dev, ovs-dev, davem, kuba, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Pablo Neira Ayuso, Florian Westphal, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> --- a/net/netfilter/Makefile
> +++ b/net/netfilter/Makefile
> @@ -52,7 +52,7 @@ obj-$(CONFIG_NF_CONNTRACK_SANE) += nf_conntrack_sane.o
>  obj-$(CONFIG_NF_CONNTRACK_SIP) += nf_conntrack_sip.o
>  obj-$(CONFIG_NF_CONNTRACK_TFTP) += nf_conntrack_tftp.o
>  
> -nf_nat-y	:= nf_nat_core.o nf_nat_proto.o nf_nat_helper.o
> +nf_nat-y	:= nf_nat_core.o nf_nat_proto.o nf_nat_helper.o nf_nat_ovs.o

Considering that the code in nf_nat_ovs is only used if ovs or act_ct
are enabled, shouldn't it be using an invisible knob here that gets
automatically selected by them? Pablo?

I think this is my last comment on this series. The rest LGTM.

>  
>  obj-$(CONFIG_NF_LOG_SYSLOG) += nf_log_syslog.o
>  

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 18:48         ` Marcelo Ricardo Leitner
@ 2022-11-23 18:54           ` Xin Long
  2022-11-23 19:17             ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 23+ messages in thread
From: Xin Long @ 2022-11-23 18:54 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 1:48 PM Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> On Wed, Nov 23, 2022 at 12:31:38PM -0500, Xin Long wrote:
> > On Wed, Nov 23, 2022 at 10:13 AM Marcelo Ricardo Leitner
> > <marcelo.leitner@gmail.com> wrote:
> > >
> > > On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> > > > On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > > > > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > > > > +         enum ip_conntrack_info ctinfo, int *action,
> > > > > +         const struct nf_nat_range2 *range, bool commit)
> > > > > +{
> > > > > +   enum nf_nat_manip_type maniptype;
> > > > > +   int err, ct_action = *action;
> > > > > +
> > > > > +   *action = 0;
> > > > > +
> > > > > +   /* Add NAT extension if not confirmed yet. */
> > > > > +   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > > > +           return NF_ACCEPT;   /* Can't NAT. */
> > > > > +
> > > > > +   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > > > +       (ctinfo != IP_CT_RELATED || commit)) {
> > > > > +           /* NAT an established or related connection like before. */
> > > > > +           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > > > +                   /* This is the REPLY direction for a connection
> > > > > +                    * for which NAT was applied in the forward
> > > > > +                    * direction.  Do the reverse NAT.
> > > > > +                    */
> > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > +                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > > > +           else
> > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > +                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > > > > +           maniptype = NF_NAT_MANIP_SRC;
> > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > > > > +           maniptype = NF_NAT_MANIP_DST;
> > > > > +   } else {
> > > > > +           return NF_ACCEPT;
> > > > > +   }
> > > > > +
> > > > > +   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > > > > +   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > > > +           if (ct->status & IPS_SRC_NAT) {
> > > > > +                   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > +                           maniptype = NF_NAT_MANIP_DST;
> > > > > +                   else
> > > > > +                           maniptype = NF_NAT_MANIP_SRC;
> > > > > +
> > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > > > > +                                           maniptype);
> > > > > +           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > > > > +                                           NF_NAT_MANIP_SRC);
> > > > > +           }
> > > > > +   }
> > > > > +   return err;
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > > > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > > > > index cc643a556ea1..d03c75165663 100644
> > > > > --- a/net/openvswitch/conntrack.c
> > > > > +++ b/net/openvswitch/conntrack.c
> > > > > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> > > > >     }
> > > > >  }
> > > > >
> > > > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > > > - * range is only used for new, uninitialized NAT state.
> > > > > - * Returns either NF_ACCEPT or NF_DROP.
> > > > > - */
> > > > > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > > > -                         enum ip_conntrack_info ctinfo,
> > > > > -                         const struct nf_nat_range2 *range,
> > > > > -                         enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > > > > -{
> > > > > -   int hooknum, err = NF_ACCEPT;
> > > > > -
> > > > > -   /* See HOOK2MANIP(). */
> > > > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > > > -   else
> > > > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > > > -
> > > > > -   switch (ctinfo) {
> > > > > -   case IP_CT_RELATED:
> > > > > -   case IP_CT_RELATED_REPLY:
> > > > > -           if (IS_ENABLED(CONFIG_NF_NAT) &&
> > > > > -               skb->protocol == htons(ETH_P_IP) &&
> > > > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > > > -                                                      hooknum))
> > > > > -                           err = NF_DROP;
> > > > > -                   goto out;
> > > > > -           } else if (IS_ENABLED(CONFIG_IPV6) &&
> > > > > -                      skb->protocol == htons(ETH_P_IPV6)) {
> > > > > -                   __be16 frag_off;
> > > > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > > > -                                                 sizeof(struct ipv6hdr),
> > > > > -                                                 &nexthdr, &frag_off);
> > > > > -
> > > > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > > > -                                                                ctinfo,
> > > > > -                                                                hooknum,
> > > > > -                                                                hdrlen))
> > > > > -                                   err = NF_DROP;
> > > > > -                           goto out;
> > > > > -                   }
> > > > > -           }
> > > > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > > > -           fallthrough;
> > > > > -   case IP_CT_NEW:
> > > > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > > > -            * or local packets.
> > > > > -            */
> > > > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > > > -                   /* Initialize according to the NAT action. */
> > > > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > > > -                           /* Action is set up to establish a new
> > > > > -                            * mapping.
> > > > > -                            */
> > > > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > > > -                   if (err != NF_ACCEPT)
> > > > > -                           goto out;
> > > > > -           }
> > > > > -           break;
> > > > > -
> > > > > -   case IP_CT_ESTABLISHED:
> > > > > -   case IP_CT_ESTABLISHED_REPLY:
> > > > > -           break;
> > > > > -
> > > > > -   default:
> > > > > -           err = NF_DROP;
> > > > > -           goto out;
> > > > > -   }
> > > > > -
> > > > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > > > -out:
> > > > > -   /* Update the flow key if NAT successful. */
> > > > > -   if (err == NF_ACCEPT)
> > > > > -           ovs_nat_update_key(key, skb, maniptype);
> > > > > -
> > > > > -   return err;
> > > > > -}
> > > > > -
> > > > >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> > > > >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> > > > >                   const struct ovs_conntrack_info *info,
> > > > >                   struct sk_buff *skb, struct nf_conn *ct,
> > > > >                   enum ip_conntrack_info ctinfo)
> > > > >  {
> > > > > -   enum nf_nat_manip_type maniptype;
> > > > > -   int err;
> > > > > +   int err, action = 0;
> > > > >
> > > > >     if (!(info->nat & OVS_CT_NAT))
> > > > >             return NF_ACCEPT;
> > > > > +   if (info->nat & OVS_CT_SRC_NAT)
> > > > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > > > +   if (info->nat & OVS_CT_DST_NAT)
> > > > > +           action |= (1 << NF_NAT_MANIP_DST);
> > > >
> > > > I'm wondering why this dance at this level with supporting multiple
> > > > MANIPs while actually only one can be used at a time.
> > > >
> > > > act_ct will reject an action using both:
> > > >         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
> > > >             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
> > > >                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
> > > >                 return -EOPNOTSUPP;
> > > >
> > > > I couldn't find this kind of check in ovs code right now (didn't look much, I
> > > > confess :)), but even the code here was already doing:
> > > >
> > > > -     } else if (info->nat & OVS_CT_SRC_NAT) {
> > > > -             maniptype = NF_NAT_MANIP_SRC;
> > > > -     } else if (info->nat & OVS_CT_DST_NAT) {
> > > > -             maniptype = NF_NAT_MANIP_DST;
> > > >
> > > > And in case of tuple conflict, maniptype will be forcibly updated in
> > > > [*] below.
> > >
> > > Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
> > > set both.
> > Right.
> > BTW. The tuple conflict processing actually has provided the
> > code to do full nat (snat and dnat at the same time), which I
> > think is more efficient than doing full nat in two zones. All
> > we have to do is adjust a few lines of code for that.
>
> In this part, yes. But it needs some surgery all around for full
> support. The code in ovs kernel for using only one type starts here:
>
> static int parse_nat(const struct nlattr *attr,
>                      struct ovs_conntrack_info *info, bool log)
> {
> ...
>                 switch (type) {
>                 case OVS_NAT_ATTR_SRC:
>                 case OVS_NAT_ATTR_DST:
>                         if (info->nat) {
>                                 OVS_NLERR(log, "Only one type of NAT may be specified");
>                                 return -ERANGE;
>                         }
> ...
> }
>
> So vswitchd also doesn't support it. And then tc, iproute and drivers
> that offload it as well. Not sure if it has impacts on openflow too.
>
not in one single action, but two actions:

"table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1),
alg=ftp),veth2"

as long as it allows the 1st one doesn't commit, which is a simple
check in parse_nat().
I tested it, TC already supports it. I'm not sure about drivers, but I
think it should be, as with two actions.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 18:54           ` Xin Long
@ 2022-11-23 19:17             ` Marcelo Ricardo Leitner
  2022-11-23 19:55               ` Xin Long
  0 siblings, 1 reply; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 19:17 UTC (permalink / raw)
  To: Xin Long
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 01:54:41PM -0500, Xin Long wrote:
> On Wed, Nov 23, 2022 at 1:48 PM Marcelo Ricardo Leitner
> <marcelo.leitner@gmail.com> wrote:
> >
> > On Wed, Nov 23, 2022 at 12:31:38PM -0500, Xin Long wrote:
> > > On Wed, Nov 23, 2022 at 10:13 AM Marcelo Ricardo Leitner
> > > <marcelo.leitner@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> > > > > On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > > > > > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > +         enum ip_conntrack_info ctinfo, int *action,
> > > > > > +         const struct nf_nat_range2 *range, bool commit)
> > > > > > +{
> > > > > > +   enum nf_nat_manip_type maniptype;
> > > > > > +   int err, ct_action = *action;
> > > > > > +
> > > > > > +   *action = 0;
> > > > > > +
> > > > > > +   /* Add NAT extension if not confirmed yet. */
> > > > > > +   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > > > > +           return NF_ACCEPT;   /* Can't NAT. */
> > > > > > +
> > > > > > +   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > > > > +       (ctinfo != IP_CT_RELATED || commit)) {
> > > > > > +           /* NAT an established or related connection like before. */
> > > > > > +           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > > > > +                   /* This is the REPLY direction for a connection
> > > > > > +                    * for which NAT was applied in the forward
> > > > > > +                    * direction.  Do the reverse NAT.
> > > > > > +                    */
> > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > +                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > > > > +           else
> > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > +                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > > > > > +           maniptype = NF_NAT_MANIP_SRC;
> > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > > > > > +           maniptype = NF_NAT_MANIP_DST;
> > > > > > +   } else {
> > > > > > +           return NF_ACCEPT;
> > > > > > +   }
> > > > > > +
> > > > > > +   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > > > > > +   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > > > > +           if (ct->status & IPS_SRC_NAT) {
> > > > > > +                   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > +                           maniptype = NF_NAT_MANIP_DST;
> > > > > > +                   else
> > > > > > +                           maniptype = NF_NAT_MANIP_SRC;
> > > > > > +
> > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > > > > > +                                           maniptype);
> > > > > > +           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > > > > > +                                           NF_NAT_MANIP_SRC);
> > > > > > +           }
> > > > > > +   }
> > > > > > +   return err;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > > > > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > > > > > index cc643a556ea1..d03c75165663 100644
> > > > > > --- a/net/openvswitch/conntrack.c
> > > > > > +++ b/net/openvswitch/conntrack.c
> > > > > > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> > > > > >     }
> > > > > >  }
> > > > > >
> > > > > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > > > > - * range is only used for new, uninitialized NAT state.
> > > > > > - * Returns either NF_ACCEPT or NF_DROP.
> > > > > > - */
> > > > > > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > -                         enum ip_conntrack_info ctinfo,
> > > > > > -                         const struct nf_nat_range2 *range,
> > > > > > -                         enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > > > > > -{
> > > > > > -   int hooknum, err = NF_ACCEPT;
> > > > > > -
> > > > > > -   /* See HOOK2MANIP(). */
> > > > > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > > > > -   else
> > > > > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > > > > -
> > > > > > -   switch (ctinfo) {
> > > > > > -   case IP_CT_RELATED:
> > > > > > -   case IP_CT_RELATED_REPLY:
> > > > > > -           if (IS_ENABLED(CONFIG_NF_NAT) &&
> > > > > > -               skb->protocol == htons(ETH_P_IP) &&
> > > > > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > > > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > > > > -                                                      hooknum))
> > > > > > -                           err = NF_DROP;
> > > > > > -                   goto out;
> > > > > > -           } else if (IS_ENABLED(CONFIG_IPV6) &&
> > > > > > -                      skb->protocol == htons(ETH_P_IPV6)) {
> > > > > > -                   __be16 frag_off;
> > > > > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > > > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > > > > -                                                 sizeof(struct ipv6hdr),
> > > > > > -                                                 &nexthdr, &frag_off);
> > > > > > -
> > > > > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > > > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > > > > -                                                                ctinfo,
> > > > > > -                                                                hooknum,
> > > > > > -                                                                hdrlen))
> > > > > > -                                   err = NF_DROP;
> > > > > > -                           goto out;
> > > > > > -                   }
> > > > > > -           }
> > > > > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > > > > -           fallthrough;
> > > > > > -   case IP_CT_NEW:
> > > > > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > > > > -            * or local packets.
> > > > > > -            */
> > > > > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > > > > -                   /* Initialize according to the NAT action. */
> > > > > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > > > > -                           /* Action is set up to establish a new
> > > > > > -                            * mapping.
> > > > > > -                            */
> > > > > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > > > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > > > > -                   if (err != NF_ACCEPT)
> > > > > > -                           goto out;
> > > > > > -           }
> > > > > > -           break;
> > > > > > -
> > > > > > -   case IP_CT_ESTABLISHED:
> > > > > > -   case IP_CT_ESTABLISHED_REPLY:
> > > > > > -           break;
> > > > > > -
> > > > > > -   default:
> > > > > > -           err = NF_DROP;
> > > > > > -           goto out;
> > > > > > -   }
> > > > > > -
> > > > > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > > > > -out:
> > > > > > -   /* Update the flow key if NAT successful. */
> > > > > > -   if (err == NF_ACCEPT)
> > > > > > -           ovs_nat_update_key(key, skb, maniptype);
> > > > > > -
> > > > > > -   return err;
> > > > > > -}
> > > > > > -
> > > > > >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> > > > > >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> > > > > >                   const struct ovs_conntrack_info *info,
> > > > > >                   struct sk_buff *skb, struct nf_conn *ct,
> > > > > >                   enum ip_conntrack_info ctinfo)
> > > > > >  {
> > > > > > -   enum nf_nat_manip_type maniptype;
> > > > > > -   int err;
> > > > > > +   int err, action = 0;
> > > > > >
> > > > > >     if (!(info->nat & OVS_CT_NAT))
> > > > > >             return NF_ACCEPT;
> > > > > > +   if (info->nat & OVS_CT_SRC_NAT)
> > > > > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > > > > +   if (info->nat & OVS_CT_DST_NAT)
> > > > > > +           action |= (1 << NF_NAT_MANIP_DST);
> > > > >
> > > > > I'm wondering why this dance at this level with supporting multiple
> > > > > MANIPs while actually only one can be used at a time.
> > > > >
> > > > > act_ct will reject an action using both:
> > > > >         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
> > > > >             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
> > > > >                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
> > > > >                 return -EOPNOTSUPP;
> > > > >
> > > > > I couldn't find this kind of check in ovs code right now (didn't look much, I
> > > > > confess :)), but even the code here was already doing:
> > > > >
> > > > > -     } else if (info->nat & OVS_CT_SRC_NAT) {
> > > > > -             maniptype = NF_NAT_MANIP_SRC;
> > > > > -     } else if (info->nat & OVS_CT_DST_NAT) {
> > > > > -             maniptype = NF_NAT_MANIP_DST;
> > > > >
> > > > > And in case of tuple conflict, maniptype will be forcibly updated in
> > > > > [*] below.
> > > >
> > > > Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
> > > > set both.
> > > Right.
> > > BTW. The tuple conflict processing actually has provided the
> > > code to do full nat (snat and dnat at the same time), which I
> > > think is more efficient than doing full nat in two zones. All
> > > we have to do is adjust a few lines of code for that.
> >
> > In this part, yes. But it needs some surgery all around for full
> > support. The code in ovs kernel for using only one type starts here:
> >
> > static int parse_nat(const struct nlattr *attr,
> >                      struct ovs_conntrack_info *info, bool log)
> > {
> > ...
> >                 switch (type) {
> >                 case OVS_NAT_ATTR_SRC:
> >                 case OVS_NAT_ATTR_DST:
> >                         if (info->nat) {
> >                                 OVS_NLERR(log, "Only one type of NAT may be specified");
> >                                 return -ERANGE;
> >                         }
> > ...
> > }
> >
> > So vswitchd also doesn't support it. And then tc, iproute and drivers
> > that offload it as well. Not sure if it has impacts on openflow too.
> >
> not in one single action, but two actions:

Oh.

> 
> "table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
> actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1),
> alg=ftp),veth2"
> 
> as long as it allows the 1st one doesn't commit, which is a simple
> check in parse_nat().
> I tested it, TC already supports it. I'm not sure about drivers, but I

There's an outstanding issue with act_ct that it may reuse an old
CT cache. Fixing it could (I'm not sure) impact this use case:

https://bugzilla.redhat.com/show_bug.cgi?id=2099220
same issue in ovs was fixed in
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2061ecfdf2350994e5b61c43e50e98a7a70e95ee

(please don't ask me who would NAT and then overwrite IP addresses and
then NAT it again :D)

> think it should be, as with two actions.

mlx5 at least currently doesn't support it. Good that tc already
supports it, then. Maybe drivers can get up to speed later on.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 19:17             ` Marcelo Ricardo Leitner
@ 2022-11-23 19:55               ` Xin Long
  2022-11-23 21:21                 ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 23+ messages in thread
From: Xin Long @ 2022-11-23 19:55 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 2:17 PM Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> On Wed, Nov 23, 2022 at 01:54:41PM -0500, Xin Long wrote:
> > On Wed, Nov 23, 2022 at 1:48 PM Marcelo Ricardo Leitner
> > <marcelo.leitner@gmail.com> wrote:
> > >
> > > On Wed, Nov 23, 2022 at 12:31:38PM -0500, Xin Long wrote:
> > > > On Wed, Nov 23, 2022 at 10:13 AM Marcelo Ricardo Leitner
> > > > <marcelo.leitner@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> > > > > > On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > > > > > > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > +         enum ip_conntrack_info ctinfo, int *action,
> > > > > > > +         const struct nf_nat_range2 *range, bool commit)
> > > > > > > +{
> > > > > > > +   enum nf_nat_manip_type maniptype;
> > > > > > > +   int err, ct_action = *action;
> > > > > > > +
> > > > > > > +   *action = 0;
> > > > > > > +
> > > > > > > +   /* Add NAT extension if not confirmed yet. */
> > > > > > > +   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > > > > > +           return NF_ACCEPT;   /* Can't NAT. */
> > > > > > > +
> > > > > > > +   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > > > > > +       (ctinfo != IP_CT_RELATED || commit)) {
> > > > > > > +           /* NAT an established or related connection like before. */
> > > > > > > +           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > > > > > +                   /* This is the REPLY direction for a connection
> > > > > > > +                    * for which NAT was applied in the forward
> > > > > > > +                    * direction.  Do the reverse NAT.
> > > > > > > +                    */
> > > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > > +                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > > > > > +           else
> > > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > > +                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > > > > > > +           maniptype = NF_NAT_MANIP_SRC;
> > > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > > > > > > +           maniptype = NF_NAT_MANIP_DST;
> > > > > > > +   } else {
> > > > > > > +           return NF_ACCEPT;
> > > > > > > +   }
> > > > > > > +
> > > > > > > +   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > > > > > > +   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > > > > > +           if (ct->status & IPS_SRC_NAT) {
> > > > > > > +                   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > > +                           maniptype = NF_NAT_MANIP_DST;
> > > > > > > +                   else
> > > > > > > +                           maniptype = NF_NAT_MANIP_SRC;
> > > > > > > +
> > > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > > > > > > +                                           maniptype);
> > > > > > > +           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > > > > > > +                                           NF_NAT_MANIP_SRC);
> > > > > > > +           }
> > > > > > > +   }
> > > > > > > +   return err;
> > > > > > > +}
> > > > > > > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > > > > > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > > > > > > index cc643a556ea1..d03c75165663 100644
> > > > > > > --- a/net/openvswitch/conntrack.c
> > > > > > > +++ b/net/openvswitch/conntrack.c
> > > > > > > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> > > > > > >     }
> > > > > > >  }
> > > > > > >
> > > > > > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > > > > > - * range is only used for new, uninitialized NAT state.
> > > > > > > - * Returns either NF_ACCEPT or NF_DROP.
> > > > > > > - */
> > > > > > > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > -                         enum ip_conntrack_info ctinfo,
> > > > > > > -                         const struct nf_nat_range2 *range,
> > > > > > > -                         enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > > > > > > -{
> > > > > > > -   int hooknum, err = NF_ACCEPT;
> > > > > > > -
> > > > > > > -   /* See HOOK2MANIP(). */
> > > > > > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > > > > > -   else
> > > > > > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > > > > > -
> > > > > > > -   switch (ctinfo) {
> > > > > > > -   case IP_CT_RELATED:
> > > > > > > -   case IP_CT_RELATED_REPLY:
> > > > > > > -           if (IS_ENABLED(CONFIG_NF_NAT) &&
> > > > > > > -               skb->protocol == htons(ETH_P_IP) &&
> > > > > > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > > > > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > > > > > -                                                      hooknum))
> > > > > > > -                           err = NF_DROP;
> > > > > > > -                   goto out;
> > > > > > > -           } else if (IS_ENABLED(CONFIG_IPV6) &&
> > > > > > > -                      skb->protocol == htons(ETH_P_IPV6)) {
> > > > > > > -                   __be16 frag_off;
> > > > > > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > > > > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > > > > > -                                                 sizeof(struct ipv6hdr),
> > > > > > > -                                                 &nexthdr, &frag_off);
> > > > > > > -
> > > > > > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > > > > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > > > > > -                                                                ctinfo,
> > > > > > > -                                                                hooknum,
> > > > > > > -                                                                hdrlen))
> > > > > > > -                                   err = NF_DROP;
> > > > > > > -                           goto out;
> > > > > > > -                   }
> > > > > > > -           }
> > > > > > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > > > > > -           fallthrough;
> > > > > > > -   case IP_CT_NEW:
> > > > > > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > > > > > -            * or local packets.
> > > > > > > -            */
> > > > > > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > > > > > -                   /* Initialize according to the NAT action. */
> > > > > > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > > > > > -                           /* Action is set up to establish a new
> > > > > > > -                            * mapping.
> > > > > > > -                            */
> > > > > > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > > > > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > > > > > -                   if (err != NF_ACCEPT)
> > > > > > > -                           goto out;
> > > > > > > -           }
> > > > > > > -           break;
> > > > > > > -
> > > > > > > -   case IP_CT_ESTABLISHED:
> > > > > > > -   case IP_CT_ESTABLISHED_REPLY:
> > > > > > > -           break;
> > > > > > > -
> > > > > > > -   default:
> > > > > > > -           err = NF_DROP;
> > > > > > > -           goto out;
> > > > > > > -   }
> > > > > > > -
> > > > > > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > > > > > -out:
> > > > > > > -   /* Update the flow key if NAT successful. */
> > > > > > > -   if (err == NF_ACCEPT)
> > > > > > > -           ovs_nat_update_key(key, skb, maniptype);
> > > > > > > -
> > > > > > > -   return err;
> > > > > > > -}
> > > > > > > -
> > > > > > >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> > > > > > >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> > > > > > >                   const struct ovs_conntrack_info *info,
> > > > > > >                   struct sk_buff *skb, struct nf_conn *ct,
> > > > > > >                   enum ip_conntrack_info ctinfo)
> > > > > > >  {
> > > > > > > -   enum nf_nat_manip_type maniptype;
> > > > > > > -   int err;
> > > > > > > +   int err, action = 0;
> > > > > > >
> > > > > > >     if (!(info->nat & OVS_CT_NAT))
> > > > > > >             return NF_ACCEPT;
> > > > > > > +   if (info->nat & OVS_CT_SRC_NAT)
> > > > > > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > > > > > +   if (info->nat & OVS_CT_DST_NAT)
> > > > > > > +           action |= (1 << NF_NAT_MANIP_DST);
> > > > > >
> > > > > > I'm wondering why this dance at this level with supporting multiple
> > > > > > MANIPs while actually only one can be used at a time.
> > > > > >
> > > > > > act_ct will reject an action using both:
> > > > > >         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
> > > > > >             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
> > > > > >                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
> > > > > >                 return -EOPNOTSUPP;
> > > > > >
> > > > > > I couldn't find this kind of check in ovs code right now (didn't look much, I
> > > > > > confess :)), but even the code here was already doing:
> > > > > >
> > > > > > -     } else if (info->nat & OVS_CT_SRC_NAT) {
> > > > > > -             maniptype = NF_NAT_MANIP_SRC;
> > > > > > -     } else if (info->nat & OVS_CT_DST_NAT) {
> > > > > > -             maniptype = NF_NAT_MANIP_DST;
> > > > > >
> > > > > > And in case of tuple conflict, maniptype will be forcibly updated in
> > > > > > [*] below.
> > > > >
> > > > > Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
> > > > > set both.
> > > > Right.
> > > > BTW. The tuple conflict processing actually has provided the
> > > > code to do full nat (snat and dnat at the same time), which I
> > > > think is more efficient than doing full nat in two zones. All
> > > > we have to do is adjust a few lines of code for that.
> > >
> > > In this part, yes. But it needs some surgery all around for full
> > > support. The code in ovs kernel for using only one type starts here:
> > >
> > > static int parse_nat(const struct nlattr *attr,
> > >                      struct ovs_conntrack_info *info, bool log)
> > > {
> > > ...
> > >                 switch (type) {
> > >                 case OVS_NAT_ATTR_SRC:
> > >                 case OVS_NAT_ATTR_DST:
> > >                         if (info->nat) {
> > >                                 OVS_NLERR(log, "Only one type of NAT may be specified");
> > >                                 return -ERANGE;
> > >                         }
> > > ...
> > > }
> > >
> > > So vswitchd also doesn't support it. And then tc, iproute and drivers
> > > that offload it as well. Not sure if it has impacts on openflow too.
> > >
> > not in one single action, but two actions:
>
> Oh.
>
> >
> > "table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
> > actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1),
> > alg=ftp),veth2"
> >
> > as long as it allows the 1st one doesn't commit, which is a simple
> > check in parse_nat().
> > I tested it, TC already supports it. I'm not sure about drivers, but I
>
> There's an outstanding issue with act_ct that it may reuse an old
> CT cache. Fixing it could (I'm not sure) impact this use case:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2099220
> same issue in ovs was fixed in
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2061ecfdf2350994e5b61c43e50e98a7a70e95ee
>
> (please don't ask me who would NAT and then overwrite IP addresses and
> then NAT it again :D)
I thought only traditional NAT would change IP, I'm too naive.

nftables names this as "stateless NAT."
With two CTs in the same zone for full nat is more close to the
netfilter's NAT processing (the same CT goes from prerouting to
postrouting).
Now I'm wondering how nftables handles the stateful NAT and stateless
NAT at the same time.

>
> > think it should be, as with two actions.
>
> mlx5 at least currently doesn't support it. Good that tc already
> supports it, then. Maybe drivers can get up to speed later on.
So because mlx5 currently only supports one ct action in one tc rule
or something?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 19:55               ` Xin Long
@ 2022-11-23 21:21                 ` Marcelo Ricardo Leitner
  2022-11-23 21:34                   ` Xin Long
  2022-11-24 10:00                   ` Pablo Neira Ayuso
  0 siblings, 2 replies; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-11-23 21:21 UTC (permalink / raw)
  To: Xin Long
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 02:55:05PM -0500, Xin Long wrote:
> On Wed, Nov 23, 2022 at 2:17 PM Marcelo Ricardo Leitner
> <marcelo.leitner@gmail.com> wrote:
> >
> > On Wed, Nov 23, 2022 at 01:54:41PM -0500, Xin Long wrote:
> > > On Wed, Nov 23, 2022 at 1:48 PM Marcelo Ricardo Leitner
> > > <marcelo.leitner@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 23, 2022 at 12:31:38PM -0500, Xin Long wrote:
> > > > > On Wed, Nov 23, 2022 at 10:13 AM Marcelo Ricardo Leitner
> > > > > <marcelo.leitner@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> > > > > > > On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > > > > > > > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > > +         enum ip_conntrack_info ctinfo, int *action,
> > > > > > > > +         const struct nf_nat_range2 *range, bool commit)
> > > > > > > > +{
> > > > > > > > +   enum nf_nat_manip_type maniptype;
> > > > > > > > +   int err, ct_action = *action;
> > > > > > > > +
> > > > > > > > +   *action = 0;
> > > > > > > > +
> > > > > > > > +   /* Add NAT extension if not confirmed yet. */
> > > > > > > > +   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > > > > > > +           return NF_ACCEPT;   /* Can't NAT. */
> > > > > > > > +
> > > > > > > > +   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > > > > > > +       (ctinfo != IP_CT_RELATED || commit)) {
> > > > > > > > +           /* NAT an established or related connection like before. */
> > > > > > > > +           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > > > > > > +                   /* This is the REPLY direction for a connection
> > > > > > > > +                    * for which NAT was applied in the forward
> > > > > > > > +                    * direction.  Do the reverse NAT.
> > > > > > > > +                    */
> > > > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > > > +                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > > > > > > +           else
> > > > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > > > +                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > > > > > > > +           maniptype = NF_NAT_MANIP_SRC;
> > > > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > > > > > > > +           maniptype = NF_NAT_MANIP_DST;
> > > > > > > > +   } else {
> > > > > > > > +           return NF_ACCEPT;
> > > > > > > > +   }
> > > > > > > > +
> > > > > > > > +   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > > > > > > > +   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > > > > > > +           if (ct->status & IPS_SRC_NAT) {
> > > > > > > > +                   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > > > +                           maniptype = NF_NAT_MANIP_DST;
> > > > > > > > +                   else
> > > > > > > > +                           maniptype = NF_NAT_MANIP_SRC;
> > > > > > > > +
> > > > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > > > > > > > +                                           maniptype);
> > > > > > > > +           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > > > > > > > +                                           NF_NAT_MANIP_SRC);
> > > > > > > > +           }
> > > > > > > > +   }
> > > > > > > > +   return err;
> > > > > > > > +}
> > > > > > > > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > > > > > > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > > > > > > > index cc643a556ea1..d03c75165663 100644
> > > > > > > > --- a/net/openvswitch/conntrack.c
> > > > > > > > +++ b/net/openvswitch/conntrack.c
> > > > > > > > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> > > > > > > >     }
> > > > > > > >  }
> > > > > > > >
> > > > > > > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > > > > > > - * range is only used for new, uninitialized NAT state.
> > > > > > > > - * Returns either NF_ACCEPT or NF_DROP.
> > > > > > > > - */
> > > > > > > > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > > -                         enum ip_conntrack_info ctinfo,
> > > > > > > > -                         const struct nf_nat_range2 *range,
> > > > > > > > -                         enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > > > > > > > -{
> > > > > > > > -   int hooknum, err = NF_ACCEPT;
> > > > > > > > -
> > > > > > > > -   /* See HOOK2MANIP(). */
> > > > > > > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > > > > > > -   else
> > > > > > > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > > > > > > -
> > > > > > > > -   switch (ctinfo) {
> > > > > > > > -   case IP_CT_RELATED:
> > > > > > > > -   case IP_CT_RELATED_REPLY:
> > > > > > > > -           if (IS_ENABLED(CONFIG_NF_NAT) &&
> > > > > > > > -               skb->protocol == htons(ETH_P_IP) &&
> > > > > > > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > > > > > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > > > > > > -                                                      hooknum))
> > > > > > > > -                           err = NF_DROP;
> > > > > > > > -                   goto out;
> > > > > > > > -           } else if (IS_ENABLED(CONFIG_IPV6) &&
> > > > > > > > -                      skb->protocol == htons(ETH_P_IPV6)) {
> > > > > > > > -                   __be16 frag_off;
> > > > > > > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > > > > > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > > > > > > -                                                 sizeof(struct ipv6hdr),
> > > > > > > > -                                                 &nexthdr, &frag_off);
> > > > > > > > -
> > > > > > > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > > > > > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > > > > > > -                                                                ctinfo,
> > > > > > > > -                                                                hooknum,
> > > > > > > > -                                                                hdrlen))
> > > > > > > > -                                   err = NF_DROP;
> > > > > > > > -                           goto out;
> > > > > > > > -                   }
> > > > > > > > -           }
> > > > > > > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > > > > > > -           fallthrough;
> > > > > > > > -   case IP_CT_NEW:
> > > > > > > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > > > > > > -            * or local packets.
> > > > > > > > -            */
> > > > > > > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > > > > > > -                   /* Initialize according to the NAT action. */
> > > > > > > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > > > > > > -                           /* Action is set up to establish a new
> > > > > > > > -                            * mapping.
> > > > > > > > -                            */
> > > > > > > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > > > > > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > > > > > > -                   if (err != NF_ACCEPT)
> > > > > > > > -                           goto out;
> > > > > > > > -           }
> > > > > > > > -           break;
> > > > > > > > -
> > > > > > > > -   case IP_CT_ESTABLISHED:
> > > > > > > > -   case IP_CT_ESTABLISHED_REPLY:
> > > > > > > > -           break;
> > > > > > > > -
> > > > > > > > -   default:
> > > > > > > > -           err = NF_DROP;
> > > > > > > > -           goto out;
> > > > > > > > -   }
> > > > > > > > -
> > > > > > > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > > > > > > -out:
> > > > > > > > -   /* Update the flow key if NAT successful. */
> > > > > > > > -   if (err == NF_ACCEPT)
> > > > > > > > -           ovs_nat_update_key(key, skb, maniptype);
> > > > > > > > -
> > > > > > > > -   return err;
> > > > > > > > -}
> > > > > > > > -
> > > > > > > >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> > > > > > > >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> > > > > > > >                   const struct ovs_conntrack_info *info,
> > > > > > > >                   struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > >                   enum ip_conntrack_info ctinfo)
> > > > > > > >  {
> > > > > > > > -   enum nf_nat_manip_type maniptype;
> > > > > > > > -   int err;
> > > > > > > > +   int err, action = 0;
> > > > > > > >
> > > > > > > >     if (!(info->nat & OVS_CT_NAT))
> > > > > > > >             return NF_ACCEPT;
> > > > > > > > +   if (info->nat & OVS_CT_SRC_NAT)
> > > > > > > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > > > > > > +   if (info->nat & OVS_CT_DST_NAT)
> > > > > > > > +           action |= (1 << NF_NAT_MANIP_DST);
> > > > > > >
> > > > > > > I'm wondering why this dance at this level with supporting multiple
> > > > > > > MANIPs while actually only one can be used at a time.
> > > > > > >
> > > > > > > act_ct will reject an action using both:
> > > > > > >         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
> > > > > > >             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
> > > > > > >                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
> > > > > > >                 return -EOPNOTSUPP;
> > > > > > >
> > > > > > > I couldn't find this kind of check in ovs code right now (didn't look much, I
> > > > > > > confess :)), but even the code here was already doing:
> > > > > > >
> > > > > > > -     } else if (info->nat & OVS_CT_SRC_NAT) {
> > > > > > > -             maniptype = NF_NAT_MANIP_SRC;
> > > > > > > -     } else if (info->nat & OVS_CT_DST_NAT) {
> > > > > > > -             maniptype = NF_NAT_MANIP_DST;
> > > > > > >
> > > > > > > And in case of tuple conflict, maniptype will be forcibly updated in
> > > > > > > [*] below.
> > > > > >
> > > > > > Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
> > > > > > set both.
> > > > > Right.
> > > > > BTW. The tuple conflict processing actually has provided the
> > > > > code to do full nat (snat and dnat at the same time), which I
> > > > > think is more efficient than doing full nat in two zones. All
> > > > > we have to do is adjust a few lines of code for that.
> > > >
> > > > In this part, yes. But it needs some surgery all around for full
> > > > support. The code in ovs kernel for using only one type starts here:
> > > >
> > > > static int parse_nat(const struct nlattr *attr,
> > > >                      struct ovs_conntrack_info *info, bool log)
> > > > {
> > > > ...
> > > >                 switch (type) {
> > > >                 case OVS_NAT_ATTR_SRC:
> > > >                 case OVS_NAT_ATTR_DST:
> > > >                         if (info->nat) {
> > > >                                 OVS_NLERR(log, "Only one type of NAT may be specified");
> > > >                                 return -ERANGE;
> > > >                         }
> > > > ...
> > > > }
> > > >
> > > > So vswitchd also doesn't support it. And then tc, iproute and drivers
> > > > that offload it as well. Not sure if it has impacts on openflow too.
> > > >
> > > not in one single action, but two actions:
> >
> > Oh.
> >
> > >
> > > "table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
> > > actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1),
> > > alg=ftp),veth2"
> > >
> > > as long as it allows the 1st one doesn't commit, which is a simple
> > > check in parse_nat().
> > > I tested it, TC already supports it. I'm not sure about drivers, but I
> >
> > There's an outstanding issue with act_ct that it may reuse an old
> > CT cache. Fixing it could (I'm not sure) impact this use case:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=2099220
> > same issue in ovs was fixed in
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2061ecfdf2350994e5b61c43e50e98a7a70e95ee
> >
> > (please don't ask me who would NAT and then overwrite IP addresses and
> > then NAT it again :D)
> I thought only traditional NAT would change IP, I'm too naive.
> 
> nftables names this as "stateless NAT."
> With two CTs in the same zone for full nat is more close to the
> netfilter's NAT processing (the same CT goes from prerouting to
> postrouting).
> Now I'm wondering how nftables handles the stateful NAT and stateless
> NAT at the same time.

Me too.

> 
> >
> > > think it should be, as with two actions.
> >
> > mlx5 at least currently doesn't support it. Good that tc already
> > supports it, then. Maybe drivers can get up to speed later on.
> So because mlx5 currently only supports one ct action in one tc rule
> or something?

Not sure what you meant here?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 21:21                 ` Marcelo Ricardo Leitner
@ 2022-11-23 21:34                   ` Xin Long
  2022-12-01 21:37                     ` Marcelo Ricardo Leitner
  2022-11-24 10:00                   ` Pablo Neira Ayuso
  1 sibling, 1 reply; 23+ messages in thread
From: Xin Long @ 2022-11-23 21:34 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 4:21 PM Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> On Wed, Nov 23, 2022 at 02:55:05PM -0500, Xin Long wrote:
> > On Wed, Nov 23, 2022 at 2:17 PM Marcelo Ricardo Leitner
> > <marcelo.leitner@gmail.com> wrote:
> > >
> > > On Wed, Nov 23, 2022 at 01:54:41PM -0500, Xin Long wrote:
> > > > On Wed, Nov 23, 2022 at 1:48 PM Marcelo Ricardo Leitner
> > > > <marcelo.leitner@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 23, 2022 at 12:31:38PM -0500, Xin Long wrote:
> > > > > > On Wed, Nov 23, 2022 at 10:13 AM Marcelo Ricardo Leitner
> > > > > > <marcelo.leitner@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> > > > > > > > On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > > > > > > > > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > > > +         enum ip_conntrack_info ctinfo, int *action,
> > > > > > > > > +         const struct nf_nat_range2 *range, bool commit)
> > > > > > > > > +{
> > > > > > > > > +   enum nf_nat_manip_type maniptype;
> > > > > > > > > +   int err, ct_action = *action;
> > > > > > > > > +
> > > > > > > > > +   *action = 0;
> > > > > > > > > +
> > > > > > > > > +   /* Add NAT extension if not confirmed yet. */
> > > > > > > > > +   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > > > > > > > +           return NF_ACCEPT;   /* Can't NAT. */
> > > > > > > > > +
> > > > > > > > > +   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > > > > > > > +       (ctinfo != IP_CT_RELATED || commit)) {
> > > > > > > > > +           /* NAT an established or related connection like before. */
> > > > > > > > > +           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > > > > > > > +                   /* This is the REPLY direction for a connection
> > > > > > > > > +                    * for which NAT was applied in the forward
> > > > > > > > > +                    * direction.  Do the reverse NAT.
> > > > > > > > > +                    */
> > > > > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > > > > +                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > > > > > > > +           else
> > > > > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > > > > +                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > > > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > > > > > > > > +           maniptype = NF_NAT_MANIP_SRC;
> > > > > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > > > > > > > > +           maniptype = NF_NAT_MANIP_DST;
> > > > > > > > > +   } else {
> > > > > > > > > +           return NF_ACCEPT;
> > > > > > > > > +   }
> > > > > > > > > +
> > > > > > > > > +   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > > > > > > > > +   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > > > > > > > +           if (ct->status & IPS_SRC_NAT) {
> > > > > > > > > +                   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > > > > +                           maniptype = NF_NAT_MANIP_DST;
> > > > > > > > > +                   else
> > > > > > > > > +                           maniptype = NF_NAT_MANIP_SRC;
> > > > > > > > > +
> > > > > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > > > > > > > > +                                           maniptype);
> > > > > > > > > +           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > > > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > > > > > > > > +                                           NF_NAT_MANIP_SRC);
> > > > > > > > > +           }
> > > > > > > > > +   }
> > > > > > > > > +   return err;
> > > > > > > > > +}
> > > > > > > > > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > > > > > > > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > > > > > > > > index cc643a556ea1..d03c75165663 100644
> > > > > > > > > --- a/net/openvswitch/conntrack.c
> > > > > > > > > +++ b/net/openvswitch/conntrack.c
> > > > > > > > > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> > > > > > > > >     }
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > > > > > > > - * range is only used for new, uninitialized NAT state.
> > > > > > > > > - * Returns either NF_ACCEPT or NF_DROP.
> > > > > > > > > - */
> > > > > > > > > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > > > -                         enum ip_conntrack_info ctinfo,
> > > > > > > > > -                         const struct nf_nat_range2 *range,
> > > > > > > > > -                         enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > > > > > > > > -{
> > > > > > > > > -   int hooknum, err = NF_ACCEPT;
> > > > > > > > > -
> > > > > > > > > -   /* See HOOK2MANIP(). */
> > > > > > > > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > > > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > > > > > > > -   else
> > > > > > > > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > > > > > > > -
> > > > > > > > > -   switch (ctinfo) {
> > > > > > > > > -   case IP_CT_RELATED:
> > > > > > > > > -   case IP_CT_RELATED_REPLY:
> > > > > > > > > -           if (IS_ENABLED(CONFIG_NF_NAT) &&
> > > > > > > > > -               skb->protocol == htons(ETH_P_IP) &&
> > > > > > > > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > > > > > > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > > > > > > > -                                                      hooknum))
> > > > > > > > > -                           err = NF_DROP;
> > > > > > > > > -                   goto out;
> > > > > > > > > -           } else if (IS_ENABLED(CONFIG_IPV6) &&
> > > > > > > > > -                      skb->protocol == htons(ETH_P_IPV6)) {
> > > > > > > > > -                   __be16 frag_off;
> > > > > > > > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > > > > > > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > > > > > > > -                                                 sizeof(struct ipv6hdr),
> > > > > > > > > -                                                 &nexthdr, &frag_off);
> > > > > > > > > -
> > > > > > > > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > > > > > > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > > > > > > > -                                                                ctinfo,
> > > > > > > > > -                                                                hooknum,
> > > > > > > > > -                                                                hdrlen))
> > > > > > > > > -                                   err = NF_DROP;
> > > > > > > > > -                           goto out;
> > > > > > > > > -                   }
> > > > > > > > > -           }
> > > > > > > > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > > > > > > > -           fallthrough;
> > > > > > > > > -   case IP_CT_NEW:
> > > > > > > > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > > > > > > > -            * or local packets.
> > > > > > > > > -            */
> > > > > > > > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > > > > > > > -                   /* Initialize according to the NAT action. */
> > > > > > > > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > > > > > > > -                           /* Action is set up to establish a new
> > > > > > > > > -                            * mapping.
> > > > > > > > > -                            */
> > > > > > > > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > > > > > > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > > > > > > > -                   if (err != NF_ACCEPT)
> > > > > > > > > -                           goto out;
> > > > > > > > > -           }
> > > > > > > > > -           break;
> > > > > > > > > -
> > > > > > > > > -   case IP_CT_ESTABLISHED:
> > > > > > > > > -   case IP_CT_ESTABLISHED_REPLY:
> > > > > > > > > -           break;
> > > > > > > > > -
> > > > > > > > > -   default:
> > > > > > > > > -           err = NF_DROP;
> > > > > > > > > -           goto out;
> > > > > > > > > -   }
> > > > > > > > > -
> > > > > > > > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > > > > > > > -out:
> > > > > > > > > -   /* Update the flow key if NAT successful. */
> > > > > > > > > -   if (err == NF_ACCEPT)
> > > > > > > > > -           ovs_nat_update_key(key, skb, maniptype);
> > > > > > > > > -
> > > > > > > > > -   return err;
> > > > > > > > > -}
> > > > > > > > > -
> > > > > > > > >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> > > > > > > > >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> > > > > > > > >                   const struct ovs_conntrack_info *info,
> > > > > > > > >                   struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > > >                   enum ip_conntrack_info ctinfo)
> > > > > > > > >  {
> > > > > > > > > -   enum nf_nat_manip_type maniptype;
> > > > > > > > > -   int err;
> > > > > > > > > +   int err, action = 0;
> > > > > > > > >
> > > > > > > > >     if (!(info->nat & OVS_CT_NAT))
> > > > > > > > >             return NF_ACCEPT;
> > > > > > > > > +   if (info->nat & OVS_CT_SRC_NAT)
> > > > > > > > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > > > > > > > +   if (info->nat & OVS_CT_DST_NAT)
> > > > > > > > > +           action |= (1 << NF_NAT_MANIP_DST);
> > > > > > > >
> > > > > > > > I'm wondering why this dance at this level with supporting multiple
> > > > > > > > MANIPs while actually only one can be used at a time.
> > > > > > > >
> > > > > > > > act_ct will reject an action using both:
> > > > > > > >         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
> > > > > > > >             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
> > > > > > > >                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
> > > > > > > >                 return -EOPNOTSUPP;
> > > > > > > >
> > > > > > > > I couldn't find this kind of check in ovs code right now (didn't look much, I
> > > > > > > > confess :)), but even the code here was already doing:
> > > > > > > >
> > > > > > > > -     } else if (info->nat & OVS_CT_SRC_NAT) {
> > > > > > > > -             maniptype = NF_NAT_MANIP_SRC;
> > > > > > > > -     } else if (info->nat & OVS_CT_DST_NAT) {
> > > > > > > > -             maniptype = NF_NAT_MANIP_DST;
> > > > > > > >
> > > > > > > > And in case of tuple conflict, maniptype will be forcibly updated in
> > > > > > > > [*] below.
> > > > > > >
> > > > > > > Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
> > > > > > > set both.
> > > > > > Right.
> > > > > > BTW. The tuple conflict processing actually has provided the
> > > > > > code to do full nat (snat and dnat at the same time), which I
> > > > > > think is more efficient than doing full nat in two zones. All
> > > > > > we have to do is adjust a few lines of code for that.
> > > > >
> > > > > In this part, yes. But it needs some surgery all around for full
> > > > > support. The code in ovs kernel for using only one type starts here:
> > > > >
> > > > > static int parse_nat(const struct nlattr *attr,
> > > > >                      struct ovs_conntrack_info *info, bool log)
> > > > > {
> > > > > ...
> > > > >                 switch (type) {
> > > > >                 case OVS_NAT_ATTR_SRC:
> > > > >                 case OVS_NAT_ATTR_DST:
> > > > >                         if (info->nat) {
> > > > >                                 OVS_NLERR(log, "Only one type of NAT may be specified");
> > > > >                                 return -ERANGE;
> > > > >                         }
> > > > > ...
> > > > > }
> > > > >
> > > > > So vswitchd also doesn't support it. And then tc, iproute and drivers
> > > > > that offload it as well. Not sure if it has impacts on openflow too.
> > > > >
> > > > not in one single action, but two actions:
> > >
> > > Oh.
> > >
> > > >
> > > > "table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
> > > > actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1),
> > > > alg=ftp),veth2"
> > > >
> > > > as long as it allows the 1st one doesn't commit, which is a simple
> > > > check in parse_nat().
> > > > I tested it, TC already supports it. I'm not sure about drivers, but I
> > >
> > > There's an outstanding issue with act_ct that it may reuse an old
> > > CT cache. Fixing it could (I'm not sure) impact this use case:
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=2099220
> > > same issue in ovs was fixed in
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2061ecfdf2350994e5b61c43e50e98a7a70e95ee
> > >
> > > (please don't ask me who would NAT and then overwrite IP addresses and
> > > then NAT it again :D)
> > I thought only traditional NAT would change IP, I'm too naive.
> >
> > nftables names this as "stateless NAT."
> > With two CTs in the same zone for full nat is more close to the
> > netfilter's NAT processing (the same CT goes from prerouting to
> > postrouting).
> > Now I'm wondering how nftables handles the stateful NAT and stateless
> > NAT at the same time.
>
> Me too.
>
> >
> > >
> > > > think it should be, as with two actions.
> > >
> > > mlx5 at least currently doesn't support it. Good that tc already
> > > supports it, then. Maybe drivers can get up to speed later on.
> > So because mlx5 currently only supports one ct action in one tc rule
> > or something?
>
> Not sure what you meant here?
Sorry. that might be a silly question, I don't know much about TC
HW offload, and just curious why such a rule can not be offloaded
in mlx5 driver:

"actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1)),veth2"

it doesn't support any tc NAT or just does not support tc full NAT HW offload?

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 21:21                 ` Marcelo Ricardo Leitner
  2022-11-23 21:34                   ` Xin Long
@ 2022-11-24 10:00                   ` Pablo Neira Ayuso
  1 sibling, 0 replies; 23+ messages in thread
From: Pablo Neira Ayuso @ 2022-11-24 10:00 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Xin Long, dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem

On Wed, Nov 23, 2022 at 06:21:15PM -0300, Marcelo Ricardo Leitner wrote:
> On Wed, Nov 23, 2022 at 02:55:05PM -0500, Xin Long wrote:
> > On Wed, Nov 23, 2022 at 2:17 PM Marcelo Ricardo Leitner
> > <marcelo.leitner@gmail.com> wrote:
[...]
> > > > "table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
> > > > actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1),
> > > > alg=ftp),veth2"
> > > >
> > > > as long as it allows the 1st one doesn't commit, which is a simple
> > > > check in parse_nat().
> > > > I tested it, TC already supports it. I'm not sure about drivers, but I
> > >
> > > There's an outstanding issue with act_ct that it may reuse an old
> > > CT cache. Fixing it could (I'm not sure) impact this use case:
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=2099220
> > > same issue in ovs was fixed in
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2061ecfdf2350994e5b61c43e50e98a7a70e95ee
> > >
> > > (please don't ask me who would NAT and then overwrite IP addresses and
> > > then NAT it again :D)
> > I thought only traditional NAT would change IP, I'm too naive.
> > 
> > nftables names this as "stateless NAT."
> > With two CTs in the same zone for full nat is more close to the
> > netfilter's NAT processing (the same CT goes from prerouting to
> > postrouting).
> > Now I'm wondering how nftables handles the stateful NAT and stateless
> > NAT at the same time.
> 
> Me too.

There is a 'notrack' action to skip connection tracking for the flows
where the user needs stateless NAT.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 18:52   ` Marcelo Ricardo Leitner
@ 2022-12-01 16:26     ` Xin Long
  0 siblings, 0 replies; 23+ messages in thread
From: Xin Long @ 2022-12-01 16:26 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: network dev, dev, ovs-dev, davem, kuba, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Pablo Neira Ayuso, Florian Westphal, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole, netfilter-devel

On Wed, Nov 23, 2022 at 1:52 PM Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > --- a/net/netfilter/Makefile
> > +++ b/net/netfilter/Makefile
> > @@ -52,7 +52,7 @@ obj-$(CONFIG_NF_CONNTRACK_SANE) += nf_conntrack_sane.o
> >  obj-$(CONFIG_NF_CONNTRACK_SIP) += nf_conntrack_sip.o
> >  obj-$(CONFIG_NF_CONNTRACK_TFTP) += nf_conntrack_tftp.o
> >
> > -nf_nat-y     := nf_nat_core.o nf_nat_proto.o nf_nat_helper.o
> > +nf_nat-y     := nf_nat_core.o nf_nat_proto.o nf_nat_helper.o nf_nat_ovs.o
>
> Considering that the code in nf_nat_ovs is only used if ovs or act_ct
> are enabled, shouldn't it be using an invisible knob here that gets
> automatically selected by them? Pablo?
It's good to not build it if no place is using it.
Making nf_nat_ovs a module might be too much.
If it's okay to Netfilter devel, I will change to use "ifdef" in Makefile:

+ifdef CONFIG_OPENVSWITCH
+nf_nat-y += nf_nat_ovs.o
+else ifdef CONFIG_NET_ACT_CT
+nf_nat-y += nf_nat_ovs.o
+endif
+

Thanks.

>
> I think this is my last comment on this series. The rest LGTM.
>
> >
> >  obj-$(CONFIG_NF_LOG_SYSLOG) += nf_log_syslog.o
> >

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2 net-next 3/5] net: sched: return NF_ACCEPT when fails to add nat ext in tcf_ct_act_nat
  2022-11-23 14:23   ` Marcelo Ricardo Leitner
@ 2022-12-01 16:53     ` Xin Long
  0 siblings, 0 replies; 23+ messages in thread
From: Xin Long @ 2022-12-01 16:53 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner, Jarno Rajahalme
  Cc: network dev, dev, ovs-dev, davem, kuba, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Pablo Neira Ayuso, Florian Westphal, Davide Caratti,
	Oz Shlomo, Paul Blakey, Ilya Maximets, Eelco Chaudron,
	Aaron Conole

On Wed, Nov 23, 2022 at 9:24 AM Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> On Tue, Nov 22, 2022 at 12:32:19PM -0500, Xin Long wrote:
> > This patch changes to return NF_ACCEPT when fails to add nat
> > ext before doing NAT in tcf_ct_act_nat(), to keep consistent
> > with OVS' processing in ovs_ct_nat().
> >
> > Reviewed-by: Saeed Mahameed <saeed@kernel.org>
> > Signed-off-by: Xin Long <lucien.xin@gmail.com>
> > ---
> >  net/sched/act_ct.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> > index da0b7f665277..8869b3ef6642 100644
> > --- a/net/sched/act_ct.c
> > +++ b/net/sched/act_ct.c
> > @@ -994,7 +994,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
> >
> >       /* Add NAT extension if not confirmed yet. */
> >       if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > -             return NF_DROP;   /* Can't NAT. */
> > +             return NF_ACCEPT;   /* Can't NAT. */
>
> I'm wondering if the fix should actually be in OVS, to make it drop
> the packet? Aaron, Eelco?
>
> If the user asked for NAT, and it can't NAT, it doesn't seem right to
> forward the packet while not performing the asked action.
>
> If we follow the code here, it may even commit the entry without the
> NAT extension, rendering the connection useless/broken per the first
> if condition above. It just won't try again.
nf_ct_nat_ext_add() returning NULL is caused by krealloc() failure
like an -ENOMEM error, and a similar thing could happen in
nfct_seqadj_ext_add() called by ovs_ct_nat() -> nf_nat_setup_info()
when doing NAT where it returns DROP. So I think it's right that
we should fix this in openvswitch instead of TC.

Anyway, in ovs_ct_nat():

        if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
                return NF_ACCEPT;   /* Can't NAT. */

git blame shows this was added at the beginning by:

    05752523e565 ("openvswitch: Interface with NAT.")

So add Jarno Rajahalme to Cc: list.

Thanks.

>
> >
> >       if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> >           (ctinfo != IP_CT_RELATED || commit)) {
> > --
> > 2.31.1
> >

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ovs-dev] [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc
  2022-11-23 21:34                   ` Xin Long
@ 2022-12-01 21:37                     ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 23+ messages in thread
From: Marcelo Ricardo Leitner @ 2022-12-01 21:37 UTC (permalink / raw)
  To: Xin Long
  Cc: dev, ovs-dev, Davide Caratti, Jiri Pirko, network dev,
	Paul Blakey, Florian Westphal, Jamal Hadi Salim, Ilya Maximets,
	Eric Dumazet, Cong Wang, kuba, Paolo Abeni, davem,
	Pablo Neira Ayuso

On Wed, Nov 23, 2022 at 04:34:45PM -0500, Xin Long wrote:
> On Wed, Nov 23, 2022 at 4:21 PM Marcelo Ricardo Leitner
> <marcelo.leitner@gmail.com> wrote:
> >
> > On Wed, Nov 23, 2022 at 02:55:05PM -0500, Xin Long wrote:
> > > On Wed, Nov 23, 2022 at 2:17 PM Marcelo Ricardo Leitner
> > > <marcelo.leitner@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 23, 2022 at 01:54:41PM -0500, Xin Long wrote:
> > > > > On Wed, Nov 23, 2022 at 1:48 PM Marcelo Ricardo Leitner
> > > > > <marcelo.leitner@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 23, 2022 at 12:31:38PM -0500, Xin Long wrote:
> > > > > > > On Wed, Nov 23, 2022 at 10:13 AM Marcelo Ricardo Leitner
> > > > > > > <marcelo.leitner@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 23, 2022 at 12:09:55PM -0300, Marcelo Ricardo Leitner wrote:
> > > > > > > > > On Tue, Nov 22, 2022 at 12:32:21PM -0500, Xin Long wrote:
> > > > > > > > > > +int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > > > > +         enum ip_conntrack_info ctinfo, int *action,
> > > > > > > > > > +         const struct nf_nat_range2 *range, bool commit)
> > > > > > > > > > +{
> > > > > > > > > > +   enum nf_nat_manip_type maniptype;
> > > > > > > > > > +   int err, ct_action = *action;
> > > > > > > > > > +
> > > > > > > > > > +   *action = 0;
> > > > > > > > > > +
> > > > > > > > > > +   /* Add NAT extension if not confirmed yet. */
> > > > > > > > > > +   if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
> > > > > > > > > > +           return NF_ACCEPT;   /* Can't NAT. */
> > > > > > > > > > +
> > > > > > > > > > +   if (ctinfo != IP_CT_NEW && (ct->status & IPS_NAT_MASK) &&
> > > > > > > > > > +       (ctinfo != IP_CT_RELATED || commit)) {
> > > > > > > > > > +           /* NAT an established or related connection like before. */
> > > > > > > > > > +           if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY)
> > > > > > > > > > +                   /* This is the REPLY direction for a connection
> > > > > > > > > > +                    * for which NAT was applied in the forward
> > > > > > > > > > +                    * direction.  Do the reverse NAT.
> > > > > > > > > > +                    */
> > > > > > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > > > > > +                           ? NF_NAT_MANIP_DST : NF_NAT_MANIP_SRC;
> > > > > > > > > > +           else
> > > > > > > > > > +                   maniptype = ct->status & IPS_SRC_NAT
> > > > > > > > > > +                           ? NF_NAT_MANIP_SRC : NF_NAT_MANIP_DST;
> > > > > > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_SRC)) {
> > > > > > > > > > +           maniptype = NF_NAT_MANIP_SRC;
> > > > > > > > > > +   } else if (ct_action & (1 << NF_NAT_MANIP_DST)) {
> > > > > > > > > > +           maniptype = NF_NAT_MANIP_DST;
> > > > > > > > > > +   } else {
> > > > > > > > > > +           return NF_ACCEPT;
> > > > > > > > > > +   }
> > > > > > > > > > +
> > > > > > > > > > +   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range, maniptype);
> > > > > > > > > > +   if (err == NF_ACCEPT && ct->status & IPS_DST_NAT) {
> > > > > > > > > > +           if (ct->status & IPS_SRC_NAT) {
> > > > > > > > > > +                   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > > > > > +                           maniptype = NF_NAT_MANIP_DST;
> > > > > > > > > > +                   else
> > > > > > > > > > +                           maniptype = NF_NAT_MANIP_SRC;
> > > > > > > > > > +
> > > > > > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, range,
> > > > > > > > > > +                                           maniptype);
> > > > > > > > > > +           } else if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
> > > > > > > > > > +                   err = nf_ct_nat_execute(skb, ct, ctinfo, action, NULL,
> > > > > > > > > > +                                           NF_NAT_MANIP_SRC);
> > > > > > > > > > +           }
> > > > > > > > > > +   }
> > > > > > > > > > +   return err;
> > > > > > > > > > +}
> > > > > > > > > > +EXPORT_SYMBOL_GPL(nf_ct_nat);
> > > > > > > > > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > > > > > > > > > index cc643a556ea1..d03c75165663 100644
> > > > > > > > > > --- a/net/openvswitch/conntrack.c
> > > > > > > > > > +++ b/net/openvswitch/conntrack.c
> > > > > > > > > > @@ -726,144 +726,27 @@ static void ovs_nat_update_key(struct sw_flow_key *key,
> > > > > > > > > >     }
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > -/* Modelled after nf_nat_ipv[46]_fn().
> > > > > > > > > > - * range is only used for new, uninitialized NAT state.
> > > > > > > > > > - * Returns either NF_ACCEPT or NF_DROP.
> > > > > > > > > > - */
> > > > > > > > > > -static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > > > > -                         enum ip_conntrack_info ctinfo,
> > > > > > > > > > -                         const struct nf_nat_range2 *range,
> > > > > > > > > > -                         enum nf_nat_manip_type maniptype, struct sw_flow_key *key)
> > > > > > > > > > -{
> > > > > > > > > > -   int hooknum, err = NF_ACCEPT;
> > > > > > > > > > -
> > > > > > > > > > -   /* See HOOK2MANIP(). */
> > > > > > > > > > -   if (maniptype == NF_NAT_MANIP_SRC)
> > > > > > > > > > -           hooknum = NF_INET_LOCAL_IN; /* Source NAT */
> > > > > > > > > > -   else
> > > > > > > > > > -           hooknum = NF_INET_LOCAL_OUT; /* Destination NAT */
> > > > > > > > > > -
> > > > > > > > > > -   switch (ctinfo) {
> > > > > > > > > > -   case IP_CT_RELATED:
> > > > > > > > > > -   case IP_CT_RELATED_REPLY:
> > > > > > > > > > -           if (IS_ENABLED(CONFIG_NF_NAT) &&
> > > > > > > > > > -               skb->protocol == htons(ETH_P_IP) &&
> > > > > > > > > > -               ip_hdr(skb)->protocol == IPPROTO_ICMP) {
> > > > > > > > > > -                   if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
> > > > > > > > > > -                                                      hooknum))
> > > > > > > > > > -                           err = NF_DROP;
> > > > > > > > > > -                   goto out;
> > > > > > > > > > -           } else if (IS_ENABLED(CONFIG_IPV6) &&
> > > > > > > > > > -                      skb->protocol == htons(ETH_P_IPV6)) {
> > > > > > > > > > -                   __be16 frag_off;
> > > > > > > > > > -                   u8 nexthdr = ipv6_hdr(skb)->nexthdr;
> > > > > > > > > > -                   int hdrlen = ipv6_skip_exthdr(skb,
> > > > > > > > > > -                                                 sizeof(struct ipv6hdr),
> > > > > > > > > > -                                                 &nexthdr, &frag_off);
> > > > > > > > > > -
> > > > > > > > > > -                   if (hdrlen >= 0 && nexthdr == IPPROTO_ICMPV6) {
> > > > > > > > > > -                           if (!nf_nat_icmpv6_reply_translation(skb, ct,
> > > > > > > > > > -                                                                ctinfo,
> > > > > > > > > > -                                                                hooknum,
> > > > > > > > > > -                                                                hdrlen))
> > > > > > > > > > -                                   err = NF_DROP;
> > > > > > > > > > -                           goto out;
> > > > > > > > > > -                   }
> > > > > > > > > > -           }
> > > > > > > > > > -           /* Non-ICMP, fall thru to initialize if needed. */
> > > > > > > > > > -           fallthrough;
> > > > > > > > > > -   case IP_CT_NEW:
> > > > > > > > > > -           /* Seen it before?  This can happen for loopback, retrans,
> > > > > > > > > > -            * or local packets.
> > > > > > > > > > -            */
> > > > > > > > > > -           if (!nf_nat_initialized(ct, maniptype)) {
> > > > > > > > > > -                   /* Initialize according to the NAT action. */
> > > > > > > > > > -                   err = (range && range->flags & NF_NAT_RANGE_MAP_IPS)
> > > > > > > > > > -                           /* Action is set up to establish a new
> > > > > > > > > > -                            * mapping.
> > > > > > > > > > -                            */
> > > > > > > > > > -                           ? nf_nat_setup_info(ct, range, maniptype)
> > > > > > > > > > -                           : nf_nat_alloc_null_binding(ct, hooknum);
> > > > > > > > > > -                   if (err != NF_ACCEPT)
> > > > > > > > > > -                           goto out;
> > > > > > > > > > -           }
> > > > > > > > > > -           break;
> > > > > > > > > > -
> > > > > > > > > > -   case IP_CT_ESTABLISHED:
> > > > > > > > > > -   case IP_CT_ESTABLISHED_REPLY:
> > > > > > > > > > -           break;
> > > > > > > > > > -
> > > > > > > > > > -   default:
> > > > > > > > > > -           err = NF_DROP;
> > > > > > > > > > -           goto out;
> > > > > > > > > > -   }
> > > > > > > > > > -
> > > > > > > > > > -   err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> > > > > > > > > > -out:
> > > > > > > > > > -   /* Update the flow key if NAT successful. */
> > > > > > > > > > -   if (err == NF_ACCEPT)
> > > > > > > > > > -           ovs_nat_update_key(key, skb, maniptype);
> > > > > > > > > > -
> > > > > > > > > > -   return err;
> > > > > > > > > > -}
> > > > > > > > > > -
> > > > > > > > > >  /* Returns NF_DROP if the packet should be dropped, NF_ACCEPT otherwise. */
> > > > > > > > > >  static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
> > > > > > > > > >                   const struct ovs_conntrack_info *info,
> > > > > > > > > >                   struct sk_buff *skb, struct nf_conn *ct,
> > > > > > > > > >                   enum ip_conntrack_info ctinfo)
> > > > > > > > > >  {
> > > > > > > > > > -   enum nf_nat_manip_type maniptype;
> > > > > > > > > > -   int err;
> > > > > > > > > > +   int err, action = 0;
> > > > > > > > > >
> > > > > > > > > >     if (!(info->nat & OVS_CT_NAT))
> > > > > > > > > >             return NF_ACCEPT;
> > > > > > > > > > +   if (info->nat & OVS_CT_SRC_NAT)
> > > > > > > > > > +           action |= (1 << NF_NAT_MANIP_SRC);
> > > > > > > > > > +   if (info->nat & OVS_CT_DST_NAT)
> > > > > > > > > > +           action |= (1 << NF_NAT_MANIP_DST);
> > > > > > > > >
> > > > > > > > > I'm wondering why this dance at this level with supporting multiple
> > > > > > > > > MANIPs while actually only one can be used at a time.
> > > > > > > > >
> > > > > > > > > act_ct will reject an action using both:
> > > > > > > > >         if ((p->ct_action & TCA_CT_ACT_NAT_SRC) &&
> > > > > > > > >             (p->ct_action & TCA_CT_ACT_NAT_DST)) {
> > > > > > > > >                 NL_SET_ERR_MSG_MOD(extack, "dnat and snat can't be enabled at the same time");
> > > > > > > > >                 return -EOPNOTSUPP;
> > > > > > > > >
> > > > > > > > > I couldn't find this kind of check in ovs code right now (didn't look much, I
> > > > > > > > > confess :)), but even the code here was already doing:
> > > > > > > > >
> > > > > > > > > -     } else if (info->nat & OVS_CT_SRC_NAT) {
> > > > > > > > > -             maniptype = NF_NAT_MANIP_SRC;
> > > > > > > > > -     } else if (info->nat & OVS_CT_DST_NAT) {
> > > > > > > > > -             maniptype = NF_NAT_MANIP_DST;
> > > > > > > > >
> > > > > > > > > And in case of tuple conflict, maniptype will be forcibly updated in
> > > > > > > > > [*] below.
> > > > > > > >
> > > > > > > > Ahh.. just found why.. in case of typle conflict, nf_ct_nat() now may
> > > > > > > > set both.
> > > > > > > Right.
> > > > > > > BTW. The tuple conflict processing actually has provided the
> > > > > > > code to do full nat (snat and dnat at the same time), which I
> > > > > > > think is more efficient than doing full nat in two zones. All
> > > > > > > we have to do is adjust a few lines of code for that.
> > > > > >
> > > > > > In this part, yes. But it needs some surgery all around for full
> > > > > > support. The code in ovs kernel for using only one type starts here:
> > > > > >
> > > > > > static int parse_nat(const struct nlattr *attr,
> > > > > >                      struct ovs_conntrack_info *info, bool log)
> > > > > > {
> > > > > > ...
> > > > > >                 switch (type) {
> > > > > >                 case OVS_NAT_ATTR_SRC:
> > > > > >                 case OVS_NAT_ATTR_DST:
> > > > > >                         if (info->nat) {
> > > > > >                                 OVS_NLERR(log, "Only one type of NAT may be specified");
> > > > > >                                 return -ERANGE;
> > > > > >                         }
> > > > > > ...
> > > > > > }
> > > > > >
> > > > > > So vswitchd also doesn't support it. And then tc, iproute and drivers
> > > > > > that offload it as well. Not sure if it has impacts on openflow too.
> > > > > >
> > > > > not in one single action, but two actions:
> > > >
> > > > Oh.
> > > >
> > > > >
> > > > > "table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
> > > > > actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1),
> > > > > alg=ftp),veth2"
> > > > >
> > > > > as long as it allows the 1st one doesn't commit, which is a simple
> > > > > check in parse_nat().
> > > > > I tested it, TC already supports it. I'm not sure about drivers, but I
> > > >
> > > > There's an outstanding issue with act_ct that it may reuse an old
> > > > CT cache. Fixing it could (I'm not sure) impact this use case:
> > > >
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=2099220
> > > > same issue in ovs was fixed in
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2061ecfdf2350994e5b61c43e50e98a7a70e95ee
> > > >
> > > > (please don't ask me who would NAT and then overwrite IP addresses and
> > > > then NAT it again :D)
> > > I thought only traditional NAT would change IP, I'm too naive.
> > >
> > > nftables names this as "stateless NAT."
> > > With two CTs in the same zone for full nat is more close to the
> > > netfilter's NAT processing (the same CT goes from prerouting to
> > > postrouting).
> > > Now I'm wondering how nftables handles the stateful NAT and stateless
> > > NAT at the same time.
> >
> > Me too.
> >
> > >
> > > >
> > > > > think it should be, as with two actions.
> > > >
> > > > mlx5 at least currently doesn't support it. Good that tc already
> > > > supports it, then. Maybe drivers can get up to speed later on.
> > > So because mlx5 currently only supports one ct action in one tc rule
> > > or something?
> >
> > Not sure what you meant here?
> Sorry. that might be a silly question, I don't know much about TC
> HW offload, and just curious why such a rule can not be offloaded
> in mlx5 driver:
> 
> "actions=ct(nat(dst=7.7.16.3)),ct(commit, nat(src=7.7.16.1)),veth2"
> 
> it doesn't support any tc NAT or just does not support tc full NAT HW offload?

AFAIK it just doesn't support 2 ct() actions in the same filter. It
can do full NAT HW offload, as long as it's going through 2 filters.
I don't know where the limitation comes from. I was hoping that Nvidia
could comment on it, with at least a "yep, that's not supported" or
otherwise. Maybe this is old stuff already.


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-12-01 21:37 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-22 17:32 [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of ovs and tc Xin Long
2022-11-22 17:32 ` [PATCHv2 net-next 1/5] openvswitch: delete the unncessary skb_pull_rcsum call in ovs_ct_nat_execute Xin Long
2022-11-22 17:32 ` [PATCHv2 net-next 2/5] openvswitch: return NF_ACCEPT when OVS_CT_NAT is net set in info nat Xin Long
2022-11-23 14:24   ` Marcelo Ricardo Leitner
2022-11-22 17:32 ` [PATCHv2 net-next 3/5] net: sched: return NF_ACCEPT when fails to add nat ext in tcf_ct_act_nat Xin Long
2022-11-23 14:23   ` Marcelo Ricardo Leitner
2022-12-01 16:53     ` Xin Long
2022-11-22 17:32 ` [PATCHv2 net-next 4/5] net: sched: update the nat flag for icmp error packets in ct_nat_execute Xin Long
2022-11-22 17:32 ` [PATCHv2 net-next 5/5] net: move the nat function to nf_nat_ovs for ovs and tc Xin Long
2022-11-23 15:09   ` Marcelo Ricardo Leitner
2022-11-23 15:13     ` [ovs-dev] " Marcelo Ricardo Leitner
2022-11-23 17:31       ` Xin Long
2022-11-23 18:48         ` Marcelo Ricardo Leitner
2022-11-23 18:54           ` Xin Long
2022-11-23 19:17             ` Marcelo Ricardo Leitner
2022-11-23 19:55               ` Xin Long
2022-11-23 21:21                 ` Marcelo Ricardo Leitner
2022-11-23 21:34                   ` Xin Long
2022-12-01 21:37                     ` Marcelo Ricardo Leitner
2022-11-24 10:00                   ` Pablo Neira Ayuso
2022-11-23 18:52   ` Marcelo Ricardo Leitner
2022-12-01 16:26     ` Xin Long
2022-11-23 12:39 ` [PATCHv2 net-next 0/5] net: eliminate the duplicate code in the ct nat functions of " Marcelo Ricardo Leitner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.