All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH nf] netfilter: conntrack: remove offload_pickup sysctl again
@ 2021-08-04  9:25 Florian Westphal
  2021-08-04 11:00 ` Oz Shlomo
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2021-08-04  9:25 UTC (permalink / raw)
  To: netfilter-devel
  Cc: Florian Westphal, Oz Shlomo, Paul Blakey, Marcelo Ricardo Leitner

These two sysctls were added because the hardcoded defaults (2 minutes,
tcp, 30 seconds, udp) turned out to be too low for some setups.

They appeared in 5.14-rc1 so it should be fine to remove it again.

Marcelo convinced me that there should be no difference between a flow
that was offloaded vs. a flow that was not wrt. timeout handling.
Thus the default is changed to those for TCP established and UDP stream,
5 days and 120 seconds, respectively.

Marcelo also suggested to account for the timeout value used for the
offloading, this avoids increase beyond the value in the conntrack-sysctl
and will also instantly expire the conntrack entry with altered sysctls.

Example:
   nf_conntrack_udp_timeout_stream=60
   nf_flowtable_udp_timeout=60

This will remove offloaded udp flows after one minute, rather than two.

When flow transitions back from offload to software, also clear the
ASSURED bit -- this allows conntrack to early-expire the entry in case
the table is full.

The existing TCP code sets the ASSURED bit again when a valid ACK is seen
while conntrack is in ESTABLISHED state.  UDP re-sets on next packet.

Cc: Oz Shlomo <ozsh@nvidia.com>
Cc: Paul Blakey <paulb@nvidia.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 Pablo, please wait a bit before applying so that Oz, Paul and Marcelo
 have time to ack/nack this.

 .../networking/nf_conntrack-sysctl.rst          | 10 ----------
 include/net/netns/conntrack.h                   |  2 --
 net/netfilter/nf_conntrack_proto_tcp.c          |  1 -
 net/netfilter/nf_conntrack_proto_udp.c          |  1 -
 net/netfilter/nf_conntrack_standalone.c         | 16 ----------------
 net/netfilter/nf_flow_table_core.c              | 17 ++++++++++++++---
 6 files changed, 14 insertions(+), 33 deletions(-)

diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst
index d31ed6c1cb0d..024d784157c8 100644
--- a/Documentation/networking/nf_conntrack-sysctl.rst
+++ b/Documentation/networking/nf_conntrack-sysctl.rst
@@ -191,19 +191,9 @@ nf_flowtable_tcp_timeout - INTEGER (seconds)
         TCP connections may be offloaded from nf conntrack to nf flow table.
         Once aged, the connection is returned to nf conntrack with tcp pickup timeout.
 
-nf_flowtable_tcp_pickup - INTEGER (seconds)
-        default 120
-
-        TCP connection timeout after being aged from nf flow table offload.
-
 nf_flowtable_udp_timeout - INTEGER (seconds)
         default 30
 
         Control offload timeout for udp connections.
         UDP connections may be offloaded from nf conntrack to nf flow table.
         Once aged, the connection is returned to nf conntrack with udp pickup timeout.
-
-nf_flowtable_udp_pickup - INTEGER (seconds)
-        default 30
-
-        UDP connection timeout after being aged from nf flow table offload.
diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
index 37e5300c7e5a..fefd38db95b3 100644
--- a/include/net/netns/conntrack.h
+++ b/include/net/netns/conntrack.h
@@ -30,7 +30,6 @@ struct nf_tcp_net {
 	u8 tcp_ignore_invalid_rst;
 #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
 	unsigned int offload_timeout;
-	unsigned int offload_pickup;
 #endif
 };
 
@@ -44,7 +43,6 @@ struct nf_udp_net {
 	unsigned int timeouts[UDP_CT_MAX];
 #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
 	unsigned int offload_timeout;
-	unsigned int offload_pickup;
 #endif
 };
 
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 3259416f2ea4..af5115e127cf 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -1478,7 +1478,6 @@ void nf_conntrack_tcp_init_net(struct net *net)
 
 #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
 	tn->offload_timeout = 30 * HZ;
-	tn->offload_pickup = 120 * HZ;
 #endif
 }
 
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index 698fee49e732..f8e3c0d2602f 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -271,7 +271,6 @@ void nf_conntrack_udp_init_net(struct net *net)
 
 #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
 	un->offload_timeout = 30 * HZ;
-	un->offload_pickup = 30 * HZ;
 #endif
 }
 
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index 214d9f9e499b..e84b499b7bfa 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -575,7 +575,6 @@ enum nf_ct_sysctl_index {
 	NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_UNACK,
 #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
 	NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD,
-	NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP,
 #endif
 	NF_SYSCTL_CT_PROTO_TCP_LOOSE,
 	NF_SYSCTL_CT_PROTO_TCP_LIBERAL,
@@ -585,7 +584,6 @@ enum nf_ct_sysctl_index {
 	NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_STREAM,
 #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
 	NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD,
-	NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP,
 #endif
 	NF_SYSCTL_CT_PROTO_TIMEOUT_ICMP,
 	NF_SYSCTL_CT_PROTO_TIMEOUT_ICMPV6,
@@ -776,12 +774,6 @@ static struct ctl_table nf_ct_sysctl_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_jiffies,
 	},
-	[NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP] = {
-		.procname	= "nf_flowtable_tcp_pickup",
-		.maxlen		= sizeof(unsigned int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_jiffies,
-	},
 #endif
 	[NF_SYSCTL_CT_PROTO_TCP_LOOSE] = {
 		.procname	= "nf_conntrack_tcp_loose",
@@ -832,12 +824,6 @@ static struct ctl_table nf_ct_sysctl_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_jiffies,
 	},
-	[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP] = {
-		.procname	= "nf_flowtable_udp_pickup",
-		.maxlen		= sizeof(unsigned int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_jiffies,
-	},
 #endif
 	[NF_SYSCTL_CT_PROTO_TIMEOUT_ICMP] = {
 		.procname	= "nf_conntrack_icmp_timeout",
@@ -1018,7 +1004,6 @@ static void nf_conntrack_standalone_init_tcp_sysctl(struct net *net,
 
 #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
 	table[NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD].data = &tn->offload_timeout;
-	table[NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP].data = &tn->offload_pickup;
 #endif
 
 }
@@ -1111,7 +1096,6 @@ static int nf_conntrack_standalone_init_sysctl(struct net *net)
 	table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_STREAM].data = &un->timeouts[UDP_CT_REPLIED];
 #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
 	table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD].data = &un->offload_timeout;
-	table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP].data = &un->offload_pickup;
 #endif
 
 	nf_conntrack_standalone_init_tcp_sysctl(net, table);
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index ec3dd1c9c8f4..8b23c19e1833 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -182,20 +182,31 @@ static void flow_offload_fixup_ct_timeout(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
 	int l4num = nf_ct_protonum(ct);
-	unsigned int timeout;
+	s32 timeout;
+
+	/* Clear assured to allow early_drop for this entry.
+	 *
+	 * The flag is set again if the connection sees a reply packet.
+	 */
+	clear_bit(IPS_ASSURED_BIT, &ct->status);
 
 	if (l4num == IPPROTO_TCP) {
 		struct nf_tcp_net *tn = nf_tcp_pernet(net);
 
-		timeout = tn->offload_pickup;
+		timeout = tn->timeouts[TCP_CONNTRACK_ESTABLISHED];
+		timeout -= tn->offload_timeout;
 	} else if (l4num == IPPROTO_UDP) {
 		struct nf_udp_net *tn = nf_udp_pernet(net);
 
-		timeout = tn->offload_pickup;
+		timeout = tn->timeouts[UDP_CT_REPLIED];
+		timeout -= tn->offload_timeout;
 	} else {
 		return;
 	}
 
+	if (timeout < 0)
+		timeout = 0;
+
 	if (nf_flow_timeout_delta(ct->timeout) > (__s32)timeout)
 		ct->timeout = nfct_time_stamp + timeout;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH nf] netfilter: conntrack: remove offload_pickup sysctl again
  2021-08-04  9:25 [PATCH nf] netfilter: conntrack: remove offload_pickup sysctl again Florian Westphal
@ 2021-08-04 11:00 ` Oz Shlomo
  2021-08-04 11:19   ` Florian Westphal
  0 siblings, 1 reply; 4+ messages in thread
From: Oz Shlomo @ 2021-08-04 11:00 UTC (permalink / raw)
  To: Florian Westphal, netfilter-devel; +Cc: Paul Blakey, Marcelo Ricardo Leitner

Hi Florian/Marcelo,

On 8/4/2021 12:25 PM, Florian Westphal wrote:
> These two sysctls were added because the hardcoded defaults (2 minutes,
> tcp, 30 seconds, udp) turned out to be too low for some setups.
> 
> They appeared in 5.14-rc1 so it should be fine to remove it again.
> 
> Marcelo convinced me that there should be no difference between a flow
> that was offloaded vs. a flow that was not wrt. timeout handling.
> Thus the default is changed to those for TCP established and UDP stream,
> 5 days and 120 seconds, respectively.
> 
> Marcelo also suggested to account for the timeout value used for the
> offloading, this avoids increase beyond the value in the conntrack-sysctl
> and will also instantly expire the conntrack entry with altered sysctls.
> 
> Example:
>     nf_conntrack_udp_timeout_stream=60
>     nf_flowtable_udp_timeout=60
> 
> This will remove offloaded udp flows after one minute, rather than two.

This part of the patch looks good to me.

> 
> When flow transitions back from offload to software, also clear the
> ASSURED bit -- this allows conntrack to early-expire the entry in case
> the table is full.

Doesn't this introduce a discrpency between offloaded and non-offload connections?
IIUC, offloaded connections might timeout earlier after they are picked up by the software when the 
conntrack table is full.
However, if the same tcp connection was not offloaded it would timeout after 5 days.

> 
> The existing TCP code sets the ASSURED bit again when a valid ACK is seen
> while conntrack is in ESTABLISHED state.  UDP re-sets on next packet.
> 
> Cc: Oz Shlomo <ozsh@nvidia.com>
> Cc: Paul Blakey <paulb@nvidia.com>
> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>   Pablo, please wait a bit before applying so that Oz, Paul and Marcelo
>   have time to ack/nack this.
> 
>   .../networking/nf_conntrack-sysctl.rst          | 10 ----------
>   include/net/netns/conntrack.h                   |  2 --
>   net/netfilter/nf_conntrack_proto_tcp.c          |  1 -
>   net/netfilter/nf_conntrack_proto_udp.c          |  1 -
>   net/netfilter/nf_conntrack_standalone.c         | 16 ----------------
>   net/netfilter/nf_flow_table_core.c              | 17 ++++++++++++++---
>   6 files changed, 14 insertions(+), 33 deletions(-)
> 
> diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst
> index d31ed6c1cb0d..024d784157c8 100644
> --- a/Documentation/networking/nf_conntrack-sysctl.rst
> +++ b/Documentation/networking/nf_conntrack-sysctl.rst
> @@ -191,19 +191,9 @@ nf_flowtable_tcp_timeout - INTEGER (seconds)
>           TCP connections may be offloaded from nf conntrack to nf flow table.
>           Once aged, the connection is returned to nf conntrack with tcp pickup timeout.
>   
> -nf_flowtable_tcp_pickup - INTEGER (seconds)
> -        default 120
> -
> -        TCP connection timeout after being aged from nf flow table offload.
> -
>   nf_flowtable_udp_timeout - INTEGER (seconds)
>           default 30
>   
>           Control offload timeout for udp connections.
>           UDP connections may be offloaded from nf conntrack to nf flow table.
>           Once aged, the connection is returned to nf conntrack with udp pickup timeout.
> -
> -nf_flowtable_udp_pickup - INTEGER (seconds)
> -        default 30
> -
> -        UDP connection timeout after being aged from nf flow table offload.
> diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
> index 37e5300c7e5a..fefd38db95b3 100644
> --- a/include/net/netns/conntrack.h
> +++ b/include/net/netns/conntrack.h
> @@ -30,7 +30,6 @@ struct nf_tcp_net {
>   	u8 tcp_ignore_invalid_rst;
>   #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
>   	unsigned int offload_timeout;
> -	unsigned int offload_pickup;
>   #endif
>   };
>   
> @@ -44,7 +43,6 @@ struct nf_udp_net {
>   	unsigned int timeouts[UDP_CT_MAX];
>   #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
>   	unsigned int offload_timeout;
> -	unsigned int offload_pickup;
>   #endif
>   };
>   
> diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> index 3259416f2ea4..af5115e127cf 100644
> --- a/net/netfilter/nf_conntrack_proto_tcp.c
> +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> @@ -1478,7 +1478,6 @@ void nf_conntrack_tcp_init_net(struct net *net)
>   
>   #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
>   	tn->offload_timeout = 30 * HZ;
> -	tn->offload_pickup = 120 * HZ;
>   #endif
>   }
>   
> diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
> index 698fee49e732..f8e3c0d2602f 100644
> --- a/net/netfilter/nf_conntrack_proto_udp.c
> +++ b/net/netfilter/nf_conntrack_proto_udp.c
> @@ -271,7 +271,6 @@ void nf_conntrack_udp_init_net(struct net *net)
>   
>   #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
>   	un->offload_timeout = 30 * HZ;
> -	un->offload_pickup = 30 * HZ;
>   #endif
>   }
>   
> diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
> index 214d9f9e499b..e84b499b7bfa 100644
> --- a/net/netfilter/nf_conntrack_standalone.c
> +++ b/net/netfilter/nf_conntrack_standalone.c
> @@ -575,7 +575,6 @@ enum nf_ct_sysctl_index {
>   	NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_UNACK,
>   #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
>   	NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD,
> -	NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP,
>   #endif
>   	NF_SYSCTL_CT_PROTO_TCP_LOOSE,
>   	NF_SYSCTL_CT_PROTO_TCP_LIBERAL,
> @@ -585,7 +584,6 @@ enum nf_ct_sysctl_index {
>   	NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_STREAM,
>   #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
>   	NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD,
> -	NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP,
>   #endif
>   	NF_SYSCTL_CT_PROTO_TIMEOUT_ICMP,
>   	NF_SYSCTL_CT_PROTO_TIMEOUT_ICMPV6,
> @@ -776,12 +774,6 @@ static struct ctl_table nf_ct_sysctl_table[] = {
>   		.mode		= 0644,
>   		.proc_handler	= proc_dointvec_jiffies,
>   	},
> -	[NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP] = {
> -		.procname	= "nf_flowtable_tcp_pickup",
> -		.maxlen		= sizeof(unsigned int),
> -		.mode		= 0644,
> -		.proc_handler	= proc_dointvec_jiffies,
> -	},
>   #endif
>   	[NF_SYSCTL_CT_PROTO_TCP_LOOSE] = {
>   		.procname	= "nf_conntrack_tcp_loose",
> @@ -832,12 +824,6 @@ static struct ctl_table nf_ct_sysctl_table[] = {
>   		.mode		= 0644,
>   		.proc_handler	= proc_dointvec_jiffies,
>   	},
> -	[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP] = {
> -		.procname	= "nf_flowtable_udp_pickup",
> -		.maxlen		= sizeof(unsigned int),
> -		.mode		= 0644,
> -		.proc_handler	= proc_dointvec_jiffies,
> -	},
>   #endif
>   	[NF_SYSCTL_CT_PROTO_TIMEOUT_ICMP] = {
>   		.procname	= "nf_conntrack_icmp_timeout",
> @@ -1018,7 +1004,6 @@ static void nf_conntrack_standalone_init_tcp_sysctl(struct net *net,
>   
>   #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
>   	table[NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD].data = &tn->offload_timeout;
> -	table[NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP].data = &tn->offload_pickup;
>   #endif
>   
>   }
> @@ -1111,7 +1096,6 @@ static int nf_conntrack_standalone_init_sysctl(struct net *net)
>   	table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_STREAM].data = &un->timeouts[UDP_CT_REPLIED];
>   #if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
>   	table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD].data = &un->offload_timeout;
> -	table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP].data = &un->offload_pickup;
>   #endif
>   
>   	nf_conntrack_standalone_init_tcp_sysctl(net, table);
> diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
> index ec3dd1c9c8f4..8b23c19e1833 100644
> --- a/net/netfilter/nf_flow_table_core.c
> +++ b/net/netfilter/nf_flow_table_core.c
> @@ -182,20 +182,31 @@ static void flow_offload_fixup_ct_timeout(struct nf_conn *ct)
>   {
>   	struct net *net = nf_ct_net(ct);
>   	int l4num = nf_ct_protonum(ct);
> -	unsigned int timeout;
> +	s32 timeout;
> +
> +	/* Clear assured to allow early_drop for this entry.
> +	 *
> +	 * The flag is set again if the connection sees a reply packet.
> +	 */
> +	clear_bit(IPS_ASSURED_BIT, &ct->status);
>   
>   	if (l4num == IPPROTO_TCP) {
>   		struct nf_tcp_net *tn = nf_tcp_pernet(net);
>   
> -		timeout = tn->offload_pickup;
> +		timeout = tn->timeouts[TCP_CONNTRACK_ESTABLISHED];
> +		timeout -= tn->offload_timeout;
>   	} else if (l4num == IPPROTO_UDP) {
>   		struct nf_udp_net *tn = nf_udp_pernet(net);
>   
> -		timeout = tn->offload_pickup;
> +		timeout = tn->timeouts[UDP_CT_REPLIED];
> +		timeout -= tn->offload_timeout;
>   	} else {
>   		return;
>   	}
>   
> +	if (timeout < 0)
> +		timeout = 0;
> +
>   	if (nf_flow_timeout_delta(ct->timeout) > (__s32)timeout)
>   		ct->timeout = nfct_time_stamp + timeout;
>   }
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH nf] netfilter: conntrack: remove offload_pickup sysctl again
  2021-08-04 11:00 ` Oz Shlomo
@ 2021-08-04 11:19   ` Florian Westphal
  2021-08-04 11:54     ` Oz Shlomo
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2021-08-04 11:19 UTC (permalink / raw)
  To: Oz Shlomo
  Cc: Florian Westphal, netfilter-devel, Paul Blakey, Marcelo Ricardo Leitner

Oz Shlomo <ozsh@nvidia.com> wrote:
> > When flow transitions back from offload to software, also clear the
> > ASSURED bit -- this allows conntrack to early-expire the entry in case
> > the table is full.
> 
> Doesn't this introduce a discrpency between offloaded and non-offload connections?
> IIUC, offloaded connections might timeout earlier after they are picked up
> by the software when the conntrack table is full.

Yes, if no packet was seen after the flow got moved back to software and
a new connection request is made while table is full.

> However, if the same tcp connection was not offloaded it would timeout after 5 days.

Yes.  The problem is that AFAIU HW may move flow back to SW path after
it saw e.g. FIN bit, or after one side went silent (i.e., unacked data).

And and in that case, SW path has a lot smaller timeout than the 5day
established value.

AFAICS there is no way to detect this on generic side and it might even
be different depending on hw/driver?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH nf] netfilter: conntrack: remove offload_pickup sysctl again
  2021-08-04 11:19   ` Florian Westphal
@ 2021-08-04 11:54     ` Oz Shlomo
  0 siblings, 0 replies; 4+ messages in thread
From: Oz Shlomo @ 2021-08-04 11:54 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, Paul Blakey, Marcelo Ricardo Leitner

Hi Florian,

On 8/4/2021 2:19 PM, Florian Westphal wrote:
> Oz Shlomo <ozsh@nvidia.com> wrote:
>>> When flow transitions back from offload to software, also clear the
>>> ASSURED bit -- this allows conntrack to early-expire the entry in case
>>> the table is full.
>>
>> Doesn't this introduce a discrpency between offloaded and non-offload connections?
>> IIUC, offloaded connections might timeout earlier after they are picked up
>> by the software when the conntrack table is full.
> 
> Yes, if no packet was seen after the flow got moved back to software and
> a new connection request is made while table is full.
> 

Then perhaps it is better not to clear the ASSURED bit.
What do you think?

>> However, if the same tcp connection was not offloaded it would timeout after 5 days.
> 
> Yes.  The problem is that AFAIU HW may move flow back to SW path after
> it saw e.g. FIN bit, or after one side went silent (i.e., unacked data).
> 
> And and in that case, SW path has a lot smaller timeout than the 5day
> established value.
> 
> AFAICS there is no way to detect this on generic side and it might even
> be different depending on hw/driver?
> 

Actually, the hardware sends all packets with a set FIN flags to sw.
When act_ct processes a FIN packet it sets the teardown flag for the offloaded connection and 
continues to process the packet through nf conntrack.
Therefore, the connection timeout interval will be updated by nf conntrack.

Connections that are aged in hardware are expected to be in the established state.
Therefore, the pickup time should align with the nf conntrack settings.





^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-08-04 11:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-04  9:25 [PATCH nf] netfilter: conntrack: remove offload_pickup sysctl again Florian Westphal
2021-08-04 11:00 ` Oz Shlomo
2021-08-04 11:19   ` Florian Westphal
2021-08-04 11:54     ` Oz Shlomo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.