[PATCH net] netfilter: flowtable: Add pending bit for offload work

netfilter-devel.vger.kernel.org archive mirror

* [PATCH net] netfilter: flowtable: Add pending bit for offload work
@ 2020-05-06 11:24 Paul Blakey
  2020-05-10 22:14 ` Pablo Neira Ayuso
  2020-05-11 14:37 ` Pablo Neira Ayuso
  0 siblings, 2 replies; 7+ messages in thread
From: Paul Blakey @ 2020-05-06 11:24 UTC (permalink / raw)
  To: Paul Blakey, Oz Shlomo, Pablo Neira Ayuso, Roi Dayan, netdev,
	Saeed Mahameed
  Cc: netfilter-devel

The gc step can queue either a delete work item or a stats work item
for an offloaded flow. These work items can race with each other, so a
flow could be freed before the stats work has run and queried it.
To avoid that, add a pending bit: while a work item exists for a flow,
don't queue another one for it.
This also avoids queuing multiple stats work items when a stats work
has not completed before the gc step runs again.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
---
 include/net/netfilter/nf_flow_table.h | 1 +
 net/netfilter/nf_flow_table_offload.c | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 6bf6965..c54a7f7 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -127,6 +127,7 @@ enum nf_flow_flags {
 	NF_FLOW_HW_DYING,
 	NF_FLOW_HW_DEAD,
 	NF_FLOW_HW_REFRESH,
+	NF_FLOW_HW_PENDING,
 };
 
 enum flow_offload_type {
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index b9d5ecc..731d738 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -817,6 +817,7 @@ static void flow_offload_work_handler(struct work_struct *work)
 			WARN_ON_ONCE(1);
 	}
 
+	clear_bit(NF_FLOW_HW_PENDING, &offload->flow->flags);
 	kfree(offload);
 }
 
@@ -831,9 +832,14 @@ static void flow_offload_queue_work(struct flow_offload_work *offload)
 {
 	struct flow_offload_work *offload;
 
+	if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags))
+		return NULL;
+
 	offload = kmalloc(sizeof(struct flow_offload_work), GFP_ATOMIC);
-	if (!offload)
+	if (!offload) {
+		clear_bit(NF_FLOW_HW_PENDING, &flow->flags);
 		return NULL;
+	}
 
 	offload->cmd = cmd;
 	offload->flow = flow;
-- 
1.8.3.1



* Re: [PATCH net] netfilter: flowtable: Add pending bit for offload work
  2020-05-06 11:24 [PATCH net] netfilter: flowtable: Add pending bit for offload work Paul Blakey
@ 2020-05-10 22:14 ` Pablo Neira Ayuso
  2020-05-11  8:32   ` Paul Blakey
  2020-05-11 14:37 ` Pablo Neira Ayuso
  1 sibling, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-10 22:14 UTC (permalink / raw)
  To: Paul Blakey; +Cc: Oz Shlomo, Roi Dayan, netdev, Saeed Mahameed, netfilter-devel

Hi,

On Wed, May 06, 2020 at 02:24:39PM +0300, Paul Blakey wrote:
> Gc step can queue offloaded flow del work or stats work.
> Those work items can race each other and a flow could be freed
> before the stats work is executed and querying it.
> To avoid that, add a pending bit that if a work exists for a flow
> don't queue another work for it.
> This will also avoid adding multiple stats works in case stats work
> didn't complete but gc step started again.

This is happening since the mutex has been removed, right?

Another question below.

> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> Reviewed-by: Roi Dayan <roid@mellanox.com>
> ---
>  include/net/netfilter/nf_flow_table.h | 1 +
>  net/netfilter/nf_flow_table_offload.c | 8 +++++++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
> index 6bf6965..c54a7f7 100644
> --- a/include/net/netfilter/nf_flow_table.h
> +++ b/include/net/netfilter/nf_flow_table.h
> @@ -127,6 +127,7 @@ enum nf_flow_flags {
>  	NF_FLOW_HW_DYING,
>  	NF_FLOW_HW_DEAD,
>  	NF_FLOW_HW_REFRESH,
> +	NF_FLOW_HW_PENDING,
>  };
>  
>  enum flow_offload_type {
> diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
> index b9d5ecc..731d738 100644
> --- a/net/netfilter/nf_flow_table_offload.c
> +++ b/net/netfilter/nf_flow_table_offload.c
> @@ -817,6 +817,7 @@ static void flow_offload_work_handler(struct work_struct *work)
>  			WARN_ON_ONCE(1);
>  	}
>  
> +	clear_bit(NF_FLOW_HW_PENDING, &offload->flow->flags);
>  	kfree(offload);
>  }
>  
> @@ -831,9 +832,14 @@ static void flow_offload_queue_work(struct flow_offload_work *offload)
>  {
>  	struct flow_offload_work *offload;
>  
> +	if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags))
> +		return NULL;

In case of stats, it's fine to lose work.

But how does this work for the deletion case? Does this fall back to
the timeout deletion?

Thanks.


* Re: [PATCH net] netfilter: flowtable: Add pending bit for offload work
  2020-05-10 22:14 ` Pablo Neira Ayuso
@ 2020-05-11  8:32   ` Paul Blakey
  2020-05-11 11:59     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Blakey @ 2020-05-11  8:32 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Oz Shlomo, Roi Dayan, netdev, Saeed Mahameed, netfilter-devel



On 5/11/2020 1:14 AM, Pablo Neira Ayuso wrote:
> Hi,
>
> On Wed, May 06, 2020 at 02:24:39PM +0300, Paul Blakey wrote:
>> Gc step can queue offloaded flow del work or stats work.
>> Those work items can race each other and a flow could be freed
>> before the stats work is executed and querying it.
>> To avoid that, add a pending bit that if a work exists for a flow
>> don't queue another work for it.
>> This will also avoid adding multiple stats works in case stats work
>> didn't complete but gc step started again.
> This is happening since the mutex has been removed, right?
>
> Another question below.
It comes from using a new workqueue with one work per item, which allows
a flow's work items to run in parallel.


>
>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>> Reviewed-by: Roi Dayan <roid@mellanox.com>
>> ---
>>  include/net/netfilter/nf_flow_table.h | 1 +
>>  net/netfilter/nf_flow_table_offload.c | 8 +++++++-
>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
>> index 6bf6965..c54a7f7 100644
>> --- a/include/net/netfilter/nf_flow_table.h
>> +++ b/include/net/netfilter/nf_flow_table.h
>> @@ -127,6 +127,7 @@ enum nf_flow_flags {
>>  	NF_FLOW_HW_DYING,
>>  	NF_FLOW_HW_DEAD,
>>  	NF_FLOW_HW_REFRESH,
>> +	NF_FLOW_HW_PENDING,
>>  };
>>  
>>  enum flow_offload_type {
>> diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
>> index b9d5ecc..731d738 100644
>> --- a/net/netfilter/nf_flow_table_offload.c
>> +++ b/net/netfilter/nf_flow_table_offload.c
>> @@ -817,6 +817,7 @@ static void flow_offload_work_handler(struct work_struct *work)
>>  			WARN_ON_ONCE(1);
>>  	}
>>  
>> +	clear_bit(NF_FLOW_HW_PENDING, &offload->flow->flags);
>>  	kfree(offload);
>>  }
>>  
>> @@ -831,9 +832,14 @@ static void flow_offload_queue_work(struct flow_offload_work *offload)
>>  {
>>  	struct flow_offload_work *offload;
>>  
>> +	if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags))
>> +		return NULL;
> In case of stats, it's fine to lose work.
>
> But how does this work for the deletion case? Does this falls back to
> the timeout deletion?

We get to nf_flow_offload_del (delete) in these cases:

	if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) ||
	    test_bit(NF_FLOW_TEARDOWN, &flow->flags)) {
		....
		nf_flow_offload_del(flow_table, flow);

All of these conditions are persistent once set, except
nf_flow_has_expired(flow). So we will retry the delete again and again
until the pending flag is cleared, or until the flow is 'saved' by the
already queued stats work updating the timeout.
A pending stats update can't save the flow once it is marked for teardown
or flow->ct is dying; it can only delay the deletion.

We didn't mention flush, as in table free. I guess we need to flush the
hardware workqueue of any pending stats work, then queue the deletion, and
flush again, by adding nf_flow_table_offload_flush(flow_table) after
cancel_delayed_work_sync(&flow_table->gc_work);



>
> Thanks.



* Re: [PATCH net] netfilter: flowtable: Add pending bit for offload work
  2020-05-11  8:32   ` Paul Blakey
@ 2020-05-11 11:59     ` Pablo Neira Ayuso
  2020-05-11 13:57       ` Roi Dayan
  2020-05-11 13:58       ` Roi Dayan
  0 siblings, 2 replies; 7+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-11 11:59 UTC (permalink / raw)
  To: Paul Blakey; +Cc: Oz Shlomo, Roi Dayan, netdev, Saeed Mahameed, netfilter-devel

On Mon, May 11, 2020 at 11:32:36AM +0300, Paul Blakey wrote:
> On 5/11/2020 1:14 AM, Pablo Neira Ayuso wrote:
[...]
> >> @@ -831,9 +832,14 @@ static void flow_offload_queue_work(struct flow_offload_work *offload)
> >>  {
> >>  	struct flow_offload_work *offload;
> >>  
> >> +	if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags))
> >> +		return NULL;
> > In case of stats, it's fine to lose work.
> >
> > But how does this work for the deletion case? Does this falls back to
> > the timeout deletion?
> 
> We get to nf_flow_table_offload_del (delete) in these cases:
> 
> >-------if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) ||
> >-------    test_bit(NF_FLOW_TEARDOWN, &flow->flags) {
> >------->-------   ....
> >------->-------    nf_flow_offload_del(flow_table, flow);
> 
> Which are all persistent once set but the nf_flow_has_expired(flow). So we will
> try the delete
> again and again till pending flag is unset or the flow is 'saved' by the already
> queued stats updating the timeout.
> A pending stats update can't save the flow once it's marked for teardown or
> (flow->ct is dying), only delay it.

Thanks for explaining.

> We didn't mention flush, like in table free. I guess we need to flush the
> hardware workqueue
> of any pending stats work, then queue the deletion, and flush again:
> Adding nf_flow_table_offload_flush(flow_table), after
> cancel_delayed_work_sync(&flow_table->gc_work);

The "flush" makes sure that stats work runs before the deletion, to
ensure no races happen for in-transit work objects, right?

We might use alloc_ordered_workqueue() and let the workqueue handle
this problem?


* Re: [PATCH net] netfilter: flowtable: Add pending bit for offload work
  2020-05-11 11:59     ` Pablo Neira Ayuso
@ 2020-05-11 13:57       ` Roi Dayan
  2020-05-11 13:58       ` Roi Dayan
  1 sibling, 0 replies; 7+ messages in thread
From: Roi Dayan @ 2020-05-11 13:57 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Paul Blakey
  Cc: Oz Shlomo, netdev, Saeed Mahameed, netfilter-devel



On 2020-05-11 2:59 PM, Pablo Neira Ayuso wrote:
> On Mon, May 11, 2020 at 11:32:36AM +0300, Paul Blakey wrote:
>> On 5/11/2020 1:14 AM, Pablo Neira Ayuso wrote:
> [...]
>>>> @@ -831,9 +832,14 @@ static void flow_offload_queue_work(struct flow_offload_work *offload)
>>>>  {
>>>>  	struct flow_offload_work *offload;
>>>>  
>>>> +	if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags))
>>>> +		return NULL;
>>> In case of stats, it's fine to lose work.
>>>
>>> But how does this work for the deletion case? Does this falls back to
>>> the timeout deletion?
>>
>> We get to nf_flow_table_offload_del (delete) in these cases:
>>
>>> -------if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) ||
>>> -------    test_bit(NF_FLOW_TEARDOWN, &flow->flags) {
>>> ------->-------   ....
>>> ------->-------    nf_flow_offload_del(flow_table, flow);
>>
>> Which are all persistent once set but the nf_flow_has_expired(flow). So we will
>> try the delete
>> again and again till pending flag is unset or the flow is 'saved' by the already
>> queued stats updating the timeout.
>> A pending stats update can't save the flow once it's marked for teardown or
>> (flow->ct is dying), only delay it.
> 
> Thanks for explaining.
> 
>> We didn't mention flush, like in table free. I guess we need to flush the
>> hardware workqueue
>> of any pending stats work, then queue the deletion, and flush again:
>> Adding nf_flow_table_offload_flush(flow_table), after
>> cancel_delayed_work_sync(&flow_table->gc_work);
> 
> The "flush" makes sure that stats work runs before the deletion, to
> ensure no races happen for in-transit work objects, right?
> 
> We might use alloc_ordered_workqueue() and let the workqueue handle
> this problem?
> 

An ordered workqueue executes one work item at a time.


* Re: [PATCH net] netfilter: flowtable: Add pending bit for offload work
  2020-05-11 11:59     ` Pablo Neira Ayuso
  2020-05-11 13:57       ` Roi Dayan
@ 2020-05-11 13:58       ` Roi Dayan
  1 sibling, 0 replies; 7+ messages in thread
From: Roi Dayan @ 2020-05-11 13:58 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Paul Blakey
  Cc: Oz Shlomo, netdev, Saeed Mahameed, netfilter-devel



On 2020-05-11 2:59 PM, Pablo Neira Ayuso wrote:
> On Mon, May 11, 2020 at 11:32:36AM +0300, Paul Blakey wrote:
>> On 5/11/2020 1:14 AM, Pablo Neira Ayuso wrote:
> [...]
>>>> @@ -831,9 +832,14 @@ static void flow_offload_queue_work(struct flow_offload_work *offload)
>>>>  {
>>>>  	struct flow_offload_work *offload;
>>>>  
>>>> +	if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags))
>>>> +		return NULL;
>>> In case of stats, it's fine to lose work.
>>>
>>> But how does this work for the deletion case? Does this falls back to
>>> the timeout deletion?
>>
>> We get to nf_flow_table_offload_del (delete) in these cases:
>>
>>> -------if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) ||
>>> -------    test_bit(NF_FLOW_TEARDOWN, &flow->flags) {
>>> ------->-------   ....
>>> ------->-------    nf_flow_offload_del(flow_table, flow);
>>
>> Which are all persistent once set but the nf_flow_has_expired(flow). So we will
>> try the delete
>> again and again till pending flag is unset or the flow is 'saved' by the already
>> queued stats updating the timeout.
>> A pending stats update can't save the flow once it's marked for teardown or
>> (flow->ct is dying), only delay it.
> 
> Thanks for explaining.
> 
>> We didn't mention flush, like in table free. I guess we need to flush the
>> hardware workqueue
>> of any pending stats work, then queue the deletion, and flush again:
>> Adding nf_flow_table_offload_flush(flow_table), after
>> cancel_delayed_work_sync(&flow_table->gc_work);
> 
> The "flush" makes sure that stats work runs before the deletion, to
> ensure no races happen for in-transit work objects, right?
> 
> We might use alloc_ordered_workqueue() and let the workqueue handle
> this problem?
> 

An ordered workqueue executes one work item at a time.


* Re: [PATCH net] netfilter: flowtable: Add pending bit for offload work
  2020-05-06 11:24 [PATCH net] netfilter: flowtable: Add pending bit for offload work Paul Blakey
  2020-05-10 22:14 ` Pablo Neira Ayuso
@ 2020-05-11 14:37 ` Pablo Neira Ayuso
  1 sibling, 0 replies; 7+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-11 14:37 UTC (permalink / raw)
  To: Paul Blakey; +Cc: Oz Shlomo, Roi Dayan, netdev, Saeed Mahameed, netfilter-devel

On Wed, May 06, 2020 at 02:24:39PM +0300, Paul Blakey wrote:
> Gc step can queue offloaded flow del work or stats work.
> Those work items can race each other and a flow could be freed
> before the stats work is executed and querying it.
> To avoid that, add a pending bit that if a work exists for a flow
> don't queue another work for it.
> This will also avoid adding multiple stats works in case stats work
> didn't complete but gc step started again.

Applied to nf, thanks.


