* [PATCH v3 net 1/1] net sched actions: fix GETing actions
@ 2016-09-12 23:07 Jamal Hadi Salim
  2016-09-13 16:20 ` Cong Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jamal Hadi Salim @ 2016-09-12 23:07 UTC (permalink / raw)
  To: davem; +Cc: netdev, xiyou.wangcong, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

With the batch changes that translated transient actions into
a temporary list, lost in the translation was the fact that
tcf_action_destroy() will eventually delete the action from
the permanent location if the refcount is zero.

Example of what broke:
...add a gact action to drop
sudo $TC actions add action drop index 10
...now retrieve it, looks good
sudo $TC actions get action gact index 10
...retrieve it again and find it is gone!
sudo $TC actions get action gact index 10

Fixes:
commit 22dc13c837c3 ("net_sched: convert tcf_exts from list to pointer array"),
commit 824a7e8863b3 ("net_sched: remove an unnecessary list_del()")
commit f07fed82ad79 ("net_sched: remove the leftover cleanup_a()")

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/sched/act_api.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index d09d068..50720b1 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -592,6 +592,16 @@ err_out:
 	return ERR_PTR(err);
 }
 
+static void cleanup_a(struct list_head *actions, int ovr)
+{
+	struct tc_action *a;
+
+	list_for_each_entry(a, actions, list) {
+		if (ovr)
+			a->tcfa_refcnt -= 1;
+	}
+}
+
 int tcf_action_init(struct net *net, struct nlattr *nla,
 				  struct nlattr *est, char *name, int ovr,
 				  int bind, struct list_head *actions)
@@ -612,8 +622,15 @@ int tcf_action_init(struct net *net, struct nlattr *nla,
 			goto err;
 		}
 		act->order = i;
+		if (ovr)
+			act->tcfa_refcnt += 1;
 		list_add_tail(&act->list, actions);
 	}
+
+	/* Remove the temp refcnt which was necessary to protect against
+	 * destroying an existing action which was being replaced
+	 */
+	cleanup_a(actions, ovr);
 	return 0;
 
 err:
@@ -883,6 +900,8 @@ tca_action_gd(struct net *net, struct nlattr *nla, struct nlmsghdr *n,
 			goto err;
 		}
 		act->order = i;
+		if (event == RTM_GETACTION)
+			act->tcfa_refcnt += 1;
 		list_add_tail(&act->list, &actions);
 	}
 
-- 
1.9.1
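
The GET regression described in the commit message can be illustrated with
a small user-space model. This is only a sketch under stated assumptions:
struct toy_action, toy_destroy(), toy_get() and toy_survives_gets() are
hypothetical names, not kernel code, and the model assumes an action
created by "tc actions add" starts with a refcount of 1 and that the GET
path ends by destroying its temporary list.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct tc_action; only the refcount matters. */
struct toy_action {
	int refcnt;
	bool present;	/* true while the action sits in its permanent location */
};

/* Models the behaviour the commit message attributes to
 * tcf_action_destroy(): drop one reference and delete the action
 * from the permanent location once the count reaches zero.
 */
static void toy_destroy(struct toy_action *a)
{
	if (--a->refcnt <= 0)
		a->present = false;
}

/* Models one GET: optionally take a temporary reference (what the patch
 * adds for RTM_GETACTION), dump the action, then destroy the temp list.
 */
static void toy_get(struct toy_action *a, bool hold_ref)
{
	if (hold_ref)
		a->refcnt++;
	toy_destroy(a);
}

/* Returns true if the action is still present after 'gets' GET requests. */
bool toy_survives_gets(bool hold_ref, int gets)
{
	struct toy_action a = { .refcnt = 1, .present = true };
	int i;

	for (i = 0; i < gets; i++)
		toy_get(&a, hold_ref);
	return a.present;
}
```

In this model, toy_survives_gets(false, 1) is false (the first GET already
drops the only reference), while toy_survives_gets(true, 2) is true because
each GET takes and releases its own reference.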


* Re: [PATCH v3 net 1/1] net sched actions: fix GETing actions
  2016-09-12 23:07 [PATCH v3 net 1/1] net sched actions: fix GETing actions Jamal Hadi Salim
@ 2016-09-13 16:20 ` Cong Wang
  2016-09-13 19:47   ` Jamal Hadi Salim
  2016-09-15 12:15 ` Sergei Shtylyov
  2016-09-15 23:33 ` David Miller
  2 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2016-09-13 16:20 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: David Miller, Linux Kernel Network Developers

On Mon, Sep 12, 2016 at 4:07 PM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> From: Jamal Hadi Salim <jhs@mojatatu.com>
>
> With the batch changes that translated transient actions into
> a temporary list, lost in the translation was the fact that
> tcf_action_destroy() will eventually delete the action from
> the permanent location if the refcount is zero.
>
> Example of what broke:
> ...add a gact action to drop
> sudo $TC actions add action drop index 10
> ...now retrieve it, looks good
> sudo $TC actions get action gact index 10
> ...retrieve it again and find it is gone!
> sudo $TC actions get action gact index 10
>
> Fixes:
> commit 22dc13c837c3 ("net_sched: convert tcf_exts from list to pointer array"),
> commit 824a7e8863b3 ("net_sched: remove an unnecessary list_del()")
> commit f07fed82ad79 ("net_sched: remove the leftover cleanup_a()")
>
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
> ---
>  net/sched/act_api.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index d09d068..50720b1 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -592,6 +592,16 @@ err_out:
>         return ERR_PTR(err);
>  }
>
> +static void cleanup_a(struct list_head *actions, int ovr)
> +{
> +       struct tc_action *a;
> +
> +       list_for_each_entry(a, actions, list) {
> +               if (ovr)
> +                       a->tcfa_refcnt -= 1;
> +       }
> +}
> +
>  int tcf_action_init(struct net *net, struct nlattr *nla,
>                                   struct nlattr *est, char *name, int ovr,
>                                   int bind, struct list_head *actions)
> @@ -612,8 +622,15 @@ int tcf_action_init(struct net *net, struct nlattr *nla,
>                         goto err;
>                 }
>                 act->order = i;
> +               if (ovr)
> +                       act->tcfa_refcnt += 1;
>                 list_add_tail(&act->list, actions);
>         }
> +
> +       /* Remove the temp refcnt which was necessary to protect against
> +        * destroying an existing action which was being replaced
> +        */
> +       cleanup_a(actions, ovr);
>         return 0;

I am still trying to understand this piece: so here you hold the refcnt
for the same action used by a later iteration? Otherwise there is
almost no user in between hold and release...

The comment you added is not clear to me; we use RTNL/RCU to
sync destroy and replace, so how could that happen?

Thanks.


* Re: [PATCH v3 net 1/1] net sched actions: fix GETing actions
  2016-09-13 16:20 ` Cong Wang
@ 2016-09-13 19:47   ` Jamal Hadi Salim
  2016-09-14 11:33     ` Jamal Hadi Salim
  0 siblings, 1 reply; 7+ messages in thread
From: Jamal Hadi Salim @ 2016-09-13 19:47 UTC (permalink / raw)
  To: Cong Wang; +Cc: David Miller, Linux Kernel Network Developers

On 16-09-13 12:20 PM, Cong Wang wrote:
> On Mon, Sep 12, 2016 at 4:07 PM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>>
>> With the batch changes that translated transient actions into
>> a temporary list, lost in the translation was the fact that
>> tcf_action_destroy() will eventually delete the action from
>> the permanent location if the refcount is zero.
>>
>> Example of what broke:
>> ...add a gact action to drop
>> sudo $TC actions add action drop index 10
>> ...now retrieve it, looks good
>> sudo $TC actions get action gact index 10
>> ...retrieve it again and find it is gone!
>> sudo $TC actions get action gact index 10
>>
>> Fixes:
>> commit 22dc13c837c3 ("net_sched: convert tcf_exts from list to pointer array"),
>> commit 824a7e8863b3 ("net_sched: remove an unnecessary list_del()")
>> commit f07fed82ad79 ("net_sched: remove the leftover cleanup_a()")
>>
>> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>> ---
>>  net/sched/act_api.c | 19 +++++++++++++++++++
>>  1 file changed, 19 insertions(+)
>>
>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>> index d09d068..50720b1 100644
>> --- a/net/sched/act_api.c
>> +++ b/net/sched/act_api.c
>> @@ -592,6 +592,16 @@ err_out:
>>         return ERR_PTR(err);
>>  }
>>
>> +static void cleanup_a(struct list_head *actions, int ovr)
>> +{
>> +       struct tc_action *a;
>> +
>> +       list_for_each_entry(a, actions, list) {
>> +               if (ovr)
>> +                       a->tcfa_refcnt -= 1;
>> +       }
>> +}
>> +
>>  int tcf_action_init(struct net *net, struct nlattr *nla,
>>                                   struct nlattr *est, char *name, int ovr,
>>                                   int bind, struct list_head *actions)
>> @@ -612,8 +622,15 @@ int tcf_action_init(struct net *net, struct nlattr *nla,
>>                         goto err;
>>                 }
>>                 act->order = i;
>> +               if (ovr)
>> +                       act->tcfa_refcnt += 1;
>>                 list_add_tail(&act->list, actions);
>>         }
>> +
>> +       /* Remove the temp refcnt which was necessary to protect against
>> +        * destroying an existing action which was being replaced
>> +        */
>> +       cleanup_a(actions, ovr);
>>         return 0;
>
> I am still trying to understand this piece: so here you hold the refcnt
> for the same action used by a later iteration? Otherwise there is
> almost no user in between hold and release...
>
> The comment you added is not clear to me; we use RTNL/RCU to
> sync destroy and replace, so how could that happen?
>

I was worried about destroy() being called when that function hits an
error. If an action already existed and all we asked for was to
replace some attribute, it would be deleted. That was the way the code
was before your changes, so I just restored it to its original form.

cheers,
jamal


* Re: [PATCH v3 net 1/1] net sched actions: fix GETing actions
  2016-09-13 19:47   ` Jamal Hadi Salim
@ 2016-09-14 11:33     ` Jamal Hadi Salim
  2016-09-14 16:30       ` Cong Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Jamal Hadi Salim @ 2016-09-14 11:33 UTC (permalink / raw)
  To: Cong Wang; +Cc: David Miller, Linux Kernel Network Developers

On 16-09-13 03:47 PM, Jamal Hadi Salim wrote:
> On 16-09-13 12:20 PM, Cong Wang wrote:
>> On Mon, Sep 12, 2016 at 4:07 PM, Jamal Hadi Salim <jhs@mojatatu.com>
>> wrote:

[..]
>> I am still trying to understand this piece: so here you hold the refcnt
>> for the same action used by a later iteration? Otherwise there is
>> almost no user in between hold and release...
>>
>> The comment you added is not clear to me; we use RTNL/RCU to
>> sync destroy and replace, so how could that happen?
>>
>
> I was worried about destroy() being called when that function hits an
> error. If an action already existed and all we asked for was to
> replace some attribute, it would be deleted. That was the way the code
> was before your changes, so I just restored it to its original form.
>

And I have verified this is needed. I made gact return a failure
when you replace something. I added a gact action; then when I
replaced it, the replace failed. And when it failed, it deleted the
existing action.
I then tried another experiment and batch-replaced several actions,
including the one I knew would fail. I placed the failing action in
the middle and, hallelujah, all the actions before the middle one got
deleted.

So please ACK this so we can move forward.

cheers,
jamal
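
The batch-replace experiment in the message above can be sketched as a
user-space model. It is a rough illustration, not the kernel code: struct
toy_act, toy_put(), toy_batch_replace() and
toy_survivors_after_failed_replace() are hypothetical names, every action is
assumed to start with a refcount of 1, and the error path is assumed to
destroy whatever was already placed on the temporary list.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an existing action that is being replaced. */
struct toy_act {
	int refcnt;
	bool present;
};

/* Drop one reference; the action is deleted when the count hits zero. */
static void toy_put(struct toy_act *a)
{
	if (--a->refcnt <= 0)
		a->present = false;
}

/* Replace n existing actions in one batch. The action at index fail_at
 * fails to initialise. With temp_ref set, each action gets a temporary
 * reference while it sits on the temp list, as the patch does for ovr.
 */
static int toy_batch_replace(struct toy_act *acts, int n, int fail_at,
			     bool temp_ref)
{
	int i, listed = 0;

	for (i = 0; i < n; i++) {
		if (i == fail_at)
			goto err;
		if (temp_ref)
			acts[i].refcnt++;
		listed++;
	}
	/* Success: drop the temporary references again (cleanup_a()). */
	if (temp_ref)
		for (i = 0; i < n; i++)
			acts[i].refcnt--;
	return 0;
err:
	/* Failure: destroy everything already on the temporary list. */
	for (i = 0; i < listed; i++)
		toy_put(&acts[i]);
	return -1;
}

/* Batch of three actions, the middle one fails; count the survivors. */
int toy_survivors_after_failed_replace(bool temp_ref)
{
	struct toy_act acts[3] = { { 1, true }, { 1, true }, { 1, true } };
	int i, alive = 0;

	toy_batch_replace(acts, 3, 1, temp_ref);
	for (i = 0; i < 3; i++)
		alive += acts[i].present ? 1 : 0;
	return alive;
}
```

Without the temporary reference the action placed before the failing one is
deleted by the rollback, so only two of the three survive; with it, all
three pre-existing actions remain in place.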


* Re: [PATCH v3 net 1/1] net sched actions: fix GETing actions
  2016-09-14 11:33     ` Jamal Hadi Salim
@ 2016-09-14 16:30       ` Cong Wang
  0 siblings, 0 replies; 7+ messages in thread
From: Cong Wang @ 2016-09-14 16:30 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: David Miller, Linux Kernel Network Developers

On Wed, Sep 14, 2016 at 4:33 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 16-09-13 03:47 PM, Jamal Hadi Salim wrote:
>>
>> On 16-09-13 12:20 PM, Cong Wang wrote:
>>>
>>> On Mon, Sep 12, 2016 at 4:07 PM, Jamal Hadi Salim <jhs@mojatatu.com>
>>> wrote:
>
>
> [..]
>>>
>>> I am still trying to understand this piece: so here you hold the refcnt
>>> for the same action used by a later iteration? Otherwise there is
>>> almost no user in between hold and release...
>>>
>>> The comment you added is not clear to me; we use RTNL/RCU to
>>> sync destroy and replace, so how could that happen?
>>>
>>
>> I was worried about destroy() being called when that function hits an
>> error. If an action already existed and all we asked for was to
>> replace some attribute, it would be deleted. That was the way the code
>> was before your changes, so I just restored it to its original form.
>>
>
> And I have verified this is needed. I made gact return a failure
> when you replace something. I added a gact action; then when I
> replaced it, the replace failed. And when it failed, it deleted the
> existing action.

Oh, I see, the comments actually misled me. ;) You are trying to
fix the rollback for the failure path here... Makes sense to me now.

Thanks for verifying it.


* Re: [PATCH v3 net 1/1] net sched actions: fix GETing actions
  2016-09-12 23:07 [PATCH v3 net 1/1] net sched actions: fix GETing actions Jamal Hadi Salim
  2016-09-13 16:20 ` Cong Wang
@ 2016-09-15 12:15 ` Sergei Shtylyov
  2016-09-15 23:33 ` David Miller
  2 siblings, 0 replies; 7+ messages in thread
From: Sergei Shtylyov @ 2016-09-15 12:15 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong

On 9/13/2016 2:07 AM, Jamal Hadi Salim wrote:

> From: Jamal Hadi Salim <jhs@mojatatu.com>
>
> With the batch changes that translated transient actions into
> a temporary list, lost in the translation was the fact that
> tcf_action_destroy() will eventually delete the action from
> the permanent location if the refcount is zero.
>
> Example of what broke:
> ...add a gact action to drop
> sudo $TC actions add action drop index 10
> ...now retrieve it, looks good
> sudo $TC actions get action gact index 10
> ...retrieve it again and find it is gone!
> sudo $TC actions get action gact index 10
>
> Fixes:
> commit 22dc13c837c3 ("net_sched: convert tcf_exts from list to pointer array"),
> commit 824a7e8863b3 ("net_sched: remove an unnecessary list_del()")
> commit f07fed82ad79 ("net_sched: remove the leftover cleanup_a()")
>
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
> ---
>  net/sched/act_api.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index d09d068..50720b1 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -592,6 +592,16 @@ err_out:
>  	return ERR_PTR(err);
>  }
>
> +static void cleanup_a(struct list_head *actions, int ovr)
> +{
> +	struct tc_action *a;
> +
> +	list_for_each_entry(a, actions, list) {
> +		if (ovr)
> +			a->tcfa_refcnt -= 1;

			a->tcfa_refcnt--;

[...]
> @@ -612,8 +622,15 @@ int tcf_action_init(struct net *net, struct nlattr *nla,
>  			goto err;
>  		}
>  		act->order = i;
> +		if (ovr)
> +			act->tcfa_refcnt += 1;

			act->tcfa_refcnt++;

[...]
> @@ -883,6 +900,8 @@ tca_action_gd(struct net *net, struct nlattr *nla, struct nlmsghdr *n,
>  			goto err;
>  		}
>  		act->order = i;
> +		if (event == RTM_GETACTION)
> +			act->tcfa_refcnt += 1;

		act->tcfa_refcnt++;

[...]

MBR, Sergei


* Re: [PATCH v3 net 1/1] net sched actions: fix GETing actions
  2016-09-12 23:07 [PATCH v3 net 1/1] net sched actions: fix GETing actions Jamal Hadi Salim
  2016-09-13 16:20 ` Cong Wang
  2016-09-15 12:15 ` Sergei Shtylyov
@ 2016-09-15 23:33 ` David Miller
  2 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2016-09-15 23:33 UTC (permalink / raw)
  To: jhs; +Cc: netdev, xiyou.wangcong

From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: Mon, 12 Sep 2016 19:07:38 -0400

> From: Jamal Hadi Salim <jhs@mojatatu.com>
> 
> With the batch changes that translated transient actions into
> a temporary list, lost in the translation was the fact that
> tcf_action_destroy() will eventually delete the action from
> the permanent location if the refcount is zero.
> 
> Example of what broke:
> ...add a gact action to drop
> sudo $TC actions add action drop index 10
> ...now retrieve it, looks good
> sudo $TC actions get action gact index 10
> ...retrieve it again and find it is gone!
> sudo $TC actions get action gact index 10
> 
> Fixes:
> commit 22dc13c837c3 ("net_sched: convert tcf_exts from list to pointer array"),
> commit 824a7e8863b3 ("net_sched: remove an unnecessary list_del()")
> commit f07fed82ad79 ("net_sched: remove the leftover cleanup_a()")
> 
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

Please incorporate Sergei's feedback and resubmit, thanks Jamal.

