linux-kernel.vger.kernel.org archive mirror
* [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower()
@ 2021-03-06 13:47 Jia-Ju Bai
  2021-03-09  8:20 ` Roi Dayan
  0 siblings, 1 reply; 7+ messages in thread
From: Jia-Ju Bai @ 2021-03-06 13:47 UTC (permalink / raw)
  To: saeedm, leon, davem, kuba; +Cc: netdev, linux-rdma, linux-kernel, Jia-Ju Bai

When mlx5e_tc_get_counter() returns NULL for counter, or
mlx5_devcom_get_peer_data() returns NULL for peer_esw, no error return
code is assigned in mlx5e_stats_flower().
To fix this bug, assign -EINVAL to err in these cases.

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 0da69b98f38f..1f2c9da7bd35 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
 
 	if (mlx5e_is_offloaded_flow(flow) || flow_flag_test(flow, CT)) {
 		counter = mlx5e_tc_get_counter(flow);
-		if (!counter)
+		if (!counter) {
+			err = -EINVAL;
 			goto errout;
+		}
 
 		mlx5_fc_query_cached(counter, &bytes, &packets, &lastuse);
 	}
@@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
 	 * un-offloaded while the other rule is offloaded.
 	 */
 	peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
-	if (!peer_esw)
+	if (!peer_esw) {
+		err = -EINVAL;
 		goto out;
+	}
 
 	if (flow_flag_test(flow, DUP) &&
 	    flow_flag_test(flow->peer_flow, OFFLOADED)) {
@@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
 		u64 lastuse2;
 
 		counter = mlx5e_tc_get_counter(flow->peer_flow);
-		if (!counter)
+		if (!counter) {
+			err = -EINVAL;
 			goto no_peer_counter;
+		}
 		mlx5_fc_query_cached(counter, &bytes2, &packets2, &lastuse2);
 
 		bytes += bytes2;
-- 
2.17.1



* Re: [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower()
  2021-03-06 13:47 [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower() Jia-Ju Bai
@ 2021-03-09  8:20 ` Roi Dayan
  2021-03-09  8:24   ` Roi Dayan
  0 siblings, 1 reply; 7+ messages in thread
From: Roi Dayan @ 2021-03-09  8:20 UTC (permalink / raw)
  To: Jia-Ju Bai, saeedm, leon, davem, kuba; +Cc: netdev, linux-rdma, linux-kernel



On 2021-03-06 3:47 PM, Jia-Ju Bai wrote:
> When mlx5e_tc_get_counter() returns NULL to counter or
> mlx5_devcom_get_peer_data() returns NULL to peer_esw, no error return
> code of mlx5e_stats_flower() is assigned.
> To fix this bug, err is assigned with -EINVAL in these cases.
> 
> Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
> ---
>   drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index 0da69b98f38f..1f2c9da7bd35 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
>   
>   	if (mlx5e_is_offloaded_flow(flow) || flow_flag_test(flow, CT)) {
>   		counter = mlx5e_tc_get_counter(flow);
> -		if (!counter)
> +		if (!counter) {
> +			err = -EINVAL;
>   			goto errout;
> +		}
>   
>   		mlx5_fc_query_cached(counter, &bytes, &packets, &lastuse);
>   	}
> @@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
>   	 * un-offloaded while the other rule is offloaded.
>   	 */
>   	peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
> -	if (!peer_esw)
> +	if (!peer_esw) {
> +		err = -EINVAL;

Note that this is not an error here. There may simply be no peer esw,
so just continue with the stats update.

>   		goto out;
> +	}
>   
>   	if (flow_flag_test(flow, DUP) &&
>   	    flow_flag_test(flow->peer_flow, OFFLOADED)) {
> @@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
>   		u64 lastuse2;
>   
>   		counter = mlx5e_tc_get_counter(flow->peer_flow);
> -		if (!counter)
> +		if (!counter) {
> +			err = -EINVAL;
>   			goto no_peer_counter;
> +		}
>   		mlx5_fc_query_cached(counter, &bytes2, &packets2, &lastuse2);
>   
>   		bytes += bytes2;
> 




* Re: [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower()
  2021-03-09  8:20 ` Roi Dayan
@ 2021-03-09  8:24   ` Roi Dayan
  2021-03-09  8:32     ` Jia-Ju Bai
  0 siblings, 1 reply; 7+ messages in thread
From: Roi Dayan @ 2021-03-09  8:24 UTC (permalink / raw)
  To: Jia-Ju Bai, saeedm, leon, davem, kuba; +Cc: netdev, linux-rdma, linux-kernel



On 2021-03-09 10:20 AM, Roi Dayan wrote:
> 
> 
> On 2021-03-06 3:47 PM, Jia-Ju Bai wrote:
>> When mlx5e_tc_get_counter() returns NULL to counter or
>> mlx5_devcom_get_peer_data() returns NULL to peer_esw, no error return
>> code of mlx5e_stats_flower() is assigned.
>> To fix this bug, err is assigned with -EINVAL in these cases.
>>
>> Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
>> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
>> ---
>>   drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12 +++++++++---
>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
>> b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> index 0da69b98f38f..1f2c9da7bd35 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> @@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct net_device *dev, 
>> struct mlx5e_priv *priv,
>>       if (mlx5e_is_offloaded_flow(flow) || flow_flag_test(flow, CT)) {
>>           counter = mlx5e_tc_get_counter(flow);
>> -        if (!counter)
>> +        if (!counter) {
>> +            err = -EINVAL;
>>               goto errout;
>> +        }
>>           mlx5_fc_query_cached(counter, &bytes, &packets, &lastuse);
>>       }
>> @@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct net_device *dev, 
>> struct mlx5e_priv *priv,
>>        * un-offloaded while the other rule is offloaded.
>>        */
>>       peer_esw = mlx5_devcom_get_peer_data(devcom, 
>> MLX5_DEVCOM_ESW_OFFLOADS);
>> -    if (!peer_esw)
>> +    if (!peer_esw) {
>> +        err = -EINVAL;
> 
> note here it's not an error. it could be there is no peer esw
> so just continue with the stats update.
> 
>>           goto out;
>> +    }
>>       if (flow_flag_test(flow, DUP) &&
>>           flow_flag_test(flow->peer_flow, OFFLOADED)) {
>> @@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct net_device *dev, 
>> struct mlx5e_priv *priv,
>>           u64 lastuse2;
>>           counter = mlx5e_tc_get_counter(flow->peer_flow);
>> -        if (!counter)
>> +        if (!counter) {
>> +            err = -EINVAL;

This change is problematic. The current goto goes on to update the stats
with the values from the first counter, but if you now want to return an
error then you probably should not do any update at all.

>>               goto no_peer_counter;
>> +        }
>>           mlx5_fc_query_cached(counter, &bytes2, &packets2, &lastuse2);
>>           bytes += bytes2;
>>
> 
> 


* Re: [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower()
  2021-03-09  8:24   ` Roi Dayan
@ 2021-03-09  8:32     ` Jia-Ju Bai
  2021-03-09  9:44       ` Roi Dayan
  0 siblings, 1 reply; 7+ messages in thread
From: Jia-Ju Bai @ 2021-03-09  8:32 UTC (permalink / raw)
  To: Roi Dayan, saeedm, leon, davem, kuba; +Cc: netdev, linux-rdma, linux-kernel



On 2021/3/9 16:24, Roi Dayan wrote:
>
>
> On 2021-03-09 10:20 AM, Roi Dayan wrote:
>>
>>
>> On 2021-03-06 3:47 PM, Jia-Ju Bai wrote:
>>> When mlx5e_tc_get_counter() returns NULL to counter or
>>> mlx5_devcom_get_peer_data() returns NULL to peer_esw, no error return
>>> code of mlx5e_stats_flower() is assigned.
>>> To fix this bug, err is assigned with -EINVAL in these cases.
>>>
>>> Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
>>> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
>>> ---
>>>   drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12 +++++++++---
>>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>> index 0da69b98f38f..1f2c9da7bd35 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>> @@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct net_device 
>>> *dev, struct mlx5e_priv *priv,
>>>       if (mlx5e_is_offloaded_flow(flow) || flow_flag_test(flow, CT)) {
>>>           counter = mlx5e_tc_get_counter(flow);
>>> -        if (!counter)
>>> +        if (!counter) {
>>> +            err = -EINVAL;
>>>               goto errout;
>>> +        }
>>>           mlx5_fc_query_cached(counter, &bytes, &packets, &lastuse);
>>>       }
>>> @@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct net_device 
>>> *dev, struct mlx5e_priv *priv,
>>>        * un-offloaded while the other rule is offloaded.
>>>        */
>>>       peer_esw = mlx5_devcom_get_peer_data(devcom, 
>>> MLX5_DEVCOM_ESW_OFFLOADS);
>>> -    if (!peer_esw)
>>> +    if (!peer_esw) {
>>> +        err = -EINVAL;
>>
>> note here it's not an error. it could be there is no peer esw
>> so just continue with the stats update.
>>
>>>           goto out;
>>> +    }
>>>       if (flow_flag_test(flow, DUP) &&
>>>           flow_flag_test(flow->peer_flow, OFFLOADED)) {
>>> @@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct net_device 
>>> *dev, struct mlx5e_priv *priv,
>>>           u64 lastuse2;
>>>           counter = mlx5e_tc_get_counter(flow->peer_flow);
>>> -        if (!counter)
>>> +        if (!counter) {
>>> +            err = -EINVAL;
>
> this change is problematic. the current goto is to do stats update with
> the first counter stats we got but if you now want to return an error
> then you probably should not do any update at all.

Thanks for your reply :)
I am not sure whether an error code should be returned here.
If so, should flow_stats_update(...) not be called at all?


Best wishes,
Jia-Ju Bai

>
>>>               goto no_peer_counter;
>>> +        }
>>>           mlx5_fc_query_cached(counter, &bytes2, &packets2, &lastuse2);
>>>           bytes += bytes2;
>>>
>>
>>



* Re: [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower()
  2021-03-09  8:32     ` Jia-Ju Bai
@ 2021-03-09  9:44       ` Roi Dayan
  2021-03-11 22:47         ` Saeed Mahameed
  0 siblings, 1 reply; 7+ messages in thread
From: Roi Dayan @ 2021-03-09  9:44 UTC (permalink / raw)
  To: Jia-Ju Bai, saeedm, leon, davem, kuba; +Cc: netdev, linux-rdma, linux-kernel



On 2021-03-09 10:32 AM, Jia-Ju Bai wrote:
> 
> 
> On 2021/3/9 16:24, Roi Dayan wrote:
>>
>>
>> On 2021-03-09 10:20 AM, Roi Dayan wrote:
>>>
>>>
>>> On 2021-03-06 3:47 PM, Jia-Ju Bai wrote:
>>>> When mlx5e_tc_get_counter() returns NULL to counter or
>>>> mlx5_devcom_get_peer_data() returns NULL to peer_esw, no error return
>>>> code of mlx5e_stats_flower() is assigned.
>>>> To fix this bug, err is assigned with -EINVAL in these cases.
>>>>
>>>> Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
>>>> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
>>>> ---
>>>>   drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12 +++++++++---
>>>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>> index 0da69b98f38f..1f2c9da7bd35 100644
>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>> @@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct net_device 
>>>> *dev, struct mlx5e_priv *priv,
>>>>       if (mlx5e_is_offloaded_flow(flow) || flow_flag_test(flow, CT)) {
>>>>           counter = mlx5e_tc_get_counter(flow);
>>>> -        if (!counter)
>>>> +        if (!counter) {
>>>> +            err = -EINVAL;
>>>>               goto errout;
>>>> +        }
>>>>           mlx5_fc_query_cached(counter, &bytes, &packets, &lastuse);
>>>>       }
>>>> @@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct net_device 
>>>> *dev, struct mlx5e_priv *priv,
>>>>        * un-offloaded while the other rule is offloaded.
>>>>        */
>>>>       peer_esw = mlx5_devcom_get_peer_data(devcom, 
>>>> MLX5_DEVCOM_ESW_OFFLOADS);
>>>> -    if (!peer_esw)
>>>> +    if (!peer_esw) {
>>>> +        err = -EINVAL;
>>>
>>> note here it's not an error. it could be there is no peer esw
>>> so just continue with the stats update.
>>>
>>>>           goto out;
>>>> +    }
>>>>       if (flow_flag_test(flow, DUP) &&
>>>>           flow_flag_test(flow->peer_flow, OFFLOADED)) {
>>>> @@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct net_device 
>>>> *dev, struct mlx5e_priv *priv,
>>>>           u64 lastuse2;
>>>>           counter = mlx5e_tc_get_counter(flow->peer_flow);
>>>> -        if (!counter)
>>>> +        if (!counter) {
>>>> +            err = -EINVAL;
>>
>> this change is problematic. the current goto is to do stats update with
>> the first counter stats we got but if you now want to return an error
>> then you probably should not do any update at all.
> 
> Thanks for your reply :)
> I am not sure whether an error code should be returned here?
> If so, flow_stats_update(...) should not be called here?
> 
> 
> Best wishes,
> Jia-Ju Bai
> 

Basically, flow and peer_flow should be valid and protected from changes,
and counter should be valid.
It looks like the check here is more of a sanity check for something that
could go wrong but shouldn't. You can just let it be and update the stats
from the first queried counter.


>>
>>>>               goto no_peer_counter;
>>>> +        }
>>>>           mlx5_fc_query_cached(counter, &bytes2, &packets2, &lastuse2);
>>>>           bytes += bytes2;
>>>>
>>>
>>>
> 


* Re: [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower()
  2021-03-09  9:44       ` Roi Dayan
@ 2021-03-11 22:47         ` Saeed Mahameed
  2021-03-14  9:50           ` Roi Dayan
  0 siblings, 1 reply; 7+ messages in thread
From: Saeed Mahameed @ 2021-03-11 22:47 UTC (permalink / raw)
  To: Roi Dayan, Jia-Ju Bai, leon, davem, kuba; +Cc: netdev, linux-rdma, linux-kernel

On Tue, 2021-03-09 at 11:44 +0200, Roi Dayan wrote:
> 
> 
> On 2021-03-09 10:32 AM, Jia-Ju Bai wrote:
> > 
> > 
> > On 2021/3/9 16:24, Roi Dayan wrote:
> > > 
> > > 
> > > On 2021-03-09 10:20 AM, Roi Dayan wrote:
> > > > 
> > > > 
> > > > On 2021-03-06 3:47 PM, Jia-Ju Bai wrote:
> > > > > When mlx5e_tc_get_counter() returns NULL to counter or
> > > > > mlx5_devcom_get_peer_data() returns NULL to peer_esw, no
> > > > > error return
> > > > > code of mlx5e_stats_flower() is assigned.
> > > > > To fix this bug, err is assigned with -EINVAL in these cases.
> > > > > 
> > > > > Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>

Hey Jia-Ju, what are the conditions for this robot to raise a flag?
Sometimes it is totally normal to abort a function and return 0. I am
just curious to know.


> > > > > Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
> > > > > ---
> > > > >   drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12
> > > > > +++++++++---
> > > > >   1 file changed, 9 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
> > > > > b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > > > > index 0da69b98f38f..1f2c9da7bd35 100644
> > > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > > > > @@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct
> > > > > net_device 
> > > > > *dev, struct mlx5e_priv *priv,
> > > > >       if (mlx5e_is_offloaded_flow(flow) ||
> > > > > flow_flag_test(flow, CT)) {
> > > > >           counter = mlx5e_tc_get_counter(flow);
> > > > > -        if (!counter)
> > > > > +        if (!counter) {
> > > > > +            err = -EINVAL;
> > > > >               goto errout;
> > > > > +        }
> > > > >           mlx5_fc_query_cached(counter, &bytes, &packets,
> > > > > &lastuse);
> > > > >       }
> > > > > @@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct
> > > > > net_device 
> > > > > *dev, struct mlx5e_priv *priv,
> > > > >        * un-offloaded while the other rule is offloaded.
> > > > >        */
> > > > >       peer_esw = mlx5_devcom_get_peer_data(devcom, 
> > > > > MLX5_DEVCOM_ESW_OFFLOADS);
> > > > > -    if (!peer_esw)
> > > > > +    if (!peer_esw) {
> > > > > +        err = -EINVAL;
> > > > 

This is not an error flow. I am curious, what are the thresholds of this
robot?

> > > > note here it's not an error. it could be there is no peer esw
> > > > so just continue with the stats update.
> > > > 
> > > > >           goto out;
> > > > > +    }
> > > > >       if (flow_flag_test(flow, DUP) &&
> > > > >           flow_flag_test(flow->peer_flow, OFFLOADED)) {
> > > > > @@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct
> > > > > net_device 
> > > > > *dev, struct mlx5e_priv *priv,
> > > > >           u64 lastuse2;
> > > > >           counter = mlx5e_tc_get_counter(flow->peer_flow);
> > > > > -        if (!counter)
> > > > > +        if (!counter) {
> > > > > +            err = -EINVAL;
> > > 
> > > this change is problematic. the current goto is to do stats
> > > update with
> > > the first counter stats we got but if you now want to return an
> > > error
> > > then you probably should not do any update at all.
> > 
> > Thanks for your reply :)
> > I am not sure whether an error code should be returned here?
> > If so, flow_stats_update(...) should not be called here?
> > 
> > 
> > Best wishes,
> > Jia-Ju Bai
> > 
> 
> basically flow and peer_flow should be valid and protected from
> changes,
> and counter should be valid.
> it looks like the check here is more of a sanity check if something
> goes
> wrong but shouldn't. you can just let it be, update the stats from
> the
> first queried counter.
> 

Roi, let's consider returning an error code here. We shouldn't be
silently returning if we are not expecting these errors.

Why would mlx5e_stats_flower() be called if stats are not offloaded?

Thanks,
Saeed.




* Re: [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower()
  2021-03-11 22:47         ` Saeed Mahameed
@ 2021-03-14  9:50           ` Roi Dayan
  0 siblings, 0 replies; 7+ messages in thread
From: Roi Dayan @ 2021-03-14  9:50 UTC (permalink / raw)
  To: Saeed Mahameed, Jia-Ju Bai, leon, davem, kuba
  Cc: netdev, linux-rdma, linux-kernel



On 2021-03-12 12:47 AM, Saeed Mahameed wrote:
> On Tue, 2021-03-09 at 11:44 +0200, Roi Dayan wrote:
>>
>>
>> On 2021-03-09 10:32 AM, Jia-Ju Bai wrote:
>>>
>>>
>>> On 2021/3/9 16:24, Roi Dayan wrote:
>>>>
>>>>
>>>> On 2021-03-09 10:20 AM, Roi Dayan wrote:
>>>>>
>>>>>
>>>>> On 2021-03-06 3:47 PM, Jia-Ju Bai wrote:
>>>>>> When mlx5e_tc_get_counter() returns NULL to counter or
>>>>>> mlx5_devcom_get_peer_data() returns NULL to peer_esw, no
>>>>>> error return
>>>>>> code of mlx5e_stats_flower() is assigned.
>>>>>> To fix this bug, err is assigned with -EINVAL in these cases.
>>>>>>
>>>>>> Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
> 
> Hey Jia-Ju, What are the conditions for this robot to raise a flag?
> sometimes it is totally normal to abort a function and return 0.. i am
> just curious to know ?
> 
> 
>>>>>> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
>>>>>> ---
>>>>>>    drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12
>>>>>> +++++++++---
>>>>>>    1 file changed, 9 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>>>> index 0da69b98f38f..1f2c9da7bd35 100644
>>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>>>> @@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct
>>>>>> net_device
>>>>>> *dev, struct mlx5e_priv *priv,
>>>>>>        if (mlx5e_is_offloaded_flow(flow) ||
>>>>>> flow_flag_test(flow, CT)) {
>>>>>>            counter = mlx5e_tc_get_counter(flow);
>>>>>> -        if (!counter)
>>>>>> +        if (!counter) {
>>>>>> +            err = -EINVAL;
>>>>>>                goto errout;
>>>>>> +        }
>>>>>>            mlx5_fc_query_cached(counter, &bytes, &packets,
>>>>>> &lastuse);
>>>>>>        }
>>>>>> @@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct
>>>>>> net_device
>>>>>> *dev, struct mlx5e_priv *priv,
>>>>>>         * un-offloaded while the other rule is offloaded.
>>>>>>         */
>>>>>>        peer_esw = mlx5_devcom_get_peer_data(devcom,
>>>>>> MLX5_DEVCOM_ESW_OFFLOADS);
>>>>>> -    if (!peer_esw)
>>>>>> +    if (!peer_esw) {
>>>>>> +        err = -EINVAL;
>>>>>
> 
> This is not an error flow, i am curious what are the thresholds of this
> robot ?
> 
>>>>> note here it's not an error. it could be there is no peer esw
>>>>> so just continue with the stats update.
>>>>>
>>>>>>            goto out;
>>>>>> +    }
>>>>>>        if (flow_flag_test(flow, DUP) &&
>>>>>>            flow_flag_test(flow->peer_flow, OFFLOADED)) {
>>>>>> @@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct
>>>>>> net_device
>>>>>> *dev, struct mlx5e_priv *priv,
>>>>>>            u64 lastuse2;
>>>>>>            counter = mlx5e_tc_get_counter(flow->peer_flow);
>>>>>> -        if (!counter)
>>>>>> +        if (!counter) {
>>>>>> +            err = -EINVAL;
>>>>
>>>> this change is problematic. the current goto is to do stats
>>>> update with
>>>> the first counter stats we got but if you now want to return an
>>>> error
>>>> then you probably should not do any update at all.
>>>
>>> Thanks for your reply :)
>>> I am not sure whether an error code should be returned here?
>>> If so, flow_stats_update(...) should not be called here?
>>>
>>>
>>> Best wishes,
>>> Jia-Ju Bai
>>>
>>
>> basically flow and peer_flow should be valid and protected from
>> changes,
>> and counter should be valid.
>> it looks like the check here is more of a sanity check if something
>> goes
>> wrong but shouldn't. you can just let it be, update the stats from
>> the
>> first queried counter.
>>
> 
> Roi, let's consider returning an error code here, we shouldn't be
> silently returning if we are not expecting these errors,
> 
> why would mlx5e_stats_flower() be called if stats are not offloaded ?
> 
> Thanks,
> Saeed.
> 
> 

Yes, we can return an error if the peer counter is missing.
I just pointed out that we should probably not call flow_stats_update() if
we do return an error.
The other option, as today, is to update the stats with the first counter's
stats, which is why we didn't return an error.
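
The two options discussed in this thread can be sketched as a small
standalone C model. This is a simplified illustration under stated
assumptions, not the mlx5 driver code: struct counter, struct stats,
stats_update(), stats_strict() and stats_lenient() are hypothetical names
standing in for the driver's counter lookup and flow_stats_update(), and
the flow's own counter is assumed to be valid so that only the peer
counter can be missing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver types; not the mlx5 definitions. */
struct counter {
	uint64_t bytes;
	uint64_t packets;
};

struct stats {
	uint64_t bytes;
	uint64_t packets;
	bool updated;	/* set once stats_update() has run */
};

/* Stand-in for flow_stats_update(): accumulate and mark as updated. */
static void stats_update(struct stats *s, uint64_t bytes, uint64_t packets)
{
	s->bytes += bytes;
	s->packets += packets;
	s->updated = true;
}

/*
 * Option 1: treat a missing peer counter as an error.
 * The stats are left untouched so the caller never sees a partial
 * update combined with an error code.
 */
static int stats_strict(const struct counter *own, const struct counter *peer,
			struct stats *s)
{
	assert(own != NULL);	/* assumption: own counter is always valid */

	if (!peer)
		return -EINVAL;

	stats_update(s, own->bytes + peer->bytes,
		     own->packets + peer->packets);
	return 0;
}

/*
 * Option 2: tolerate a missing peer counter.
 * Report whatever the first counter holds and return success.
 */
static int stats_lenient(const struct counter *own, const struct counter *peer,
			 struct stats *s)
{
	uint64_t bytes, packets;

	assert(own != NULL);	/* assumption: own counter is always valid */

	bytes = own->bytes;
	packets = own->packets;
	if (peer) {
		bytes += peer->bytes;
		packets += peer->packets;
	}

	stats_update(s, bytes, packets);
	return 0;
}
```

Calling stats_strict() with a NULL peer returns -EINVAL and leaves the
stats untouched, while stats_lenient() with a NULL peer returns 0 and
reports only the first counter's values. The patch as posted mixes the two:
it sets an error code but still reaches the stats update.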

