All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch net-next] net_sched: fix an extack message in tcf_block_find()
@ 2018-09-27 20:42 Cong Wang
  2018-09-27 20:42 ` [Patch net-next] net_sched: fix a crash in tc_new_tfilter() Cong Wang
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Cong Wang @ 2018-09-27 20:42 UTC (permalink / raw)
  To: netdev; +Cc: jiri, jhs, vladbu, Cong Wang

It is clearly a copy-n-paste.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/cls_api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 3de47e99b788..8dd7f8af6d54 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
 
 		*q = qdisc_refcount_inc_nz(*q);
 		if (!*q) {
-			NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
+			NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
 			err = -EINVAL;
 			goto errout_rcu;
 		}
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Patch net-next] net_sched: fix a crash in tc_new_tfilter()
  2018-09-27 20:42 [Patch net-next] net_sched: fix an extack message in tcf_block_find() Cong Wang
@ 2018-09-27 20:42 ` Cong Wang
  2018-09-27 21:16 ` [Patch net-next] net_sched: fix an extack message in tcf_block_find() Eric Dumazet
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Cong Wang @ 2018-09-27 20:42 UTC (permalink / raw)
  To: netdev; +Cc: jiri, jhs, vladbu, Cong Wang

When tcf_block_find() fails, it already rollbacks the qdisc refcnt,
so its caller doesn't need to clean up this again. Avoid calling
qdisc_put() again by resetting qdisc to NULL for callers.

Reported-by: syzbot+37b8770e6d5a8220a039@syzkaller.appspotmail.com
Fixes: e368fdb61d8e ("net: sched: use Qdisc rcu API instead of relying on rtnl lock")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/cls_api.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 8dd7f8af6d54..a4167ec0a220 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -717,8 +717,10 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
 errout_rcu:
 	rcu_read_unlock();
 errout_qdisc:
-	if (*q)
+	if (*q) {
 		qdisc_put(*q);
+		*q = NULL;
+	}
 	return ERR_PTR(err);
 }
 
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [Patch net-next] net_sched: fix an extack message in tcf_block_find()
  2018-09-27 20:42 [Patch net-next] net_sched: fix an extack message in tcf_block_find() Cong Wang
  2018-09-27 20:42 ` [Patch net-next] net_sched: fix a crash in tc_new_tfilter() Cong Wang
@ 2018-09-27 21:16 ` Eric Dumazet
  2018-09-27 21:36   ` Cong Wang
  2018-09-28 11:36 ` Vlad Buslov
  2018-09-28 17:04 ` Cong Wang
  3 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2018-09-27 21:16 UTC (permalink / raw)
  To: Cong Wang, netdev; +Cc: jiri, jhs, vladbu



On 09/27/2018 01:42 PM, Cong Wang wrote:
> It is clearly a copy-n-paste.
> 
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  net/sched/cls_api.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 3de47e99b788..8dd7f8af6d54 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
>  
>  		*q = qdisc_refcount_inc_nz(*q);
>  		if (!*q) {
> -			NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
> +			NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");


I am not sure it was a copy-n-paste.

Qdisc refcount business is kernel internal.
If we can not increase the refcount, this is precisely because this qdisc is about
to be destroyed. Nothing fundamentally different than having this thread delayed a bit
and qdisc_lookup_rcu() returning NULL in the first place.

This also means that using RCU for control path is problematic, as surely the caller
of this interface would prefer something that succeeds, even if this means
waiting a bit in the kernel.

Or are we willing to change ip command and make it restart failed syscalls ?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch net-next] net_sched: fix an extack message in tcf_block_find()
  2018-09-27 21:16 ` [Patch net-next] net_sched: fix an extack message in tcf_block_find() Eric Dumazet
@ 2018-09-27 21:36   ` Cong Wang
  2018-09-27 22:18     ` Eric Dumazet
  2018-09-28  1:51     ` [Patch net-next] net_sched: fix an extack message " David Ahern
  0 siblings, 2 replies; 11+ messages in thread
From: Cong Wang @ 2018-09-27 21:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov

On Thu, Sep 27, 2018 at 2:16 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 09/27/2018 01:42 PM, Cong Wang wrote:
> > It is clearly a copy-n-paste.
> >
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > ---
> >  net/sched/cls_api.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> > index 3de47e99b788..8dd7f8af6d54 100644
> > --- a/net/sched/cls_api.c
> > +++ b/net/sched/cls_api.c
> > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
> >
> >               *q = qdisc_refcount_inc_nz(*q);
> >               if (!*q) {
> > -                     NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
> > +                     NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
>
>
> I am not sure it was a copy-n-paste.


Make sure you knew there is an exactly same extack message
(with a same English grammar error).


>
> Qdisc refcount business is kernel internal.

Yeah, but the extack message is already there, this patch doesn't add
any new extack. Or you are suggesting we should remove it?



> If we can not increase the refcount, this is precisely because this qdisc is about
> to be destroyed. Nothing fundamentally different than having this thread delayed a bit
> and qdisc_lookup_rcu() returning NULL in the first place.


qdisc_lookup_rcu() is not always called, it could be dev->qdisc.
I am pretty sure parent exists in dev->qdisc.


>
> This also means that using RCU for control path is problematic, as surely the caller
> of this interface would prefer something that succeeds, even if this means
> waiting a bit in the kernel.

I fail to validate this statement, Why it prefers success when refcnt reaches
0?


>
> Or are we willing to change ip command and make it restart failed syscalls ?
>

I don't understand what you mean by changing ip command, you must
mean tc command, but still, I have no idea about how restarting failed
syscall could be related to my patch and why we need to restart anything
here. If the refcnt goes to 0, it will never come back, retrying won't help
anything.

BTW:

If you have any other question beyond my patch's scope, isn't it better
that we start a new thread for discussion?

In case you still misunderstand, my patch never intends to address any
other problem rather than correcting an inaccurate extack message.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch net-next] net_sched: fix an extack message in tcf_block_find()
  2018-09-27 21:36   ` Cong Wang
@ 2018-09-27 22:18     ` Eric Dumazet
  2018-09-28 17:19       ` rcu_read_lock() " Cong Wang
  2018-09-28  1:51     ` [Patch net-next] net_sched: fix an extack message " David Ahern
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2018-09-27 22:18 UTC (permalink / raw)
  To: Cong Wang, Eric Dumazet
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov



On 09/27/2018 02:36 PM, Cong Wang wrote:

> I don't understand what you mean by changing ip command, you must
> mean tc command, but still, I have no idea about how restarting failed
> syscall could be related to my patch and why we need to restart anything
> here. If the refcnt goes to 0, it will never come back, retrying won't help
> anything.
>

Yep, tc command it is.

I was not especially commenting your patch (replacing an english message by another does
not seem very big deal), but the fact that the code right there seems to be prepared
for parallel changes.

But using RCU lookups in control path will lead to occasional failures
that most user space tools would not expect.

Lets assume two tasks are launching "tc qdisc replace dev eth0 root XXX" in whatever order/parallelism.

Both should succeed, after/before major RTNL->other_locking_mechanism

Control paths are usually using a mutex or a spinlock so that they never hit a 0-refcount at all.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch net-next] net_sched: fix an extack message in tcf_block_find()
  2018-09-27 21:36   ` Cong Wang
  2018-09-27 22:18     ` Eric Dumazet
@ 2018-09-28  1:51     ` David Ahern
  1 sibling, 0 replies; 11+ messages in thread
From: David Ahern @ 2018-09-28  1:51 UTC (permalink / raw)
  To: Cong Wang, Eric Dumazet
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov

On 9/27/18 3:36 PM, Cong Wang wrote:
> On Thu, Sep 27, 2018 at 2:16 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>
>>
>> On 09/27/2018 01:42 PM, Cong Wang wrote:
>>> It is clearly a copy-n-paste.
>>>
>>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>>> ---
>>>  net/sched/cls_api.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 3de47e99b788..8dd7f8af6d54 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
>>>
>>>               *q = qdisc_refcount_inc_nz(*q);
>>>               if (!*q) {
>>> -                     NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
>>> +                     NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
>>
>>
>> I am not sure it was a copy-n-paste.
> 
> 
> Make sure you knew there is an exactly same extack message
> (with a same English grammar error).
> 
> 
>>
>> Qdisc refcount business is kernel internal.
> 
> Yeah, but the extack message is already there, this patch doesn't add
> any new extack. Or you are suggesting we should remove it?

IMO the message grammar should be fixed, but the content is correct --
ie, parent qdisc does not exist.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch net-next] net_sched: fix an extack message in tcf_block_find()
  2018-09-27 20:42 [Patch net-next] net_sched: fix an extack message in tcf_block_find() Cong Wang
  2018-09-27 20:42 ` [Patch net-next] net_sched: fix a crash in tc_new_tfilter() Cong Wang
  2018-09-27 21:16 ` [Patch net-next] net_sched: fix an extack message in tcf_block_find() Eric Dumazet
@ 2018-09-28 11:36 ` Vlad Buslov
  2018-09-28 17:03   ` Cong Wang
  2018-09-28 17:04 ` Cong Wang
  3 siblings, 1 reply; 11+ messages in thread
From: Vlad Buslov @ 2018-09-28 11:36 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, jiri, jhs

On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> It is clearly a copy-n-paste.
>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  net/sched/cls_api.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 3de47e99b788..8dd7f8af6d54 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
>  
>  		*q = qdisc_refcount_inc_nz(*q);
>  		if (!*q) {
> -			NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
> +			NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
>  			err = -EINVAL;
>  			goto errout_rcu;
>  		}

Is there a benefit in exposing this info to user?
For all intents and purposes Qdisc is gone at that point.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch net-next] net_sched: fix an extack message in tcf_block_find()
  2018-09-28 11:36 ` Vlad Buslov
@ 2018-09-28 17:03   ` Cong Wang
  2018-09-30 14:12     ` Vlad Buslov
  0 siblings, 1 reply; 11+ messages in thread
From: Cong Wang @ 2018-09-28 17:03 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim

On Fri, Sep 28, 2018 at 4:36 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>
> On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > It is clearly a copy-n-paste.
> >
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > ---
> >  net/sched/cls_api.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> > index 3de47e99b788..8dd7f8af6d54 100644
> > --- a/net/sched/cls_api.c
> > +++ b/net/sched/cls_api.c
> > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
> >
> >               *q = qdisc_refcount_inc_nz(*q);
> >               if (!*q) {
> > -                     NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
> > +                     NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
> >                       err = -EINVAL;
> >                       goto errout_rcu;
> >               }
>
> Is there a benefit in exposing this info to user?

Depends on what user you mean here. For kernel developers, yes,
this is useful. For others, no.


> For all intents and purposes Qdisc is gone at that point.

I don't want to be a language lawyer, but there is a difference between
"it doesn't exist" and "it exists but being removed". The errno EINVAL
betrays what you said too, it must be ENOENT to mach "Qdisc is gone".

I don't want to waste my time on this any more. Let's just drop it.

I really don't care, do you?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch net-next] net_sched: fix an extack message in tcf_block_find()
  2018-09-27 20:42 [Patch net-next] net_sched: fix an extack message in tcf_block_find() Cong Wang
                   ` (2 preceding siblings ...)
  2018-09-28 11:36 ` Vlad Buslov
@ 2018-09-28 17:04 ` Cong Wang
  3 siblings, 0 replies; 11+ messages in thread
From: Cong Wang @ 2018-09-28 17:04 UTC (permalink / raw)
  To: Linux Kernel Network Developers; +Cc: Jiri Pirko, Jamal Hadi Salim, Vlad Buslov

On Thu, Sep 27, 2018 at 1:42 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> It is clearly a copy-n-paste.
>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

I regret again for wasting my time, so:

NAcked-by: Cong Wang <xiyou.wangcong@gmail.com>

I really don't care, do you?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* rcu_read_lock() in tcf_block_find()
  2018-09-27 22:18     ` Eric Dumazet
@ 2018-09-28 17:19       ` Cong Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Cong Wang @ 2018-09-28 17:19 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov

(Changing $subject as the discussion is going to a completely different topic)

On Thu, Sep 27, 2018 at 3:19 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 09/27/2018 02:36 PM, Cong Wang wrote:
>
> > I don't understand what you mean by changing ip command, you must
> > mean tc command, but still, I have no idea about how restarting failed
> > syscall could be related to my patch and why we need to restart anything
> > here. If the refcnt goes to 0, it will never come back, retrying won't help
> > anything.
> >
>
> Yep, tc command it is.
>
> I was not especially commenting your patch (replacing an english message by another does
> not seem very big deal), but the fact that the code right there seems to be prepared
> for parallel changes.
>
> But using RCU lookups in control path will lead to occasional failures
> that most user space tools would not expect.
>

I already discussed this with Vlad in the beginning of his RTNL
removal patches, we both agreed some lock is still needed, it is not
completely lockless. Take a look at tc action code now, two spinlocks
are still needed even after we will remove the RTNL there.


> Lets assume two tasks are launching "tc qdisc replace dev eth0 root XXX" in whatever order/parallelism.
>
> Both should succeed, after/before major RTNL->other_locking_mechanism


Yes, it is never going to be completely lockless.


>
> Control paths are usually using a mutex or a spinlock so that they never hit a 0-refcount at all.


For dev->qdisc, sure, we should already hold a refcnt, it can't be gone.

For qdisc_lookup_rcu(), it could be that refcnt goes to 0 before we
remove it from hashtable, right? call_rcu() is only called after
refcnt==0, so rcu read lock can't help here.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch net-next] net_sched: fix an extack message in tcf_block_find()
  2018-09-28 17:03   ` Cong Wang
@ 2018-09-30 14:12     ` Vlad Buslov
  0 siblings, 0 replies; 11+ messages in thread
From: Vlad Buslov @ 2018-09-30 14:12 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim


On Fri 28 Sep 2018 at 17:03, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Fri, Sep 28, 2018 at 4:36 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>> On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> > It is clearly a copy-n-paste.
>> >
>> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> > ---
>> >  net/sched/cls_api.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> > index 3de47e99b788..8dd7f8af6d54 100644
>> > --- a/net/sched/cls_api.c
>> > +++ b/net/sched/cls_api.c
>> > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
>> >
>> >               *q = qdisc_refcount_inc_nz(*q);
>> >               if (!*q) {
>> > -                     NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
>> > +                     NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
>> >                       err = -EINVAL;
>> >                       goto errout_rcu;
>> >               }
>>
>> Is there a benefit in exposing this info to user?
>
> Depends on what user you mean here. For kernel developers, yes,
> this is useful. For others, no.
>
>
>> For all intents and purposes Qdisc is gone at that point.
>
> I don't want to be a language lawyer, but there is a difference between
> "it doesn't exist" and "it exists but being removed". The errno EINVAL
> betrays what you said too, it must be ENOENT to mach "Qdisc is gone".
>
> I don't want to waste my time on this any more. Let's just drop it.
>
> I really don't care, do you?

I'm asked the question in order to improve error messages in my future
patches, not because I care about this particular string.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-09-30 20:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-27 20:42 [Patch net-next] net_sched: fix an extack message in tcf_block_find() Cong Wang
2018-09-27 20:42 ` [Patch net-next] net_sched: fix a crash in tc_new_tfilter() Cong Wang
2018-09-27 21:16 ` [Patch net-next] net_sched: fix an extack message in tcf_block_find() Eric Dumazet
2018-09-27 21:36   ` Cong Wang
2018-09-27 22:18     ` Eric Dumazet
2018-09-28 17:19       ` rcu_read_lock() " Cong Wang
2018-09-28  1:51     ` [Patch net-next] net_sched: fix an extack message " David Ahern
2018-09-28 11:36 ` Vlad Buslov
2018-09-28 17:03   ` Cong Wang
2018-09-30 14:12     ` Vlad Buslov
2018-09-28 17:04 ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.