All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] net_sched: prio: properly report out of memory errors
@ 2016-06-12 23:21 Eric Dumazet
  2016-06-13  1:57 ` David Miller
  2016-06-13  3:45 ` Cong Wang
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2016-06-12 23:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

At Qdisc creation or change time, prio_tune() creates missing
pfifo qdiscs but does not return an error code if one
qdisc could not be allocated.

Leaving a qdisc in non operational state without telling user
anything about this problem is not good.

Also, testing if we replace something different than noop_qdisc
a second time makes no sense so I removed useless code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/sched/sch_prio.c |   32 ++++++++++++--------------------
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 4b0a82191bc4..071718bccdab 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -202,26 +202,18 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt)
 	sch_tree_unlock(sch);
 
 	for (i = 0; i < q->bands; i++) {
-		if (q->queues[i] == &noop_qdisc) {
-			struct Qdisc *child, *old;
-
-			child = qdisc_create_dflt(sch->dev_queue,
-						  &pfifo_qdisc_ops,
-						  TC_H_MAKE(sch->handle, i + 1));
-			if (child) {
-				sch_tree_lock(sch);
-				old = q->queues[i];
-				q->queues[i] = child;
-
-				if (old != &noop_qdisc) {
-					qdisc_tree_reduce_backlog(old,
-								  old->q.qlen,
-								  old->qstats.backlog);
-					qdisc_destroy(old);
-				}
-				sch_tree_unlock(sch);
-			}
-		}
+		struct Qdisc *child;
+
+		if (q->queues[i] != &noop_qdisc)
+			continue;
+
+		child = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
+					  TC_H_MAKE(sch->handle, i + 1));
+		if (!child)
+			return -ENOMEM;
+		sch_tree_lock(sch);
+		q->queues[i] = child;
+		sch_tree_unlock(sch);
 	}
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net] net_sched: prio: properly report out of memory errors
  2016-06-12 23:21 [PATCH net] net_sched: prio: properly report out of memory errors Eric Dumazet
@ 2016-06-13  1:57 ` David Miller
  2016-06-13  3:45 ` Cong Wang
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2016-06-13  1:57 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 12 Jun 2016 16:21:47 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> At Qdisc creation or change time, prio_tune() creates missing
> pfifo qdiscs but does not return an error code if one
> qdisc could not be allocated.
> 
> Leaving a qdisc in non operational state without telling user
> anything about this problem is not good.
> 
> Also, testing if we replace something different than noop_qdisc
> a second time makes no sense so I removed useless code.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net] net_sched: prio: properly report out of memory errors
  2016-06-12 23:21 [PATCH net] net_sched: prio: properly report out of memory errors Eric Dumazet
  2016-06-13  1:57 ` David Miller
@ 2016-06-13  3:45 ` Cong Wang
  2016-06-13  4:29   ` Eric Dumazet
  1 sibling, 1 reply; 12+ messages in thread
From: Cong Wang @ 2016-06-13  3:45 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Sun, Jun 12, 2016 at 4:21 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> +               struct Qdisc *child;
> +
> +               if (q->queues[i] != &noop_qdisc)
> +                       continue;
> +
> +               child = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
> +                                         TC_H_MAKE(sch->handle, i + 1));
> +               if (!child)
> +                       return -ENOMEM;

Since this is inside a loop, shouldn't we kfree the previous child
creations when we fail?



> +               sch_tree_lock(sch);
> +               q->queues[i] = child;
> +               sch_tree_unlock(sch);
>         }
>         return 0;
>  }
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net] net_sched: prio: properly report out of memory errors
  2016-06-13  3:45 ` Cong Wang
@ 2016-06-13  4:29   ` Eric Dumazet
  2016-06-13  5:03     ` [PATCH net] net_sched: prio: rollback allocations if prio_init() fails Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2016-06-13  4:29 UTC (permalink / raw)
  To: Cong Wang; +Cc: David Miller, netdev

On Sun, 2016-06-12 at 20:45 -0700, Cong Wang wrote:
> On Sun, Jun 12, 2016 at 4:21 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > +               struct Qdisc *child;
> > +
> > +               if (q->queues[i] != &noop_qdisc)
> > +                       continue;
> > +
> > +               child = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
> > +                                         TC_H_MAKE(sch->handle, i + 1));
> > +               if (!child)
> > +                       return -ENOMEM;
> 
> Since this is inside a loop, shouldn't we kfree the previous child
> creations when we fail?

You're right.

prio_init() needs to do the cleanup, as prio_destroy() wont be called
from qdisc_create()

I am testing a fix with fault injection.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net] net_sched: prio: rollback allocations if prio_init() fails
  2016-06-13  4:29   ` Eric Dumazet
@ 2016-06-13  5:03     ` Eric Dumazet
  2016-06-13 16:28       ` Cong Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2016-06-13  5:03 UTC (permalink / raw)
  To: Cong Wang; +Cc: David Miller, netdev

From: Eric Dumazet <edumazet@google.com>

Now prio_init() can return -ENOMEM, it also has to make sure
any allocated qdisc are freed, since the caller (qdisc_create()) wont
call ->destroy() handler for us.

Fixes: cbdf45116478 ("net_sched: prio: properly report out of memory errors")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/sch_prio.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 071718bccdab..9b703f2c921b 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -221,20 +221,18 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt)
 static int prio_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct prio_sched_data *q = qdisc_priv(sch);
-	int i;
+	int err, i;
 
 	for (i = 0; i < TCQ_PRIO_BANDS; i++)
 		q->queues[i] = &noop_qdisc;
 
-	if (opt == NULL) {
+	if (!opt)
 		return -EINVAL;
-	} else {
-		int err;
 
-		if ((err = prio_tune(sch, opt)) != 0)
-			return err;
-	}
-	return 0;
+	err = prio_tune(sch, opt);
+	if (err)
+		prio_destroy(sch);
+	return err;
 }
 
 static int prio_dump(struct Qdisc *sch, struct sk_buff *skb)

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net] net_sched: prio: rollback allocations if prio_init() fails
  2016-06-13  5:03     ` [PATCH net] net_sched: prio: rollback allocations if prio_init() fails Eric Dumazet
@ 2016-06-13 16:28       ` Cong Wang
  2016-06-13 17:13         ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Cong Wang @ 2016-06-13 16:28 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Sun, Jun 12, 2016 at 10:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Now prio_init() can return -ENOMEM, it also has to make sure
> any allocated qdisc are freed, since the caller (qdisc_create()) wont
> call ->destroy() handler for us.


But prio_tune() is called by ->change() too, so just call prio_destroy()
inside prio_tune() ?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net] net_sched: prio: rollback allocations if prio_init() fails
  2016-06-13 16:28       ` Cong Wang
@ 2016-06-13 17:13         ` Eric Dumazet
  2016-06-13 17:27           ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2016-06-13 17:13 UTC (permalink / raw)
  To: Cong Wang; +Cc: David Miller, netdev

On Mon, 2016-06-13 at 09:28 -0700, Cong Wang wrote:
> On Sun, Jun 12, 2016 at 10:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Now prio_init() can return -ENOMEM, it also has to make sure
> > any allocated qdisc are freed, since the caller (qdisc_create()) wont
> > call ->destroy() handler for us.
> 
> 
> But prio_tune() is called by ->change() too, so just call prio_destroy()
> inside prio_tune() ?

Because the following should work :

tc qdisc replace dev eth0 root prio bands 3

(later...)

tc qdisc change dev eth0 root prio bands 5 // memory allocation failure

We must not destroy the qdisc on a change operation if it fails.

Only the init should clean the qdisc state/memory/children since qdisc
wont be created for real.

Presumably we should commit changes on qdisc only if the whole
->change() succeeded, but I guess lot of qdisc are buggy in this
respect.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net] net_sched: prio: rollback allocations if prio_init() fails
  2016-06-13 17:13         ` Eric Dumazet
@ 2016-06-13 17:27           ` Eric Dumazet
  2016-06-13 17:31             ` Cong Wang
  2016-06-13 18:33             ` [PATCH v2 net] net_sched: prio: insure proper transactional behavior Eric Dumazet
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2016-06-13 17:27 UTC (permalink / raw)
  To: Cong Wang; +Cc: David Miller, netdev

On Mon, 2016-06-13 at 10:13 -0700, Eric Dumazet wrote:

> Presumably we should commit changes on qdisc only if the whole
> ->change() succeeded, but I guess lot of qdisc are buggy in this
> respect.

I'll post a v2 fixing this issue as well.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net] net_sched: prio: rollback allocations if prio_init() fails
  2016-06-13 17:27           ` Eric Dumazet
@ 2016-06-13 17:31             ` Cong Wang
  2016-06-13 18:33             ` [PATCH v2 net] net_sched: prio: insure proper transactional behavior Eric Dumazet
  1 sibling, 0 replies; 12+ messages in thread
From: Cong Wang @ 2016-06-13 17:31 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Mon, Jun 13, 2016 at 10:27 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2016-06-13 at 10:13 -0700, Eric Dumazet wrote:
>
>> Presumably we should commit changes on qdisc only if the whole
>> ->change() succeeded, but I guess lot of qdisc are buggy in this
>> respect.

I suppose so, it should work like a transaction.

>
> I'll post a v2 fixing this issue as well.
>
>

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 net] net_sched: prio: insure proper transactional behavior
  2016-06-13 17:27           ` Eric Dumazet
  2016-06-13 17:31             ` Cong Wang
@ 2016-06-13 18:33             ` Eric Dumazet
  2016-06-13 21:21               ` Cong Wang
  2016-06-15 19:30               ` David Miller
  1 sibling, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2016-06-13 18:33 UTC (permalink / raw)
  To: Cong Wang; +Cc: David Miller, netdev

From: Eric Dumazet <edumazet@google.com>

Now prio_init() can return -ENOMEM, it also has to make sure
any allocated qdiscs are freed, since the caller (qdisc_create()) wont
call ->destroy() handler for us.

More generally, we want a transactional behavior for "tc qdisc
change ...", so prio_tune() should not make modifications if
any error is returned.

It means that we must validate parameters and allocate missing qdisc(s)
before taking root qdisc lock exactly once, to not leave the prio qdisc
in an intermediate state.

Fixes: cbdf45116478 ("net_sched: prio: properly report out of memory errors")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/sch_prio.c |   57 ++++++++++++++++++++-------------------------------
 1 file changed, 23 insertions(+), 34 deletions(-)

diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 071718bccdab..a356450b747b 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -172,8 +172,9 @@ prio_destroy(struct Qdisc *sch)
 static int prio_tune(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct prio_sched_data *q = qdisc_priv(sch);
+	struct Qdisc *queues[TCQ_PRIO_BANDS];
+	int oldbands = q->bands, i;
 	struct tc_prio_qopt *qopt;
-	int i;
 
 	if (nla_len(opt) < sizeof(*qopt))
 		return -EINVAL;
@@ -187,54 +188,42 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt)
 			return -EINVAL;
 	}
 
+	/* Before commit, make sure we can allocate all new qdiscs */
+	for (i = oldbands; i < qopt->bands; i++) {
+		queues[i] = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
+					      TC_H_MAKE(sch->handle, i + 1));
+		if (!queues[i]) {
+			while (i > oldbands)
+				qdisc_destroy(queues[--i]);
+			return -ENOMEM;
+		}
+	}
+
 	sch_tree_lock(sch);
 	q->bands = qopt->bands;
 	memcpy(q->prio2band, qopt->priomap, TC_PRIO_MAX+1);
 
-	for (i = q->bands; i < TCQ_PRIO_BANDS; i++) {
+	for (i = q->bands; i < oldbands; i++) {
 		struct Qdisc *child = q->queues[i];
-		q->queues[i] = &noop_qdisc;
-		if (child != &noop_qdisc) {
-			qdisc_tree_reduce_backlog(child, child->q.qlen, child->qstats.backlog);
-			qdisc_destroy(child);
-		}
-	}
-	sch_tree_unlock(sch);
 
-	for (i = 0; i < q->bands; i++) {
-		struct Qdisc *child;
+		qdisc_tree_reduce_backlog(child, child->q.qlen,
+					  child->qstats.backlog);
+		qdisc_destroy(child);
+	}
 
-		if (q->queues[i] != &noop_qdisc)
-			continue;
+	for (i = oldbands; i < q->bands; i++)
+		q->queues[i] = queues[i];
 
-		child = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
-					  TC_H_MAKE(sch->handle, i + 1));
-		if (!child)
-			return -ENOMEM;
-		sch_tree_lock(sch);
-		q->queues[i] = child;
-		sch_tree_unlock(sch);
-	}
+	sch_tree_unlock(sch);
 	return 0;
 }
 
 static int prio_init(struct Qdisc *sch, struct nlattr *opt)
 {
-	struct prio_sched_data *q = qdisc_priv(sch);
-	int i;
-
-	for (i = 0; i < TCQ_PRIO_BANDS; i++)
-		q->queues[i] = &noop_qdisc;
-
-	if (opt == NULL) {
+	if (!opt)
 		return -EINVAL;
-	} else {
-		int err;
 
-		if ((err = prio_tune(sch, opt)) != 0)
-			return err;
-	}
-	return 0;
+	return prio_tune(sch, opt);
 }
 
 static int prio_dump(struct Qdisc *sch, struct sk_buff *skb)

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 net] net_sched: prio: insure proper transactional behavior
  2016-06-13 18:33             ` [PATCH v2 net] net_sched: prio: insure proper transactional behavior Eric Dumazet
@ 2016-06-13 21:21               ` Cong Wang
  2016-06-15 19:30               ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Cong Wang @ 2016-06-13 21:21 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Mon, Jun 13, 2016 at 11:33 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Now prio_init() can return -ENOMEM, it also has to make sure
> any allocated qdiscs are freed, since the caller (qdisc_create()) wont
> call ->destroy() handler for us.
>
> More generally, we want a transactional behavior for "tc qdisc
> change ...", so prio_tune() should not make modifications if
> any error is returned.
>
> It means that we must validate parameters and allocate missing qdisc(s)
> before taking root qdisc lock exactly once, to not leave the prio qdisc
> in an intermediate state.
>
> Fixes: cbdf45116478 ("net_sched: prio: properly report out of memory errors")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Cong Wang <xiyou.wangcong@gmail.com>

Looks good to me,

Acked-by: Cong Wang <xiyou.wangcong@gmail.com>

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 net] net_sched: prio: insure proper transactional behavior
  2016-06-13 18:33             ` [PATCH v2 net] net_sched: prio: insure proper transactional behavior Eric Dumazet
  2016-06-13 21:21               ` Cong Wang
@ 2016-06-15 19:30               ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2016-06-15 19:30 UTC (permalink / raw)
  To: eric.dumazet; +Cc: xiyou.wangcong, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 13 Jun 2016 11:33:32 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> Now prio_init() can return -ENOMEM, it also has to make sure
> any allocated qdiscs are freed, since the caller (qdisc_create()) wont
> call ->destroy() handler for us.
> 
> More generally, we want a transactional behavior for "tc qdisc
> change ...", so prio_tune() should not make modifications if
> any error is returned.
> 
> It means that we must validate parameters and allocate missing qdisc(s)
> before taking root qdisc lock exactly once, to not leave the prio qdisc
> in an intermediate state.
> 
> Fixes: cbdf45116478 ("net_sched: prio: properly report out of memory errors")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-06-15 19:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-12 23:21 [PATCH net] net_sched: prio: properly report out of memory errors Eric Dumazet
2016-06-13  1:57 ` David Miller
2016-06-13  3:45 ` Cong Wang
2016-06-13  4:29   ` Eric Dumazet
2016-06-13  5:03     ` [PATCH net] net_sched: prio: rollback allocations if prio_init() fails Eric Dumazet
2016-06-13 16:28       ` Cong Wang
2016-06-13 17:13         ` Eric Dumazet
2016-06-13 17:27           ` Eric Dumazet
2016-06-13 17:31             ` Cong Wang
2016-06-13 18:33             ` [PATCH v2 net] net_sched: prio: insure proper transactional behavior Eric Dumazet
2016-06-13 21:21               ` Cong Wang
2016-06-15 19:30               ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.