All of lore.kernel.org
 help / color / mirror / Atom feed
* size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
@ 2015-12-01  1:00 Daniele Fucini
  2015-12-01  4:50 ` Cong Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Daniele Fucini @ 2015-12-01  1:00 UTC (permalink / raw)
  To: netdev; +Cc: jhs, davem, spender, pageexec, re.emese

[-- Attachment #1: Type: text/plain, Size: 618 bytes --]

Hello,

I'm using a Grsecurity patched kernel (version 4.2.6-201511282239) and
I'm getting system freezes due to PaX detecting a size overflow in
function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 whenever I use
Transmission BitTorrent client.

On the Grsecurity forum I was told it's probably an unintended integer
underflow that I should report upstream.

Here's the relevant log:
https://gist.github.com/cf54ccbb12ea65e146d4

And here's the thread on Grsecurity forum:
https://forums.grsecurity.net/viewtopic.php?f=3&t=4327

Let me know if there's any other information you need to help debug
this.

Daniele

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01  1:00 size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c Daniele Fucini
@ 2015-12-01  4:50 ` Cong Wang
  2015-12-01 11:19   ` Daniele Fucini
  0 siblings, 1 reply; 19+ messages in thread
From: Cong Wang @ 2015-12-01  4:50 UTC (permalink / raw)
  To: Daniele Fucini
  Cc: netdev, Jamal Hadi Salim, David Miller, spender, pageexec, re.emese

On Mon, Nov 30, 2015 at 5:00 PM, Daniele Fucini <dfucini@gmail.com> wrote:
> Hello,
>
> I'm using a Grsecurity patched kernel (version 4.2.6-201511282239) and
> I'm getting system freezes due to PaX detecting a size overflow in
> function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 whenever I use
> Transmission BitTorrent client.
>
> On the Grsecurity forum I was told it's probably an unintended integer
> underflow that I should report upstream.
>
> Here's the relevant log:
> https://gist.github.com/cf54ccbb12ea65e146d4
>

Looks like we miss some sch->q.qlen accounting somewhere...

What is your qdisc setup? Is your fq_codel the default one or you installed it
or some other qdisc somewhere (`tc qdisc show` could tell)?

I will take a deeper look tomorrow, or maybe Jamal could find something
before I wake up. ;)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01  4:50 ` Cong Wang
@ 2015-12-01 11:19   ` Daniele Fucini
  2015-12-01 14:06     ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Daniele Fucini @ 2015-12-01 11:19 UTC (permalink / raw)
  To: Cong Wang
  Cc: netdev, Jamal Hadi Salim, David Miller, spender, pageexec, re.emese

[-- Attachment #1: Type: text/plain, Size: 1076 bytes --]

Thanks for the reply. Here's the output of `tc qdisc show`:
https://gist.github.com/1847102c8fe08f63e9e7

Daniele


On Mon, Nov 30, 2015 at 08:50:29PM -0800, Cong Wang wrote:
> On Mon, Nov 30, 2015 at 5:00 PM, Daniele Fucini <dfucini@gmail.com> wrote:
> > Hello,
> >
> > I'm using a Grsecurity patched kernel (version 4.2.6-201511282239) and
> > I'm getting system freezes due to PaX detecting a size overflow in
> > function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 whenever I use
> > Transmission BitTorrent client.
> >
> > On the Grsecurity forum I was told it's probably an unintended integer
> > underflow that I should report upstream.
> >
> > Here's the relevant log:
> > https://gist.github.com/cf54ccbb12ea65e146d4
> >
> 
> Looks like we miss some sch->q.qlen accounting somewhere...
> 
> What is your qdisc setup? Is your fq_codel the default one or you installed it
> or some other qdisc somewhere (`tc qdisc show` could tell)?
> 
> I will take a deeper look tomorrow, or maybe Jamal could find something
> before I wake up. ;)

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 11:19   ` Daniele Fucini
@ 2015-12-01 14:06     ` Eric Dumazet
  2015-12-01 14:10       ` Eric Dumazet
  2015-12-01 14:15       ` Eric Dumazet
  0 siblings, 2 replies; 19+ messages in thread
From: Eric Dumazet @ 2015-12-01 14:06 UTC (permalink / raw)
  To: Daniele Fucini
  Cc: Cong Wang, netdev, Jamal Hadi Salim, David Miller, spender,
	pageexec, re.emese

On Tue, 2015-12-01 at 12:19 +0100, Daniele Fucini wrote:
> Thanks for the reply. Here's the output of `tc qdisc show`:
> https://gist.github.com/1847102c8fe08f63e9e7
> 
> Daniele
> 
> 
> On Mon, Nov 30, 2015 at 08:50:29PM -0800, Cong Wang wrote:
> > On Mon, Nov 30, 2015 at 5:00 PM, Daniele Fucini <dfucini@gmail.com> wrote:
> > > Hello,
> > >
> > > I'm using a Grsecurity patched kernel (version 4.2.6-201511282239) and
> > > I'm getting system freezes due to PaX detecting a size overflow in
> > > function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 whenever I use
> > > Transmission BitTorrent client.
> > >
> > > On the Grsecurity forum I was told it's probably an unintended integer
> > > underflow that I should report upstream.
> > >
> > > Here's the relevant log:
> > > https://gist.github.com/cf54ccbb12ea65e146d4
> > >
> > 
> > Looks like we miss some sch->q.qlen accounting somewhere...
> > 
> > What is your qdisc setup? Is your fq_codel the default one or you installed it
> > or some other qdisc somewhere (`tc qdisc show` could tell)?
> > 
> > I will take a deeper look tomorrow, or maybe Jamal could find something
> > before I wake up. ;)

Hmm... I do not think we ever took care of MQ in
qdisc_tree_decrease_qlen()

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 14:06     ` Eric Dumazet
@ 2015-12-01 14:10       ` Eric Dumazet
  2015-12-01 16:13         ` PaX Team
  2015-12-01 14:15       ` Eric Dumazet
  1 sibling, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2015-12-01 14:10 UTC (permalink / raw)
  To: Daniele Fucini
  Cc: Cong Wang, netdev, Jamal Hadi Salim, David Miller, spender,
	pageexec, re.emese

On Tue, 2015-12-01 at 06:06 -0800, Eric Dumazet wrote:
> On Tue, 2015-12-01 at 12:19 +0100, Daniele Fucini wrote:
> > Thanks for the reply. Here's the output of `tc qdisc show`:
> > https://gist.github.com/1847102c8fe08f63e9e7

> Hmm... I do not think we ever took care of MQ in
> qdisc_tree_decrease_qlen()

This looks like a false positive, because MQ recomputes backlog/qlen at
the time (stats) dumps are requested.

I would say there is no bug.

static int mq_dump(struct Qdisc *sch, struct sk_buff *skb)
{
        struct net_device *dev = qdisc_dev(sch);
        struct Qdisc *qdisc;
        unsigned int ntx;

        sch->q.qlen = 0;
        memset(&sch->bstats, 0, sizeof(sch->bstats));
        memset(&sch->qstats, 0, sizeof(sch->qstats));

        for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
                qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping;
                spin_lock_bh(qdisc_lock(qdisc));
                sch->q.qlen             += qdisc->q.qlen;
                sch->bstats.bytes       += qdisc->bstats.bytes;
                sch->bstats.packets     += qdisc->bstats.packets;
                sch->qstats.backlog     += qdisc->qstats.backlog;
                sch->qstats.drops       += qdisc->qstats.drops;
                sch->qstats.requeues    += qdisc->qstats.requeues;
                sch->qstats.overlimits  += qdisc->qstats.overlimits;
                spin_unlock_bh(qdisc_lock(qdisc));
        }
        return 0;
}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 14:06     ` Eric Dumazet
  2015-12-01 14:10       ` Eric Dumazet
@ 2015-12-01 14:15       ` Eric Dumazet
  2015-12-01 19:13         ` Daniele Fucini
  1 sibling, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2015-12-01 14:15 UTC (permalink / raw)
  To: Daniele Fucini
  Cc: Cong Wang, netdev, Jamal Hadi Salim, David Miller, spender,
	pageexec, re.emese

On Tue, 2015-12-01 at 06:06 -0800, Eric Dumazet wrote:
> On Tue, 2015-12-01 at 12:19 +0100, Daniele Fucini wrote:
> > Thanks for the reply. Here's the output of `tc qdisc show`:
> > https://gist.github.com/1847102c8fe08f63e9e7
> > 
> > Daniele
> > 
> > 
> > On Mon, Nov 30, 2015 at 08:50:29PM -0800, Cong Wang wrote:
> > > On Mon, Nov 30, 2015 at 5:00 PM, Daniele Fucini <dfucini@gmail.com> wrote:
> > > > Hello,
> > > >
> > > > I'm using a Grsecurity patched kernel (version 4.2.6-201511282239) and
> > > > I'm getting system freezes due to PaX detecting a size overflow in
> > > > function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 whenever I use
> > > > Transmission BitTorrent client.
> > > >
> > > > On the Grsecurity forum I was told it's probably an unintended integer
> > > > underflow that I should report upstream.
> > > >
> > > > Here's the relevant log:
> > > > https://gist.github.com/cf54ccbb12ea65e146d4
> > > >
> > > 
> > > Looks like we miss some sch->q.qlen accounting somewhere...
> > > 
> > > What is your qdisc setup? Is your fq_codel the default one or you installed it
> > > or some other qdisc somewhere (`tc qdisc show` could tell)?
> > > 
> > > I will take a deeper look tomorrow, or maybe Jamal could find something
> > > before I wake up. ;)
> 
> Hmm... I do not think we ever took care of MQ in
> qdisc_tree_decrease_qlen()

Please try following fix :

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index f43c8f33f09e..72f2c1dfdcde 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -759,6 +759,8 @@ void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
 			WARN_ON(parentid != TC_H_ROOT);
 			return;
 		}
+		if (sch->flags & TCQ_F_MQROOT)
+			return;
 		cops = sch->ops->cl_ops;
 		if (cops->qlen_notify) {
 			cl = cops->get(sch, parentid);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 14:10       ` Eric Dumazet
@ 2015-12-01 16:13         ` PaX Team
  2015-12-01 16:34           ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: PaX Team @ 2015-12-01 16:13 UTC (permalink / raw)
  To: Daniele Fucini, Eric Dumazet
  Cc: Cong Wang, netdev, Jamal Hadi Salim, David Miller, spender, re.emese

On 1 Dec 2015 at 6:10, Eric Dumazet wrote:

> On Tue, 2015-12-01 at 06:06 -0800, Eric Dumazet wrote:
> > On Tue, 2015-12-01 at 12:19 +0100, Daniele Fucini wrote:
> > > Thanks for the reply. Here's the output of `tc qdisc show`:
> > > https://gist.github.com/1847102c8fe08f63e9e7
> 
> > Hmm... I do not think we ever took care of MQ in
> > qdisc_tree_decrease_qlen()
> 
> This looks like a false positive, because MQ recomputes backlog/qlen at
> the time (stats) dumps are requested.
> 
> I would say there is no bug.

is it correct for sk_buff_head.qlen to underflow in general
or just in this particular sched code? (we can exclude overflow
checking for either case but obviously would like to retain as
much coverage as possible)

thanks,
 PaX Team

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 16:13         ` PaX Team
@ 2015-12-01 16:34           ` Eric Dumazet
  2015-12-01 18:43             ` Cong Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2015-12-01 16:34 UTC (permalink / raw)
  To: pageexec
  Cc: Daniele Fucini, Cong Wang, netdev, Jamal Hadi Salim,
	David Miller, spender, re.emese

On Tue, 2015-12-01 at 17:13 +0100, PaX Team wrote:
> On 1 Dec 2015 at 6:10, Eric Dumazet wrote:
> 
> > On Tue, 2015-12-01 at 06:06 -0800, Eric Dumazet wrote:
> > > On Tue, 2015-12-01 at 12:19 +0100, Daniele Fucini wrote:
> > > > Thanks for the reply. Here's the output of `tc qdisc show`:
> > > > https://gist.github.com/1847102c8fe08f63e9e7
> > 
> > > Hmm... I do not think we ever took care of MQ in
> > > qdisc_tree_decrease_qlen()
> > 
> > This looks like a false positive, because MQ recomputes backlog/qlen at
> > the time (stats) dumps are requested.
> > 
> > I would say there is no bug.
> 
> is it correct for sk_buff_head.qlen to underflow in general
> or just in this particular sched code? (we can exclude overflow
> checking for either case but obviously would like to retain as
> much coverage as possible)

We do not particularly care on the qlen value for MQ, although we simply
should avoid doing anything.

Otherwise we might report slightly wrong qlen values while doing "tc -s
qdisc show" if the qdisc_tree_decrease_qlen() happens right after we
computed the folded values in mq_dump()

Please try the patch I sent. I am pretty sure it should be the right
fix.

I will submit it formally once you tell me it is fixing the issue you
reported.


diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index f43c8f33f09e..72f2c1dfdcde 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -759,6 +759,8 @@ void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
 			WARN_ON(parentid != TC_H_ROOT);
 			return;
 		}
+		if (sch->flags & TCQ_F_MQROOT)
+			return;
 		cops = sch->ops->cl_ops;
 		if (cops->qlen_notify) {
 			cl = cops->get(sch, parentid);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 16:34           ` Eric Dumazet
@ 2015-12-01 18:43             ` Cong Wang
  2015-12-01 19:09               ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Cong Wang @ 2015-12-01 18:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

On Tue, Dec 1, 2015 at 8:34 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2015-12-01 at 17:13 +0100, PaX Team wrote:
>> On 1 Dec 2015 at 6:10, Eric Dumazet wrote:
>>
>> > On Tue, 2015-12-01 at 06:06 -0800, Eric Dumazet wrote:
>> > > On Tue, 2015-12-01 at 12:19 +0100, Daniele Fucini wrote:
>> > > > Thanks for the reply. Here's the output of `tc qdisc show`:
>> > > > https://gist.github.com/1847102c8fe08f63e9e7
>> >
>> > > Hmm... I do not think we ever took care of MQ in
>> > > qdisc_tree_decrease_qlen()
>> >
>> > This looks like a false positive, because MQ recomputes backlog/qlen at
>> > the time (stats) dumps are requested.
>> >
>> > I would say there is no bug.
>>
>> is it correct for sk_buff_head.qlen to underflow in general
>> or just in this particular sched code? (we can exclude overflow
>> checking for either case but obviously would like to retain as
>> much coverage as possible)
>
> We do not particularly care on the qlen value for MQ, although we simply
> should avoid doing anything.
>
> Otherwise we might report slightly wrong qlen values while doing "tc -s
> qdisc show" if the qdisc_tree_decrease_qlen() happens right after we
> computed the folded values in mq_dump()
>
> Please try the patch I sent. I am pretty sure it should be the right
> fix.
>

This smells hacky... Another way to fix this is to hold the qdisc tree
lock in mq_dump(), since it is not a hot path (comparing with
enqueue/dequeue)?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 18:43             ` Cong Wang
@ 2015-12-01 19:09               ` Eric Dumazet
  2015-12-01 19:17                 ` Cong Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2015-12-01 19:09 UTC (permalink / raw)
  To: Cong Wang
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

On Tue, 2015-12-01 at 10:43 -0800, Cong Wang wrote:

> This smells hacky... Another way to fix this is to hold the qdisc tree
> lock in mq_dump(), since it is not a hot path (comparing with
> enqueue/dequeue)?

Really ? Which qdisc tree lock will protect you exactly ???

Whole point of MQ is that each TX queue has its own lock.

So multiple cpus can call qdisc_tree_decrease_qlen() at the same time,
holding their own lock.

Clearly modifying mq 'data' is wrong.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 14:15       ` Eric Dumazet
@ 2015-12-01 19:13         ` Daniele Fucini
  0 siblings, 0 replies; 19+ messages in thread
From: Daniele Fucini @ 2015-12-01 19:13 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Cong Wang, netdev, Jamal Hadi Salim, David Miller, spender,
	pageexec, re.emese

[-- Attachment #1: Type: text/plain, Size: 821 bytes --]

I can't be 100% sure but after running transmission for over an hour
without any problems I'd say this fix worked. I'll keep it running for a
while so if it suddently freezes again I'll report it, but I hope it
won't be the case.

Daniele

On Tue, Dec 01, 2015 at 06:15:37AM -0800, Eric Dumazet wrote:
> 
> Please try following fix :
> 
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index f43c8f33f09e..72f2c1dfdcde 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -759,6 +759,8 @@ void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
>  			WARN_ON(parentid != TC_H_ROOT);
>  			return;
>  		}
> +		if (sch->flags & TCQ_F_MQROOT)
> +			return;
>  		cops = sch->ops->cl_ops;
>  		if (cops->qlen_notify) {
>  			cl = cops->get(sch, parentid);
> 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 19:09               ` Eric Dumazet
@ 2015-12-01 19:17                 ` Cong Wang
  2015-12-01 20:06                   ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Cong Wang @ 2015-12-01 19:17 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

On Tue, Dec 1, 2015 at 11:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2015-12-01 at 10:43 -0800, Cong Wang wrote:
>
>> This smells hacky... Another way to fix this is to hold the qdisc tree
>> lock in mq_dump(), since it is not a hot path (comparing with
>> enqueue/dequeue)?
>
> Really ? Which qdisc tree lock will protect you exactly ???
>
> Whole point of MQ is that each TX queue has its own lock.
>
> So multiple cpus can call qdisc_tree_decrease_qlen() at the same time,
> holding their own lock.
>
> Clearly modifying mq 'data' is wrong.

Ah, yeah, but mq _seems_ also the only one who modifies sch->q.qlen
in ->dump(), which is the root cause of this bug. I am wondering if it should
just compute the qlen and return it without modifying sch->q.qlen.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 19:17                 ` Cong Wang
@ 2015-12-01 20:06                   ` Eric Dumazet
  2015-12-01 22:33                     ` Eric Dumazet
  2015-12-01 22:40                     ` size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c Cong Wang
  0 siblings, 2 replies; 19+ messages in thread
From: Eric Dumazet @ 2015-12-01 20:06 UTC (permalink / raw)
  To: Cong Wang
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

On Tue, 2015-12-01 at 11:17 -0800, Cong Wang wrote:
> On Tue, Dec 1, 2015 at 11:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tue, 2015-12-01 at 10:43 -0800, Cong Wang wrote:
> >
> >> This smells hacky... Another way to fix this is to hold the qdisc tree
> >> lock in mq_dump(), since it is not a hot path (comparing with
> >> enqueue/dequeue)?
> >
> > Really ? Which qdisc tree lock will protect you exactly ???
> >
> > Whole point of MQ is that each TX queue has its own lock.
> >
> > So multiple cpus can call qdisc_tree_decrease_qlen() at the same time,
> > holding their own lock.
> >
> > Clearly modifying mq 'data' is wrong.
> 
> Ah, yeah, but mq _seems_ also the only one who modifies sch->q.qlen
> in ->dump(), which is the root cause of this bug. I am wondering if it should
> just compute the qlen and return it without modifying sch->q.qlen.

Sure, but then we still would get PAX underflows warnings ...

Also need to take care of sch->qstats.drops += count;

Also that would require a change of ->dump() api, since tc_fill_qdisc()
does :

if (q->ops->dump && q->ops->dump(q, skb) < 0)
    goto nla_put_failure;
qlen = q->q.qlen;

Not sure it is worth the pain, changing signature of all ->dump()
handlers...


What about adding TCQ_F_NOPARENT and then :

Note : Seems to be more invasive patch for net tree (need to properly
set TCQ_F_NOPARENT)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index f43c8f33f09e..20c462b48404 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -753,6 +753,9 @@ void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
        while ((parentid = sch->parent)) {
                if (TC_H_MAJ(parentid) == TC_H_MAJ(TC_H_INGRESS))
                        return;
+               /* This qdisc has no 'parent', we are at the top of the tree */
+               if (sch->flags & TCQ_F_NOPARENT)
+                       return;
 
                sch = qdisc_lookup(qdisc_dev(sch), TC_H_MAJ(parentid));
                if (sch == NULL) {
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 20:06                   ` Eric Dumazet
@ 2015-12-01 22:33                     ` Eric Dumazet
  2015-12-01 22:47                       ` Cong Wang
  2015-12-01 22:40                     ` size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c Cong Wang
  1 sibling, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2015-12-01 22:33 UTC (permalink / raw)
  To: Cong Wang
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

On Tue, 2015-12-01 at 12:06 -0800, Eric Dumazet wrote:
> On Tue, 2015-12-01 at 11:17 -0800, Cong Wang wrote:
> > On Tue, Dec 1, 2015 at 11:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > On Tue, 2015-12-01 at 10:43 -0800, Cong Wang wrote:
> > >
> > >> This smells hacky... Another way to fix this is to hold the qdisc tree
> > >> lock in mq_dump(), since it is not a hot path (comparing with
> > >> enqueue/dequeue)?
> > >
> > > Really ? Which qdisc tree lock will protect you exactly ???
> > >
> > > Whole point of MQ is that each TX queue has its own lock.
> > >
> > > So multiple cpus can call qdisc_tree_decrease_qlen() at the same time,
> > > holding their own lock.
> > >
> > > Clearly modifying mq 'data' is wrong.
> > 
> > Ah, yeah, but mq _seems_ also the only one who modifies sch->q.qlen
> > in ->dump(), which is the root cause of this bug. I am wondering if it should
> > just compute the qlen and return it without modifying sch->q.qlen.
> 
> Sure, but then we still would get PAX underflows warnings ...
> 
> Also need to take care of sch->qstats.drops += count;
> 
> Also that would require a change of ->dump() api, since tc_fill_qdisc()
> does :
> 
> if (q->ops->dump && q->ops->dump(q, skb) < 0)
>     goto nla_put_failure;
> qlen = q->q.qlen;
> 
> Not sure it is worth the pain, changing signature of all ->dump()
> handlers...
> 
> 
> What about adding TCQ_F_NOPARENT and then :
> 
> Note : Seems to be more invasive patch for net tree (need to properly
> set TCQ_F_NOPARENT)


Hmm... it looks like we have a much more serious bug :

qdisc_lookup() calls qdisc_match_from_root(dev->qdisc, handle) without
proper lock being held, so we might actually crash the host,
if qdisc_tree_decrease_qlen() happens at the time qdiscs are changed. 

qdisc_tree_decrease_qlen() needs serious care :(

Damned.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 20:06                   ` Eric Dumazet
  2015-12-01 22:33                     ` Eric Dumazet
@ 2015-12-01 22:40                     ` Cong Wang
  1 sibling, 0 replies; 19+ messages in thread
From: Cong Wang @ 2015-12-01 22:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

On Tue, Dec 1, 2015 at 12:06 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2015-12-01 at 11:17 -0800, Cong Wang wrote:
>> On Tue, Dec 1, 2015 at 11:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On Tue, 2015-12-01 at 10:43 -0800, Cong Wang wrote:
>> >
>> >> This smells hacky... Another way to fix this is to hold the qdisc tree
>> >> lock in mq_dump(), since it is not a hot path (comparing with
>> >> enqueue/dequeue)?
>> >
>> > Really ? Which qdisc tree lock will protect you exactly ???
>> >
>> > Whole point of MQ is that each TX queue has its own lock.
>> >
>> > So multiple cpus can call qdisc_tree_decrease_qlen() at the same time,
>> > holding their own lock.
>> >
>> > Clearly modifying mq 'data' is wrong.
>>
>> Ah, yeah, but mq _seems_ also the only one who modifies sch->q.qlen
>> in ->dump(), which is the root cause of this bug. I am wondering if it should
>> just compute the qlen and return it without modifying sch->q.qlen.
>
> Sure, but then we still would get PAX underflows warnings ...
>
> Also need to take care of sch->qstats.drops += count;
>
> Also that would require a change of ->dump() api, since tc_fill_qdisc()
> does :
>
> if (q->ops->dump && q->ops->dump(q, skb) < 0)
>     goto nla_put_failure;
> qlen = q->q.qlen;
>
> Not sure it is worth the pain, changing signature of all ->dump()
> handlers...

Yeah, I am fully aware of that, your patch is a quick fix, I was trying
to see if there is any long-term fix for this.

>
>
> What about adding TCQ_F_NOPARENT and then :

This seems equivalent to your fix since TCQ_F_MQROOT implies
no parent:

        if (sch->parent != TC_H_ROOT)
                return -EOPNOTSUPP;

Again, your patch is fine, just want to check if there is any better fix.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 22:33                     ` Eric Dumazet
@ 2015-12-01 22:47                       ` Cong Wang
  2015-12-01 23:10                         ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Cong Wang @ 2015-12-01 22:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

On Tue, Dec 1, 2015 at 2:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Hmm... it looks like we have a much more serious bug :
>
> qdisc_lookup() calls qdisc_match_from_root(dev->qdisc, handle) without
> proper lock being held, so we might actually crash the host,
> if qdisc_tree_decrease_qlen() happens at the time qdiscs are changed.
>
> qdisc_tree_decrease_qlen() needs serious care :(

Convert qdisc list to RCU protected?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c
  2015-12-01 22:47                       ` Cong Wang
@ 2015-12-01 23:10                         ` Eric Dumazet
  2015-12-02  4:08                           ` [PATCH net] net_sched: fix qdisc_tree_decrease_qlen() races Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2015-12-01 23:10 UTC (permalink / raw)
  To: Cong Wang
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

On Tue, 2015-12-01 at 14:47 -0800, Cong Wang wrote:
> On Tue, Dec 1, 2015 at 2:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Hmm... it looks like we have a much more serious bug :
> >
> > qdisc_lookup() calls qdisc_match_from_root(dev->qdisc, handle) without
> > proper lock being held, so we might actually crash the host,
> > if qdisc_tree_decrease_qlen() happens at the time qdiscs are changed.
> >
> > qdisc_tree_decrease_qlen() needs serious care :(
> 
> Convert qdisc list to RCU protected?

Yes, or/and add a per txqueue list, to shorten lookup times !

If we have a per txqueue list, we do not need RCU as we already own the
qdisc lock.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH net] net_sched: fix qdisc_tree_decrease_qlen() races
  2015-12-01 23:10                         ` Eric Dumazet
@ 2015-12-02  4:08                           ` Eric Dumazet
  2015-12-03 19:59                             ` David Miller
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2015-12-02  4:08 UTC (permalink / raw)
  To: Cong Wang
  Cc: pageexec, Daniele Fucini, netdev, Jamal Hadi Salim, David Miller,
	spender, re.emese

From: Eric Dumazet <edumazet@google.com>

qdisc_tree_decrease_qlen() suffers from two problems on multiqueue
devices.

One problem is that it updates sch->q.qlen and sch->qstats.drops
on the mq/mqprio root qdisc, while it should not : Daniele
reported underflows errors : 
[  681.774821] PAX: sch->q.qlen: 0 n: 1
[  681.774825] PAX: size overflow detected in function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 cicus.693_49 min, count: 72, decl: qlen; num: 0; context: sk_buff_head;
[  681.774954] CPU: 2 PID: 19 Comm: ksoftirqd/2 Tainted: G           O    4.2.6.201511282239-1-grsec #1
[  681.774955] Hardware name: ASUSTeK COMPUTER INC. X302LJ/X302LJ, BIOS X302LJ.202 03/05/2015
[  681.774956]  ffffffffa9a04863 0000000000000000 0000000000000000 ffffffffa990ff7c
[  681.774959]  ffffc90000d3bc38 ffffffffa95d2810 0000000000000007 ffffffffa991002b
[  681.774960]  ffffc90000d3bc68 ffffffffa91a44f4 0000000000000001 0000000000000001
[  681.774962] Call Trace:
[  681.774967]  [<ffffffffa95d2810>] dump_stack+0x4c/0x7f
[  681.774970]  [<ffffffffa91a44f4>] report_size_overflow+0x34/0x50
[  681.774972]  [<ffffffffa94d17e2>] qdisc_tree_decrease_qlen+0x152/0x160
[  681.774976]  [<ffffffffc02694b1>] fq_codel_dequeue+0x7b1/0x820 [sch_fq_codel]
[  681.774978]  [<ffffffffc02680a0>] ? qdisc_peek_dequeued+0xa0/0xa0 [sch_fq_codel]
[  681.774980]  [<ffffffffa94cd92d>] __qdisc_run+0x4d/0x1d0
[  681.774983]  [<ffffffffa949b2b2>] net_tx_action+0xc2/0x160
[  681.774985]  [<ffffffffa90664c1>] __do_softirq+0xf1/0x200
[  681.774987]  [<ffffffffa90665ee>] run_ksoftirqd+0x1e/0x30
[  681.774989]  [<ffffffffa90896b0>] smpboot_thread_fn+0x150/0x260
[  681.774991]  [<ffffffffa9089560>] ? sort_range+0x40/0x40
[  681.774992]  [<ffffffffa9085fe4>] kthread+0xe4/0x100
[  681.774994]  [<ffffffffa9085f00>] ? kthread_worker_fn+0x170/0x170
[  681.774995]  [<ffffffffa95d8d1e>] ret_from_fork+0x3e/0x70

mq/mqprio have their own ways to report qlen/drops by folding stats on
all their queues, with appropriate locking.

A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup()
without proper locking : concurrent qdisc updates could corrupt the list
that qdisc_match_from_root() parses to find a qdisc given its handle.

Fix first problem adding a TCQ_F_NOPARENT qdisc flag that
qdisc_tree_decrease_qlen() can use to abort its tree traversal,
as soon as it meets a mq/mqprio qdisc children.

Second problem can be fixed by RCU protection.
Qdisc are already freed after RCU grace period, so qdisc_list_add() and
qdisc_list_del() simply have to use appropriate rcu list variants.

A future patch will add a per struct netdev_queue list anchor, so that
qdisc_tree_decrease_qlen() can have more efficient lookups.

Reported-by: Daniele Fucini <dfucini@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <cwang@twopensource.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/net/sch_generic.h |    3 +++
 net/sched/sch_api.c       |   27 ++++++++++++++++++---------
 net/sched/sch_generic.c   |    2 +-
 net/sched/sch_mq.c        |    4 ++--
 net/sched/sch_mqprio.c    |    4 ++--
 5 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 4c79ce8c1f92..b2a8e6338576 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -61,6 +61,9 @@ struct Qdisc {
 				      */
 #define TCQ_F_WARN_NONWC	(1 << 16)
 #define TCQ_F_CPUSTATS		0x20 /* run using percpu statistics */
+#define TCQ_F_NOPARENT		0x40 /* root of its hierarchy :
+				      * qdisc_tree_decrease_qlen() should stop.
+				      */
 	u32			limit;
 	const struct Qdisc_ops	*ops;
 	struct qdisc_size_table	__rcu *stab;
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index f43c8f33f09e..7ec667dd4ce1 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -253,7 +253,8 @@ int qdisc_set_default(const char *name)
 }
 
 /* We know handle. Find qdisc among all qdisc's attached to device
-   (root qdisc, all its children, children of children etc.)
+ * (root qdisc, all its children, children of children etc.)
+ * Note: caller either uses rtnl or rcu_read_lock()
  */
 
 static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
@@ -264,7 +265,7 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
 	    root->handle == handle)
 		return root;
 
-	list_for_each_entry(q, &root->list, list) {
+	list_for_each_entry_rcu(q, &root->list, list) {
 		if (q->handle == handle)
 			return q;
 	}
@@ -277,15 +278,18 @@ void qdisc_list_add(struct Qdisc *q)
 		struct Qdisc *root = qdisc_dev(q)->qdisc;
 
 		WARN_ON_ONCE(root == &noop_qdisc);
-		list_add_tail(&q->list, &root->list);
+		ASSERT_RTNL();
+		list_add_tail_rcu(&q->list, &root->list);
 	}
 }
 EXPORT_SYMBOL(qdisc_list_add);
 
 void qdisc_list_del(struct Qdisc *q)
 {
-	if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS))
-		list_del(&q->list);
+	if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
+		ASSERT_RTNL();
+		list_del_rcu(&q->list);
+	}
 }
 EXPORT_SYMBOL(qdisc_list_del);
 
@@ -750,14 +754,18 @@ void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
 	if (n == 0)
 		return;
 	drops = max_t(int, n, 0);
+	rcu_read_lock();
 	while ((parentid = sch->parent)) {
 		if (TC_H_MAJ(parentid) == TC_H_MAJ(TC_H_INGRESS))
-			return;
+			break;
 
+		if (sch->flags & TCQ_F_NOPARENT)
+			break;
+		/* TODO: perform the search on a per txq basis */
 		sch = qdisc_lookup(qdisc_dev(sch), TC_H_MAJ(parentid));
 		if (sch == NULL) {
-			WARN_ON(parentid != TC_H_ROOT);
-			return;
+			WARN_ON_ONCE(parentid != TC_H_ROOT);
+			break;
 		}
 		cops = sch->ops->cl_ops;
 		if (cops->qlen_notify) {
@@ -768,6 +776,7 @@ void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
 		sch->q.qlen -= n;
 		__qdisc_qstats_drop(sch, drops);
 	}
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL(qdisc_tree_decrease_qlen);
 
@@ -941,7 +950,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
 		}
 		lockdep_set_class(qdisc_lock(sch), &qdisc_tx_lock);
 		if (!netif_is_multiqueue(dev))
-			sch->flags |= TCQ_F_ONETXQUEUE;
+			sch->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
 	}
 
 	sch->handle = handle;
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cb5d4ad32946..e82a1ad80aa5 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -737,7 +737,7 @@ static void attach_one_default_qdisc(struct net_device *dev,
 		return;
 	}
 	if (!netif_is_multiqueue(dev))
-		qdisc->flags |= TCQ_F_ONETXQUEUE;
+		qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
 	dev_queue->qdisc_sleeping = qdisc;
 }
 
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index f3cbaecd283a..3e82f047caaf 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -63,7 +63,7 @@ static int mq_init(struct Qdisc *sch, struct nlattr *opt)
 		if (qdisc == NULL)
 			goto err;
 		priv->qdiscs[ntx] = qdisc;
-		qdisc->flags |= TCQ_F_ONETXQUEUE;
+		qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
 	}
 
 	sch->flags |= TCQ_F_MQROOT;
@@ -156,7 +156,7 @@ static int mq_graft(struct Qdisc *sch, unsigned long cl, struct Qdisc *new,
 
 	*old = dev_graft_qdisc(dev_queue, new);
 	if (new)
-		new->flags |= TCQ_F_ONETXQUEUE;
+		new->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
 	if (dev->flags & IFF_UP)
 		dev_activate(dev);
 	return 0;
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index 3811a745452c..ad70ecf57ce7 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -132,7 +132,7 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt)
 			goto err;
 		}
 		priv->qdiscs[i] = qdisc;
-		qdisc->flags |= TCQ_F_ONETXQUEUE;
+		qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
 	}
 
 	/* If the mqprio options indicate that hardware should own
@@ -209,7 +209,7 @@ static int mqprio_graft(struct Qdisc *sch, unsigned long cl, struct Qdisc *new,
 	*old = dev_graft_qdisc(dev_queue, new);
 
 	if (new)
-		new->flags |= TCQ_F_ONETXQUEUE;
+		new->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
 
 	if (dev->flags & IFF_UP)
 		dev_activate(dev);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH net] net_sched: fix qdisc_tree_decrease_qlen() races
  2015-12-02  4:08                           ` [PATCH net] net_sched: fix qdisc_tree_decrease_qlen() races Eric Dumazet
@ 2015-12-03 19:59                             ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2015-12-03 19:59 UTC (permalink / raw)
  To: eric.dumazet; +Cc: cwang, pageexec, dfucini, netdev, jhs, spender, re.emese

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 01 Dec 2015 20:08:51 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> qdisc_tree_decrease_qlen() suffers from two problems on multiqueue
> devices.
> 
> One problem is that it updates sch->q.qlen and sch->qstats.drops
> on the mq/mqprio root qdisc, while it should not : Daniele
> reported underflows errors : 
 ...
> mq/mqprio have their own ways to report qlen/drops by folding stats on
> all their queues, with appropriate locking.
> 
> A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup()
> without proper locking : concurrent qdisc updates could corrupt the list
> that qdisc_match_from_root() parses to find a qdisc given its handle.
> 
> Fix first problem adding a TCQ_F_NOPARENT qdisc flag that
> qdisc_tree_decrease_qlen() can use to abort its tree traversal,
> as soon as it meets a mq/mqprio qdisc children.
> 
> Second problem can be fixed by RCU protection.
> Qdisc are already freed after RCU grace period, so qdisc_list_add() and
> qdisc_list_del() simply have to use appropriate rcu list variants.
> 
> A future patch will add a per struct netdev_queue list anchor, so that
> qdisc_tree_decrease_qlen() can have more efficient lookups.
> 
> Reported-by: Daniele Fucini <dfucini@gmail.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks Eric!

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-12-03 19:59 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-01  1:00 size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c Daniele Fucini
2015-12-01  4:50 ` Cong Wang
2015-12-01 11:19   ` Daniele Fucini
2015-12-01 14:06     ` Eric Dumazet
2015-12-01 14:10       ` Eric Dumazet
2015-12-01 16:13         ` PaX Team
2015-12-01 16:34           ` Eric Dumazet
2015-12-01 18:43             ` Cong Wang
2015-12-01 19:09               ` Eric Dumazet
2015-12-01 19:17                 ` Cong Wang
2015-12-01 20:06                   ` Eric Dumazet
2015-12-01 22:33                     ` Eric Dumazet
2015-12-01 22:47                       ` Cong Wang
2015-12-01 23:10                         ` Eric Dumazet
2015-12-02  4:08                           ` [PATCH net] net_sched: fix qdisc_tree_decrease_qlen() races Eric Dumazet
2015-12-03 19:59                             ` David Miller
2015-12-01 22:40                     ` size overflow in function qdisc_tree_decrease_qlen net/sched/sch_api.c Cong Wang
2015-12-01 14:15       ` Eric Dumazet
2015-12-01 19:13         ` Daniele Fucini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.