tc related lockdep warning.

All of lore.kernel.org
 help / color / mirror / Atom feed

* tc related lockdep warning.
@ 2006-09-24 21:29 Dave Jones
  2006-09-25 12:43 ` Jarek Poplawski
  0 siblings, 1 reply; 21+ messages in thread
From: Dave Jones @ 2006-09-24 21:29 UTC (permalink / raw)
  To: netdev


=================================
[ INFO: inconsistent lock state ]
---------------------------------
inconsistent {softirq-on-R} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[2]:HE1:SE0] takes:
 (police_lock){-+--}, at: [<f8d304fd>] tcf_police_destroy+0x24/0x8f [act_police]
{softirq-on-R} state was registered at:
  [<c043bdd6>] lock_acquire+0x4b/0x6d
  [<c061495a>] _read_lock+0x19/0x28
  [<f8d3026a>] tcf_act_police_locate+0x26a/0x363 [act_police]
  [<c05cacc3>] tcf_action_init_1+0x113/0x1a7
  [<c05c97c9>] tcf_exts_validate+0x3c/0x85
  [<f8d4337c>] u32_set_parms+0x26/0x131 [cls_u32]
  [<f8d43dc7>] u32_change+0x2fc/0x371 [cls_u32]
  [<c05c9f44>] tc_ctl_tfilter+0x417/0x487
  [<c05c0d67>] rtnetlink_rcv_msg+0x1b3/0x1d6
  [<c05ccef3>] netlink_run_queue+0x69/0xfe
  [<c05c0b6a>] rtnetlink_rcv+0x29/0x42
  [<c05cd380>] netlink_data_ready+0x12/0x50
  [<c05cc3e8>] netlink_sendskb+0x1f/0x37
  [<c05cccc1>] netlink_unicast+0x1a1/0x1bb
  [<c05cd361>] netlink_sendmsg+0x275/0x282
  [<c05aff4a>] sock_sendmsg+0xe8/0x103
  [<c05b074d>] sys_sendmsg+0x14d/0x1a8
  [<c05b1937>] sys_socketcall+0x16b/0x186
  [<c0403fb7>] syscall_call+0x7/0xb
irq event stamp: 278833666
hardirqs last  enabled at (278833666): [<c04290b9>] tasklet_action+0x30/0xca
hardirqs last disabled at (278833665): [<c0429095>] tasklet_action+0xc/0xca
softirqs last  enabled at (278833650): [<c0429083>] __do_softirq+0xec/0xf2
softirqs last disabled at (278833659): [<c0406683>] do_softirq+0x5a/0xbe

other info that might help us debug this:
1 lock held by swapper/0:
 #0:  (qdisc_tree_lock){-+-.}, at: [<c05c7737>] __qdisc_destroy+0x20/0x85

stack backtrace:
 [<c04051ed>] show_trace_log_lvl+0x58/0x16a
 [<c04057fa>] show_trace+0xd/0x10
 [<c0405913>] dump_stack+0x19/0x1b
 [<c043a20b>] print_usage_bug+0x1cf/0x1dc
 [<c043a5e4>] mark_lock+0x124/0x353
 [<c043b2a0>] __lock_acquire+0x3d7/0x99c
 [<c043bdd6>] lock_acquire+0x4b/0x6d
 [<c06148a5>] _write_lock_bh+0x1e/0x2d
 [<f8d304fd>] tcf_police_destroy+0x24/0x8f [act_police]
 [<f8d30590>] tcf_act_police_cleanup+0x28/0x33 [act_police]
 [<c05ca1a1>] tcf_action_destroy+0x20/0x84
 [<c05c9784>] tcf_exts_destroy+0x16/0x1f
 [<f8d43114>] u32_destroy_key+0x30/0x50 [cls_u32]
 [<f8d4314f>] u32_clear_hnode+0x1b/0x2e [cls_u32]
 [<f8d4319a>] u32_destroy_hnode+0x38/0x81 [cls_u32]
 [<f8d4322d>] u32_destroy+0x4a/0xc9 [cls_u32]
 [<f8d340f9>] ingress_destroy+0x1a/0x5c [sch_ingress]
 [<c05c774d>] __qdisc_destroy+0x36/0x85
 [<c043438f>] __rcu_process_callbacks+0xfe/0x169
 [<c04346f0>] rcu_process_callbacks+0x23/0x45
 [<c04290ee>] tasklet_action+0x65/0xca
 [<c042900f>] __do_softirq+0x78/0xf2
 [<c0406683>] do_softirq+0x5a/0xbe
 [<c0428eb8>] irq_exit+0x3d/0x3f
 [<c04179df>] smp_apic_timer_interrupt+0x73/0x78
 [<c0404b12>] apic_timer_interrupt+0x2a/0x30
DWARF2 unwinder stuck at apic_timer_interrupt+0x2a/0x30
Leftover inexact backtrace:

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-24 21:29 tc related lockdep warning Dave Jones
@ 2006-09-25 12:43 ` Jarek Poplawski
  2006-09-25 12:47   ` jamal
  0 siblings, 1 reply; 21+ messages in thread
From: Jarek Poplawski @ 2006-09-25 12:43 UTC (permalink / raw)
  To: netdev; +Cc: Dave Jones, hadi, davem

On 24-09-2006 23:29, Dave Jones wrote:
> =================================
> [ INFO: inconsistent lock state ]
> ---------------------------------
> inconsistent {softirq-on-R} -> {in-softirq-W} usage.
> swapper/0 [HC0[0]:SC1[2]:HE1:SE0] takes:
>  (police_lock){-+--}, at: [<f8d304fd>] tcf_police_destroy+0x24/0x8f [act_police]
> {softirq-on-R} state was registered at:
>   [<c043bdd6>] lock_acquire+0x4b/0x6d
>   [<c061495a>] _read_lock+0x19/0x28
>   [<f8d3026a>] tcf_act_police_locate+0x26a/0x363 [act_police]
...
>  [<c06148a5>] _write_lock_bh+0x1e/0x2d
>  [<f8d304fd>] tcf_police_destroy+0x24/0x8f [act_police]
...
>  [<c042900f>] __do_softirq+0x78/0xf2
>  [<c0406683>] do_softirq+0x5a/0xbe
>  [<c0428eb8>] irq_exit+0x3d/0x3f
>  [<c04179df>] smp_apic_timer_interrupt+0x73/0x78
>  [<c0404b12>] apic_timer_interrupt+0x2a/0x30
> DWARF2 unwinder stuck at apic_timer_interrupt+0x2a/0x30
> Leftover inexact backtrace:

It's probably 2.6.18 and should change a little now (git4) but
IMHO main problem stays: it looks tcf_act_police_locate in
act_police.c was preempted in read_lock (tcf_police_lookup)
- now the same is possible in tcf_hash_lookup. So maybe
read_lock_bh will help?

Jarek P. 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-25 12:43 ` Jarek Poplawski
@ 2006-09-25 12:47   ` jamal
  2006-09-25 13:05     ` Jarek Poplawski
  2006-09-25 13:29     ` Patrick McHardy
  0 siblings, 2 replies; 21+ messages in thread
From: jamal @ 2006-09-25 12:47 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, Dave Jones, davem

On Mon, 2006-25-09 at 14:43 +0200, Jarek Poplawski wrote:

> It's probably 2.6.18 and should change a little now (git4) but
> IMHO main problem stays: it looks tcf_act_police_locate in
> act_police.c was preempted in read_lock (tcf_police_lookup)
> - now the same is possible in tcf_hash_lookup. So maybe
> read_lock_bh will help?
> 

Yes, that looks plausible. Can you try making those changes and see if
the warning is gone?

cheers,
jamal


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-25 12:47   ` jamal
@ 2006-09-25 13:05     ` Jarek Poplawski
  2006-09-25 13:29     ` Patrick McHardy
  1 sibling, 0 replies; 21+ messages in thread
From: Jarek Poplawski @ 2006-09-25 13:05 UTC (permalink / raw)
  To: hadi; +Cc: netdev, Dave Jones, davem

On 25-09-2006 14:47, jamal wrote:
> On Mon, 2006-25-09 at 14:43 +0200, Jarek Poplawski wrote:
> 
>> It's probably 2.6.18 and should change a little now (git4) but
>> IMHO main problem stays: it looks tcf_act_police_locate in
>> act_police.c was preempted in read_lock (tcf_police_lookup)
>> - now the same is possible in tcf_hash_lookup. So maybe
>> read_lock_bh will help?
>>
> 
> Yes, that looks plausible. Can you try making those changes and see if
> the warning is gone?

I hope you are asking Dave Jones... 

Jarek P. 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-25 12:47   ` jamal
  2006-09-25 13:05     ` Jarek Poplawski
@ 2006-09-25 13:29     ` Patrick McHardy
  2006-09-26 16:15       ` Patrick McHardy
  1 sibling, 1 reply; 21+ messages in thread
From: Patrick McHardy @ 2006-09-25 13:29 UTC (permalink / raw)
  To: hadi; +Cc: Jarek Poplawski, netdev, Dave Jones, davem

jamal wrote:
> On Mon, 2006-25-09 at 14:43 +0200, Jarek Poplawski wrote:
> 
> 
>>It's probably 2.6.18 and should change a little now (git4) but
>>IMHO main problem stays: it looks tcf_act_police_locate in
>>act_police.c was preempted in read_lock (tcf_police_lookup)
>>- now the same is possible in tcf_hash_lookup. So maybe
>>read_lock_bh will help?
>>
> 
> 
> Yes, that looks plausible. Can you try making those changes and see if
> the warning is gone?

I think this points to a bigger brokeness caused by the move of
dev->qdisc to RCU. It means destruction of filters and actions doesn't
necessarily happens in user-context and thus not protected by the rtnl
anymore. On first sight all actions are affected by this, as well as
the u32 classifier (race between u32_destroy and u32_init while changing
u32_list), but I bet there are more. I'm busy right now, but I'll look
into this later.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-25 13:29     ` Patrick McHardy
@ 2006-09-26 16:15       ` Patrick McHardy
  2006-09-26 21:20         ` Dave Jones
  0 siblings, 1 reply; 21+ messages in thread
From: Patrick McHardy @ 2006-09-26 16:15 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: hadi, Jarek Poplawski, netdev, Dave Jones, davem

[-- Attachment #1: Type: text/plain, Size: 1168 bytes --]

Patrick McHardy wrote:
> jamal wrote:
> 
>>Yes, that looks plausible. Can you try making those changes and see if
>>the warning is gone?
> 
>
> I think this points to a bigger brokeness caused by the move of
> dev->qdisc to RCU. It means destruction of filters and actions doesn't
> necessarily happens in user-context and thus not protected by the rtnl
> anymore.

I looked into this and we indeed still have lots of problems from that
broken RCU patch. Basically all locking (qdiscs, classifiers, actions,
estimators) assumes that updates are only done in process context and
thus read_lock doesn't need bottem half protection. Quite a few things
also assume that updates only happen under the RTNL and don't need
any further protection if not used during packet processing.

Instead of "fixing" all this I suggest something like this (untested)
patch instead. Since only the dev->qdisc pointer is protected by RCU,
but enqueue and the qdisc tree are still protected by dev->qdisc_lock,
we can perform destruction of the tree immediately and only do the
final free in the rcu callback, as long as we make sure not to enqueue
anything to a half-way destroyed qdisc.


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 3479 bytes --]

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index b0e9108..278d5e6 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -31,6 +31,7 @@ struct Qdisc
 #define TCQ_F_BUILTIN	1
 #define TCQ_F_THROTTLED	2
 #define TCQ_F_INGRESS	4
+#define TCQ_F_DYING	8
 	int			padded;
 	struct Qdisc_ops	*ops;
 	u32			handle;
diff --git a/net/core/dev.c b/net/core/dev.c
index 14de297..c9c65c2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1480,14 +1480,15 @@ #endif
 	if (q->enqueue) {
 		/* Grab device queue */
 		spin_lock(&dev->queue_lock);
+		if (likely(!(q->flags & TCQ_F_DYING))) {
+			rc = q->enqueue(skb, q);
+			qdisc_run(dev);
+			spin_unlock(&dev->queue_lock);
 
-		rc = q->enqueue(skb, q);
-
-		qdisc_run(dev);
-
+			rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
+			goto out;
+		}
 		spin_unlock(&dev->queue_lock);
-		rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
-		goto out;
 	}
 
 	/* The device has no queue. Common case for software devices:
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 6f91518..93d52e0 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -47,9 +47,8 @@ #include <net/pkt_sched.h>
      spinlock dev->queue_lock.
    - tree walking is protected by read_lock_bh(qdisc_tree_lock)
      and this lock is used only in process context.
-   - updates to tree are made under rtnl semaphore or
-     from softirq context (__qdisc_destroy rcu-callback)
-     hence this lock needs local bh disabling.
+   - updates to tree are made only under rtnl semaphore,
+     hence this lock may be made without local bh disabling.
 
    qdisc_tree_lock must be grabbed BEFORE dev->queue_lock!
  */
@@ -483,20 +482,6 @@ void qdisc_reset(struct Qdisc *qdisc)
 static void __qdisc_destroy(struct rcu_head *head)
 {
 	struct Qdisc *qdisc = container_of(head, struct Qdisc, q_rcu);
-	struct Qdisc_ops  *ops = qdisc->ops;
-
-#ifdef CONFIG_NET_ESTIMATOR
-	gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
-#endif
-	write_lock(&qdisc_tree_lock);
-	if (ops->reset)
-		ops->reset(qdisc);
-	if (ops->destroy)
-		ops->destroy(qdisc);
-	write_unlock(&qdisc_tree_lock);
-	module_put(ops->owner);
-
-	dev_put(qdisc->dev);
 	kfree((char *) qdisc - qdisc->padded);
 }
 
@@ -504,32 +489,24 @@ #endif
 
 void qdisc_destroy(struct Qdisc *qdisc)
 {
-	struct list_head cql = LIST_HEAD_INIT(cql);
-	struct Qdisc *cq, *q, *n;
+	struct Qdisc_ops  *ops = qdisc->ops;
 
 	if (qdisc->flags & TCQ_F_BUILTIN ||
-		!atomic_dec_and_test(&qdisc->refcnt))
+	    !atomic_dec_and_test(&qdisc->refcnt))
 		return;
+	qdisc->flags |= TCQ_F_DYING;
 
-	if (!list_empty(&qdisc->list)) {
-		if (qdisc->ops->cl_ops == NULL)
-			list_del(&qdisc->list);
-		else
-			list_move(&qdisc->list, &cql);
-	}
-
-	/* unlink inner qdiscs from dev->qdisc_list immediately */
-	list_for_each_entry(cq, &cql, list)
-		list_for_each_entry_safe(q, n, &qdisc->dev->qdisc_list, list)
-			if (TC_H_MAJ(q->parent) == TC_H_MAJ(cq->handle)) {
-				if (q->ops->cl_ops == NULL)
-					list_del_init(&q->list);
-				else
-					list_move_tail(&q->list, &cql);
-			}
-	list_for_each_entry_safe(cq, n, &cql, list)
-		list_del_init(&cq->list);
+	list_del(&qdisc->list);
+#ifdef CONFIG_NET_ESTIMATOR
+	gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
+#endif
+	if (ops->reset)
+		ops->reset(qdisc);
+	if (ops->destroy)
+		ops->destroy(qdisc);
 
+	module_put(ops->owner);
+	dev_put(qdisc->dev);
 	call_rcu(&qdisc->q_rcu, __qdisc_destroy);
 }
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-26 16:15       ` Patrick McHardy
@ 2006-09-26 21:20         ` Dave Jones
  2006-09-27  8:54           ` Jarek Poplawski
                             ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Dave Jones @ 2006-09-26 21:20 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: hadi, Jarek Poplawski, netdev, davem

On Tue, Sep 26, 2006 at 06:15:21PM +0200, Patrick McHardy wrote:
 > Patrick McHardy wrote:
 > > jamal wrote:
 > > 
 > >>Yes, that looks plausible. Can you try making those changes and see if
 > >>the warning is gone?
 > > 
 > >
 > > I think this points to a bigger brokeness caused by the move of
 > > dev->qdisc to RCU. It means destruction of filters and actions doesn't
 > > necessarily happens in user-context and thus not protected by the rtnl
 > > anymore.
 > 
 > I looked into this and we indeed still have lots of problems from that
 > broken RCU patch. Basically all locking (qdiscs, classifiers, actions,
 > estimators) assumes that updates are only done in process context and
 > thus read_lock doesn't need bottem half protection. Quite a few things
 > also assume that updates only happen under the RTNL and don't need
 > any further protection if not used during packet processing.
 > 
 > Instead of "fixing" all this I suggest something like this (untested)
 > patch instead. Since only the dev->qdisc pointer is protected by RCU,
 > but enqueue and the qdisc tree are still protected by dev->qdisc_lock,
 > we can perform destruction of the tree immediately and only do the
 > final free in the rcu callback, as long as we make sure not to enqueue
 > anything to a half-way destroyed qdisc.

With this patch, I get no lockdep warnings, but the machine locks up completely.
I hooked up a serial console, and found this..


u32 classifier
    Performance counters on
    input device check on 
    Actions configured 
BUG: warning at net/sched/sch_htb.c:395/htb_safe_rb_erase()

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff8860a171>] :sch_htb:htb_safe_rb_erase+0x3b/0x55
 [<ffffffff8860a4d5>] :sch_htb:htb_deactivate_prios+0x173/0x1cd
 [<ffffffff8860b437>] :sch_htb:htb_dequeue+0x4d0/0x856
 [<ffffffff8042dc0d>] __qdisc_run+0x3f/0x1ca
 [<ffffffff802329a6>] dev_queue_xmit+0x137/0x268
 [<ffffffff8025b4a2>] neigh_resolve_output+0x249/0x27e
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8044ffd7>] icmp_push_reply+0x13b/0x148
 [<ffffffff80450900>] icmp_send+0x366/0x3d3
 [<ffffffff802568a9>] udp_rcv+0x53d/0x556
 [<ffffffff80237e73>] ip_local_deliver+0x1a3/0x26b
 [<ffffffff80238ec8>] ip_rcv+0x4b9/0x501
 [<ffffffff802218bb>] netif_receive_skb+0x33d/0x3c9
 [<ffffffff881f6348>] :e1000:e1000_clean_rx_irq+0x450/0x4fe
 [<ffffffff881f47eb>] :e1000:e1000_clean+0x88/0x17d
 [<ffffffff8020cab3>] net_rx_action+0xac/0x1d1
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
DWARF2 unwinder stuck at call_softirq+0x1c/0x28
Leftover inexact backtrace:
 <IRQ> [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff80270c0d>] do_IRQ+0xfd/0x107
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff802618c6>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8026e519>] rest_init+0x2b/0x2d
 [<ffffffff80a7f811>] start_kernel+0x24a/0x24c
 [<ffffffff80a7f28b>] _sinittext+0x28b/0x292

BUG: warning at net/sched/sch_htb.c:395/htb_safe_rb_erase()

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff8860a171>] :sch_htb:htb_safe_rb_erase+0x3b/0x55
 [<ffffffff8860a4d5>] :sch_htb:htb_deactivate_prios+0x173/0x1cd
 [<ffffffff8860b437>] :sch_htb:htb_dequeue+0x4d0/0x856
 [<ffffffff8042dc0d>] __qdisc_run+0x3f/0x1ca
 [<ffffffff802329a6>] dev_queue_xmit+0x137/0x268
 [<ffffffff8025b4a2>] neigh_resolve_output+0x249/0x27e
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8044ffd7>] icmp_push_reply+0x13b/0x148
 [<ffffffff80450900>] icmp_send+0x366/0x3d3
 [<ffffffff802568a9>] udp_rcv+0x53d/0x556
 [<ffffffff80237e73>] ip_local_deliver+0x1a3/0x26b
 [<ffffffff80238ec8>] ip_rcv+0x4b9/0x501
 [<ffffffff802218bb>] netif_receive_skb+0x33d/0x3c9
 [<ffffffff881f6348>] :e1000:e1000_clean_rx_irq+0x450/0x4fe
 [<ffffffff881f47eb>] :e1000:e1000_clean+0x88/0x17d
 [<ffffffff8020cab3>] net_rx_action+0xac/0x1d1
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
DWARF2 unwinder stuck at call_softirq+0x1c/0x28
Leftover inexact backtrace:
 <IRQ> [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff80270c0d>] do_IRQ+0xfd/0x107
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff802618c6>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8026e519>] rest_init+0x2b/0x2d
 [<ffffffff80a7f811>] start_kernel+0x24a/0x24c
 [<ffffffff80a7f28b>] _sinittext+0x28b/0x292

BUG: soft lockup detected on CPU#0!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff802277bc>] __rb_rotate_left+0x5f/0x60
 [<ffffffff802135fb>] rb_insert_color+0xb9/0xe3
 [<ffffffff8860aa1a>] :sch_htb:htb_add_to_id_tree+0x64/0x66
 [<ffffffff8860aaf2>] :sch_htb:htb_activate_prios+0xd6/0xe7
 [<ffffffff8860c835>] :sch_htb:htb_enqueue+0x139/0x194
 [<ffffffff80232988>] dev_queue_xmit+0x119/0x268
 [<ffffffff8025b4a2>] neigh_resolve_output+0x249/0x27e
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8044ffd7>] icmp_push_reply+0x13b/0x148
 [<ffffffff80450900>] icmp_send+0x366/0x3d3
 [<ffffffff802568a9>] udp_rcv+0x53d/0x556
 [<ffffffff80237e73>] ip_local_deliver+0x1a3/0x26b
 [<ffffffff80238ec8>] ip_rcv+0x4b9/0x501
 [<ffffffff802218bb>] netif_receive_skb+0x33d/0x3c9
 [<ffffffff881f6348>] :e1000:e1000_clean_rx_irq+0x450/0x4fe
 [<ffffffff881f47eb>] :e1000:e1000_clean+0x88/0x17d
 [<ffffffff8020cab3>] net_rx_action+0xac/0x1d1
 [<ffffffff80212711>] __do_softirq+0x54/0xf5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff80270c0d>] do_IRQ+0xfd/0x107
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff802618c6>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8026e519>] rest_init+0x2b/0x2d
 [<ffffffff80a7f811>] start_kernel+0x24a/0x24c
 [<ffffffff80a7f28b>] _sinittext+0x28b/0x292

BUG: soft lockup detected on CPU#3!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8020cc4c>] __delay+0xa/0x15
 [<ffffffff802079c2>] _raw_spin_lock+0x8c/0x104
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802687b9>] _spin_lock_bh+0x32/0x36
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802998c6>] run_timer_softirq+0x14c/0x1d5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff8027aeac>] smp_apic_timer_interrupt+0x5d/0x62
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8027a4bd>] start_secondary+0x468/0x477

BUG: soft lockup detected on CPU#0!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff8022779a>] __rb_rotate_left+0x3d/0x60
 [<ffffffff802135fb>] rb_insert_color+0xb9/0xe3
 [<ffffffff8860aa1a>] :sch_htb:htb_add_to_id_tree+0x64/0x66
 [<ffffffff8860aaf2>] :sch_htb:htb_activate_prios+0xd6/0xe7
 [<ffffffff8860c835>] :sch_htb:htb_enqueue+0x139/0x194
 [<ffffffff80232988>] dev_queue_xmit+0x119/0x268
 [<ffffffff8025b4a2>] neigh_resolve_output+0x249/0x27e
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8044ffd7>] icmp_push_reply+0x13b/0x148
 [<ffffffff80450900>] icmp_send+0x366/0x3d3
 [<ffffffff802568a9>] udp_rcv+0x53d/0x556
 [<ffffffff80237e73>] ip_local_deliver+0x1a3/0x26b
 [<ffffffff80238ec8>] ip_rcv+0x4b9/0x501
 [<ffffffff802218bb>] netif_receive_skb+0x33d/0x3c9
 [<ffffffff881f6348>] :e1000:e1000_clean_rx_irq+0x450/0x4fe
 [<ffffffff881f47eb>] :e1000:e1000_clean+0x88/0x17d
 [<ffffffff8020cab3>] net_rx_action+0xac/0x1d1
 [<ffffffff80212711>] __do_softirq+0x54/0xf5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff80270c0d>] do_IRQ+0xfd/0x107
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff802618c6>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8026e519>] rest_init+0x2b/0x2d
 [<ffffffff80a7f811>] start_kernel+0x24a/0x24c
 [<ffffffff80a7f28b>] _sinittext+0x28b/0x292

BUG: soft lockup detected on CPU#3!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8020cc4c>] __delay+0xa/0x15
 [<ffffffff802079c2>] _raw_spin_lock+0x8c/0x104
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802687b9>] _spin_lock_bh+0x32/0x36
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802998c6>] run_timer_softirq+0x14c/0x1d5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff8027aeac>] smp_apic_timer_interrupt+0x5d/0x62
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8027a4bd>] start_secondary+0x468/0x477

BUG: soft lockup detected on CPU#0!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff802241c6>] __rb_rotate_right+0x41/0x60
 [<ffffffff802135aa>] rb_insert_color+0x68/0xe3
 [<ffffffff8860aa1a>] :sch_htb:htb_add_to_id_tree+0x64/0x66
 [<ffffffff8860aaf2>] :sch_htb:htb_activate_prios+0xd6/0xe7
 [<ffffffff8860c835>] :sch_htb:htb_enqueue+0x139/0x194
 [<ffffffff80232988>] dev_queue_xmit+0x119/0x268
 [<ffffffff8025b4a2>] neigh_resolve_output+0x249/0x27e
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8044ffd7>] icmp_push_reply+0x13b/0x148
 [<ffffffff80450900>] icmp_send+0x366/0x3d3
 [<ffffffff802568a9>] udp_rcv+0x53d/0x556
 [<ffffffff80237e73>] ip_local_deliver+0x1a3/0x26b
 [<ffffffff80238ec8>] ip_rcv+0x4b9/0x501
 [<ffffffff802218bb>] netif_receive_skb+0x33d/0x3c9
 [<ffffffff881f6348>] :e1000:e1000_clean_rx_irq+0x450/0x4fe
 [<ffffffff881f47eb>] :e1000:e1000_clean+0x88/0x17d
 [<ffffffff8020cab3>] net_rx_action+0xac/0x1d1
 [<ffffffff80212711>] __do_softirq+0x54/0xf5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff80270c0d>] do_IRQ+0xfd/0x107
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff802618c6>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8026e519>] rest_init+0x2b/0x2d
 [<ffffffff80a7f811>] start_kernel+0x24a/0x24c
 [<ffffffff80a7f28b>] _sinittext+0x28b/0x292

BUG: soft lockup detected on CPU#3!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8020cc48>] __delay+0x6/0x15
 [<ffffffff802079c2>] _raw_spin_lock+0x8c/0x104
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802687b9>] _spin_lock_bh+0x32/0x36
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802998c6>] run_timer_softirq+0x14c/0x1d5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff8027aeac>] smp_apic_timer_interrupt+0x5d/0x62
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8027a4bd>] start_secondary+0x468/0x477

BUG: soft lockup detected on CPU#0!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff802135aa>] rb_insert_color+0x68/0xe3
 [<ffffffff802135aa>] rb_insert_color+0x68/0xe3
 [<ffffffff8860aa1a>] :sch_htb:htb_add_to_id_tree+0x64/0x66
 [<ffffffff8860aaf2>] :sch_htb:htb_activate_prios+0xd6/0xe7
 [<ffffffff8860c835>] :sch_htb:htb_enqueue+0x139/0x194
 [<ffffffff80232988>] dev_queue_xmit+0x119/0x268
 [<ffffffff8025b4a2>] neigh_resolve_output+0x249/0x27e
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8044ffd7>] icmp_push_reply+0x13b/0x148
 [<ffffffff80450900>] icmp_send+0x366/0x3d3
 [<ffffffff802568a9>] udp_rcv+0x53d/0x556
 [<ffffffff80237e73>] ip_local_deliver+0x1a3/0x26b
 [<ffffffff80238ec8>] ip_rcv+0x4b9/0x501
 [<ffffffff802218bb>] netif_receive_skb+0x33d/0x3c9
 [<ffffffff881f6348>] :e1000:e1000_clean_rx_irq+0x450/0x4fe
 [<ffffffff881f47eb>] :e1000:e1000_clean+0x88/0x17d
 [<ffffffff8020cab3>] net_rx_action+0xac/0x1d1
 [<ffffffff80212711>] __do_softirq+0x54/0xf5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff80270c0d>] do_IRQ+0xfd/0x107
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff802618c6>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8026e519>] rest_init+0x2b/0x2d
 [<ffffffff80a7f811>] start_kernel+0x24a/0x24c
 [<ffffffff80a7f28b>] _sinittext+0x28b/0x292

BUG: soft lockup detected on CPU#3!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8020cc48>] __delay+0x6/0x15
 [<ffffffff802079c2>] _raw_spin_lock+0x8c/0x104
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802687b9>] _spin_lock_bh+0x32/0x36
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802998c6>] run_timer_softirq+0x14c/0x1d5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff8027aeac>] smp_apic_timer_interrupt+0x5d/0x62
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8027a4bd>] start_secondary+0x468/0x477

BUG: soft lockup detected on CPU#0!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80213605>] rb_insert_color+0xc3/0xe3
 [<ffffffff802135aa>] rb_insert_color+0x68/0xe3
 [<ffffffff8860aa1a>] :sch_htb:htb_add_to_id_tree+0x64/0x66
 [<ffffffff8860aaf2>] :sch_htb:htb_activate_prios+0xd6/0xe7
 [<ffffffff8860c835>] :sch_htb:htb_enqueue+0x139/0x194
 [<ffffffff80232988>] dev_queue_xmit+0x119/0x268
 [<ffffffff8025b4a2>] neigh_resolve_output+0x249/0x27e
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8044ffd7>] icmp_push_reply+0x13b/0x148
 [<ffffffff80450900>] icmp_send+0x366/0x3d3
 [<ffffffff802568a9>] udp_rcv+0x53d/0x556
 [<ffffffff80237e73>] ip_local_deliver+0x1a3/0x26b
 [<ffffffff80238ec8>] ip_rcv+0x4b9/0x501
 [<ffffffff802218bb>] netif_receive_skb+0x33d/0x3c9
 [<ffffffff881f6348>] :e1000:e1000_clean_rx_irq+0x450/0x4fe
 [<ffffffff881f47eb>] :e1000:e1000_clean+0x88/0x17d
 [<ffffffff8020cab3>] net_rx_action+0xac/0x1d1
 [<ffffffff80212711>] __do_softirq+0x54/0xf5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff80270c0d>] do_IRQ+0xfd/0x107
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff802618c6>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8026e519>] rest_init+0x2b/0x2d
 [<ffffffff80a7f811>] start_kernel+0x24a/0x24c
 [<ffffffff80a7f28b>] _sinittext+0x28b/0x292

BUG: soft lockup detected on CPU#3!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8020cc4c>] __delay+0xa/0x15
 [<ffffffff802079c2>] _raw_spin_lock+0x8c/0x104
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802687b9>] _spin_lock_bh+0x32/0x36
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802998c6>] run_timer_softirq+0x14c/0x1d5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff8027aeac>] smp_apic_timer_interrupt+0x5d/0x62
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8027a4bd>] start_secondary+0x468/0x477

BUG: soft lockup detected on CPU#1!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> <EOI> [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff8020cc4e>] __delay+0xc/0x15
 [<ffffffff802079c2>] _raw_spin_lock+0x8c/0x104
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80268783>] _spin_lock+0x2d/0x31
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8043dc27>] ip_generic_getfrag+0x0/0x98
 [<ffffffff8044d712>] udp_push_pending_frames+0x251/0x278
 [<ffffffff80256249>] udp_sendmsg+0x4b8/0x5db
 [<ffffffff8024857b>] inet_sendmsg+0x46/0x53
 [<ffffffff80258da5>] sock_sendmsg+0x111/0x130
 [<ffffffff802a2c18>] autoremove_wake_function+0x0/0x38
 [<ffffffff80268a62>] _spin_unlock_irq+0x2b/0x31
 [<ffffffff802681fe>] trace_hardirqs_on_thunk+0x35/0x37
 [<ffffffff802681fe>] trace_hardirqs_on_thunk+0x35/0x37
 [<ffffffff80416108>] sys_sendto+0x106/0x12d
 [<ffffffff80246770>] sys_rt_sigreturn+0x28f/0x360
 [<ffffffff8024680f>] sys_rt_sigreturn+0x32e/0x360
 [<ffffffff8026138e>] system_call+0x7e/0x83

BUG: soft lockup detected on CPU#0!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80232976>] dev_queue_xmit+0x107/0x268
 [<ffffffff80213596>] rb_insert_color+0x54/0xe3
 [<ffffffff802135fb>] rb_insert_color+0xb9/0xe3
 [<ffffffff8860aa1a>] :sch_htb:htb_add_to_id_tree+0x64/0x66
 [<ffffffff8860aaf2>] :sch_htb:htb_activate_prios+0xd6/0xe7
 [<ffffffff8860c835>] :sch_htb:htb_enqueue+0x139/0x194
 [<ffffffff80232988>] dev_queue_xmit+0x119/0x268
 [<ffffffff8025b4a2>] neigh_resolve_output+0x249/0x27e
 [<ffffffff802353fd>] ip_output+0x210/0x25a
 [<ffffffff8043ce28>] ip_push_pending_frames+0x37c/0x45b
 [<ffffffff8044ffd7>] icmp_push_reply+0x13b/0x148
 [<ffffffff80450900>] icmp_send+0x366/0x3d3
 [<ffffffff802568a9>] udp_rcv+0x53d/0x556
 [<ffffffff80237e73>] ip_local_deliver+0x1a3/0x26b
 [<ffffffff80238ec8>] ip_rcv+0x4b9/0x501
 [<ffffffff802218bb>] netif_receive_skb+0x33d/0x3c9
 [<ffffffff881f6348>] :e1000:e1000_clean_rx_irq+0x450/0x4fe
 [<ffffffff881f47eb>] :e1000:e1000_clean+0x88/0x17d
 [<ffffffff8020cab3>] net_rx_action+0xac/0x1d1
 [<ffffffff80212711>] __do_softirq+0x54/0xf5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff80270c0d>] do_IRQ+0xfd/0x107
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<ffffffff802618c6>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80265e66>] __sched_text_start+0xaa6/0xadd
 [<ffffffff8025b55c>] mwait_idle+0x3f/0x54
 [<ffffffff8025b526>] mwait_idle+0x9/0x54
 [<ffffffff8024c81c>] cpu_idle+0xa2/0xc5
 [<ffffffff8026e519>] rest_init+0x2b/0x2d
 [<ffffffff80a7f811>] start_kernel+0x24a/0x24c
 [<ffffffff80a7f28b>] _sinittext+0x28b/0x292

BUG: soft lockup detected on CPU#3!

Call Trace:
 [<ffffffff8026f79b>] show_trace+0xae/0x336
 [<ffffffff8026fa38>] dump_stack+0x15/0x17
 [<ffffffff802bfea7>] softlockup_tick+0xd5/0xea
 [<ffffffff80250dba>] run_local_timers+0x13/0x15
 [<ffffffff8029a0f9>] update_process_times+0x4c/0x79
 [<ffffffff8027a6ca>] smp_local_timer_interrupt+0x2b/0x50
 [<ffffffff8027aea7>] smp_apic_timer_interrupt+0x58/0x62
 [<ffffffff80261faf>] apic_timer_interrupt+0x6b/0x70
DWARF2 unwinder stuck at apic_timer_interrupt+0x6b/0x70
Leftover inexact backtrace:
 <IRQ> [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff8020cc4c>] __delay+0xa/0x15
 [<ffffffff802079c2>] _raw_spin_lock+0x8c/0x104
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802687b9>] _spin_lock_bh+0x32/0x36
 [<ffffffff8860c645>] :sch_htb:htb_rate_timer+0x21/0xd8
 [<ffffffff802998c6>] run_timer_softirq+0x14c/0x1d5
 [<ffffffff80212725>] __do_softirq+0x68/0xf5
 [<ffffffff80262638>] call_softirq+0x1c/0x28
 [<ffffffff80270aaa>] do_softirq+0x39/0x9f
 [<ffffffff80296102>] irq_exit+0x57/0x59
 [<ffffffff8027aeac>] smp_apic_timer_interrupt+0x5d/0x62
 [<ffffffff8025b51d>] mwait_idle+0x0/0x54
 [<fff


-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-26 21:20         ` Dave Jones
@ 2006-09-27  8:54           ` Jarek Poplawski
  2006-09-27  9:57             ` Patrick McHardy
  2006-09-28 12:17             ` Patrick McHardy
  2006-09-27 10:14           ` Patrick McHardy
  2006-09-27 12:07           ` Patrick McHardy
  2 siblings, 2 replies; 21+ messages in thread
From: Jarek Poplawski @ 2006-09-27  8:54 UTC (permalink / raw)
  To: Dave Jones; +Cc: Patrick McHardy, hadi, netdev, davem

On Tue, Sep 26, 2006 at 05:20:34PM -0400, Dave Jones wrote:
> On Tue, Sep 26, 2006 at 06:15:21PM +0200, Patrick McHardy wrote:
>  > Patrick McHardy wrote:
>  > > jamal wrote:
>  > > 
>  > >>Yes, that looks plausible. Can you try making those changes and see if
>  > >>the warning is gone?
>  > > 
>  > >
>  > > I think this points to a bigger brokeness caused by the move of
>  > > dev->qdisc to RCU. It means destruction of filters and actions doesn't
>  > > necessarily happens in user-context and thus not protected by the rtnl
>  > > anymore.
>  > 
>  > I looked into this and we indeed still have lots of problems from that
>  > broken RCU patch. Basically all locking (qdiscs, classifiers, actions,
>  > estimators) assumes that updates are only done in process context and
>  > thus read_lock doesn't need bottem half protection. Quite a few things
>  > also assume that updates only happen under the RTNL and don't need
>  > any further protection if not used during packet processing.
>  > 
>  > Instead of "fixing" all this I suggest something like this (untested)
>  > patch instead. Since only the dev->qdisc pointer is protected by RCU,
>  > but enqueue and the qdisc tree are still protected by dev->qdisc_lock,
>  > we can perform destruction of the tree immediately and only do the
>  > final free in the rcu callback, as long as we make sure not to enqueue
>  > anything to a half-way destroyed qdisc.
> 
> With this patch, I get no lockdep warnings, but the machine locks up completely.
> I hooked up a serial console, and found this..
...

Sorry for my not humble and simplistic opinion, but I'd dare
to remind you are changing "stable" version and even without
this lockups this patch would look very "serious". Why don't
try to restore not-rcu version of qdisc_destroy which looks
not lot to do.

Jarek P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-27  8:54           ` Jarek Poplawski
@ 2006-09-27  9:57             ` Patrick McHardy
  2006-09-28 12:17             ` Patrick McHardy
  1 sibling, 0 replies; 21+ messages in thread
From: Patrick McHardy @ 2006-09-27  9:57 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Dave Jones, hadi, netdev, davem

Jarek Poplawski wrote:
> Sorry for my not humble and simplistic opinion, but I'd dare
> to remind you are changing "stable" version and even without
> this lockups this patch would look very "serious". Why don't
> try to restore not-rcu version of qdisc_destroy which looks
> not lot to do.

I'm trying to restore the original rules, while keeping the
loopback optimization. That should be safe once I figure out
whats causing the crash.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-26 21:20         ` Dave Jones
  2006-09-27  8:54           ` Jarek Poplawski
@ 2006-09-27 10:14           ` Patrick McHardy
  2006-09-27 14:41             ` Ismail Donmez
  2006-09-27 12:07           ` Patrick McHardy
  2 siblings, 1 reply; 21+ messages in thread
From: Patrick McHardy @ 2006-09-27 10:14 UTC (permalink / raw)
  To: Dave Jones; +Cc: hadi, Jarek Poplawski, netdev, davem

Dave Jones wrote:
> With this patch, I get no lockdep warnings, but the machine locks up completely.
> I hooked up a serial console, and found this..
> 
> 
> u32 classifier
>     Performance counters on
>     input device check on 
>     Actions configured 
> BUG: warning at net/sched/sch_htb.c:395/htb_safe_rb_erase()

I can't reproduce this, could you send me your tc script please?
Did you do anything to trigger this?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-26 21:20         ` Dave Jones
  2006-09-27  8:54           ` Jarek Poplawski
  2006-09-27 10:14           ` Patrick McHardy
@ 2006-09-27 12:07           ` Patrick McHardy
  2006-09-27 17:26             ` Ismail Donmez
                               ` (3 more replies)
  2 siblings, 4 replies; 21+ messages in thread
From: Patrick McHardy @ 2006-09-27 12:07 UTC (permalink / raw)
  To: Dave Jones; +Cc: hadi, Jarek Poplawski, netdev, davem

[-- Attachment #1: Type: text/plain, Size: 655 bytes --]

Dave Jones wrote:
> With this patch, I get no lockdep warnings, but the machine locks up completely.
> I hooked up a serial console, and found this..
> 
> 
> u32 classifier
>     Performance counters on
>     input device check on 
>     Actions configured 
> BUG: warning at net/sched/sch_htb.c:395/htb_safe_rb_erase()
> 
> Call Trace:
>  [<ffffffff8026f79b>] show_trace+0xae/0x336
>  [<ffffffff8026fa38>] dump_stack+0x15/0x17
>  [<ffffffff8860a171>] :sch_htb:htb_safe_rb_erase+0x3b/0x55

I found the reason for this, it was an unrelated bug. I've attached
the latest version of the locking fixes and the fix for the HTB bug.
Can you please try again?



[-- Attachment #2: 01.diff --]
[-- Type: text/plain, Size: 1252 bytes --]

[NET_SCHED]: HTB: fix incorrect use of RB_EMPTY_NODE

Fix incorrect use of RB_EMPTY_NODE in htb_safe_rb_erase, which makes it
skip nodes within the rbtree instead of nodes not in the tree, resulting
in crashes later on.

The root cause for this seems to be the very counter-intuitive behaviour
of the RB_EMPTY_NODE macro, which returns _false_ when the node is empty.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 9a0cd6d60280d88c38791844c87548d45cf6f2c2
tree fdf4f4a46fb088d957322006828af557b8ce594a
parent 7e4720201ad44ace85a443f41d668a62a737e7d0
author Patrick McHardy <kaber@trash.net> Wed, 27 Sep 2006 13:27:24 +0200
committer Patrick McHardy <kaber@trash.net> Wed, 27 Sep 2006 13:27:24 +0200

 net/sched/sch_htb.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index bb3ddd4..6c058e3 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -391,7 +391,7 @@ static inline void htb_add_class_to_row(
 /* If this triggers, it is a bug in this code, but it need not be fatal */
 static void htb_safe_rb_erase(struct rb_node *rb, struct rb_root *root)
 {
-	if (RB_EMPTY_NODE(rb)) {
+	if (!RB_EMPTY_NODE(rb)) {
 		WARN_ON(1);
 	} else {
 		rb_erase(rb, root);


[-- Attachment #3: 02.diff --]
[-- Type: text/plain, Size: 8540 bytes --]

[NET_SCHED]: Fix fallout from dev->qdisc RCU change

The move of qdisc destruction to a rcu callback broke locking in the
entire qdisc layer by invalidating previously valid assumptions about
the context in which changes to the qdisc tree occur.

The two assumptions were:

- since changes only happen in process context, read_lock doesn't need
  bottem half protection. Now invalid since destruction of inner qdiscs,
  classifiers, actions and estimators happens in the RCU callback unless
  they're manually deleted, resulting in dead-locks when read_lock in
  process context is interrupted by write_lock_bh in bottem half context.

- since changes only happen under the RTNL, no additional locking is
  necessary for data not used during packet processing (f.e. u32_list).
  Again, since destruction now happens in the RCU callback, this assumption
  is not valid anymore, causing races while using this data, which can
  result in corruption or use-after-free.

Instead of "fixing" this by disabling bottem halfs everywhere and adding
new locks/refcounting, this patch makes these assumptions valid again by
moving destruction back to process context. Since only the dev->qdisc
pointer is protected by RCU, but ->enqueue and the qdisc tree are still
protected by dev->qdisc_lock, destruction of the tree can be performed
immediately and only the final free needs to happen in the rcu callback
to make sure dev_queue_xmit doesn't access already freed memory.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit fe5b95bfcde98ca2e32b4274c93889cdd1fbc040
tree 951a0d83d91b15dbe55a41366e0fe01966fec7ed
parent 9a0cd6d60280d88c38791844c87548d45cf6f2c2
author Patrick McHardy <kaber@trash.net> Wed, 27 Sep 2006 13:57:18 +0200
committer Patrick McHardy <kaber@trash.net> Wed, 27 Sep 2006 13:57:18 +0200

 net/core/dev.c          |   14 ++++++----
 net/sched/cls_api.c     |    4 +--
 net/sched/sch_api.c     |   16 ++++++-----
 net/sched/sch_generic.c |   66 +++++++++++++++--------------------------------
 4 files changed, 39 insertions(+), 61 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 14de297..4d891be 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1480,14 +1480,16 @@ #endif
 	if (q->enqueue) {
 		/* Grab device queue */
 		spin_lock(&dev->queue_lock);
+		q = dev->qdisc;
+		if (q->enqueue) {
+			rc = q->enqueue(skb, q);
+			qdisc_run(dev);
+			spin_unlock(&dev->queue_lock);
 
-		rc = q->enqueue(skb, q);
-
-		qdisc_run(dev);
-
+			rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
+			goto out;
+		}
 		spin_unlock(&dev->queue_lock);
-		rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
-		goto out;
 	}
 
 	/* The device has no queue. Common case for software devices:
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 7e14f14..37a1840 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -401,7 +401,7 @@ static int tc_dump_tfilter(struct sk_buf
 	if ((dev = dev_get_by_index(tcm->tcm_ifindex)) == NULL)
 		return skb->len;
 
-	read_lock_bh(&qdisc_tree_lock);
+	read_lock(&qdisc_tree_lock);
 	if (!tcm->tcm_parent)
 		q = dev->qdisc_sleeping;
 	else
@@ -458,7 +458,7 @@ errout:
 	if (cl)
 		cops->put(q, cl);
 out:
-	read_unlock_bh(&qdisc_tree_lock);
+	read_unlock(&qdisc_tree_lock);
 	dev_put(dev);
 	return skb->len;
 }
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index a19eff1..0b64892 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -195,14 +195,14 @@ struct Qdisc *qdisc_lookup(struct net_de
 {
 	struct Qdisc *q;
 
-	read_lock_bh(&qdisc_tree_lock);
+	read_lock(&qdisc_tree_lock);
 	list_for_each_entry(q, &dev->qdisc_list, list) {
 		if (q->handle == handle) {
-			read_unlock_bh(&qdisc_tree_lock);
+			read_unlock(&qdisc_tree_lock);
 			return q;
 		}
 	}
-	read_unlock_bh(&qdisc_tree_lock);
+	read_unlock(&qdisc_tree_lock);
 	return NULL;
 }
 
@@ -837,7 +837,7 @@ static int tc_dump_qdisc(struct sk_buff 
 			continue;
 		if (idx > s_idx)
 			s_q_idx = 0;
-		read_lock_bh(&qdisc_tree_lock);
+		read_lock(&qdisc_tree_lock);
 		q_idx = 0;
 		list_for_each_entry(q, &dev->qdisc_list, list) {
 			if (q_idx < s_q_idx) {
@@ -846,12 +846,12 @@ static int tc_dump_qdisc(struct sk_buff 
 			}
 			if (tc_fill_qdisc(skb, q, q->parent, NETLINK_CB(cb->skb).pid,
 					  cb->nlh->nlmsg_seq, NLM_F_MULTI, RTM_NEWQDISC) <= 0) {
-				read_unlock_bh(&qdisc_tree_lock);
+				read_unlock(&qdisc_tree_lock);
 				goto done;
 			}
 			q_idx++;
 		}
-		read_unlock_bh(&qdisc_tree_lock);
+		read_unlock(&qdisc_tree_lock);
 	}
 
 done:
@@ -1074,7 +1074,7 @@ static int tc_dump_tclass(struct sk_buff
 	s_t = cb->args[0];
 	t = 0;
 
-	read_lock_bh(&qdisc_tree_lock);
+	read_lock(&qdisc_tree_lock);
 	list_for_each_entry(q, &dev->qdisc_list, list) {
 		if (t < s_t || !q->ops->cl_ops ||
 		    (tcm->tcm_parent &&
@@ -1096,7 +1096,7 @@ static int tc_dump_tclass(struct sk_buff
 			break;
 		t++;
 	}
-	read_unlock_bh(&qdisc_tree_lock);
+	read_unlock(&qdisc_tree_lock);
 
 	cb->args[0] = t;
 
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 6f91518..88c6a99 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -45,11 +45,10 @@ #include <net/pkt_sched.h>
    The idea is the following:
    - enqueue, dequeue are serialized via top level device
      spinlock dev->queue_lock.
-   - tree walking is protected by read_lock_bh(qdisc_tree_lock)
+   - tree walking is protected by read_lock(qdisc_tree_lock)
      and this lock is used only in process context.
-   - updates to tree are made under rtnl semaphore or
-     from softirq context (__qdisc_destroy rcu-callback)
-     hence this lock needs local bh disabling.
+   - updates to tree are made only under rtnl semaphore,
+     hence this lock may be made without local bh disabling.
 
    qdisc_tree_lock must be grabbed BEFORE dev->queue_lock!
  */
@@ -57,14 +56,14 @@ DEFINE_RWLOCK(qdisc_tree_lock);
 
 void qdisc_lock_tree(struct net_device *dev)
 {
-	write_lock_bh(&qdisc_tree_lock);
+	write_lock(&qdisc_tree_lock);
 	spin_lock_bh(&dev->queue_lock);
 }
 
 void qdisc_unlock_tree(struct net_device *dev)
 {
 	spin_unlock_bh(&dev->queue_lock);
-	write_unlock_bh(&qdisc_tree_lock);
+	write_unlock(&qdisc_tree_lock);
 }
 
 /* 
@@ -483,20 +482,6 @@ void qdisc_reset(struct Qdisc *qdisc)
 static void __qdisc_destroy(struct rcu_head *head)
 {
 	struct Qdisc *qdisc = container_of(head, struct Qdisc, q_rcu);
-	struct Qdisc_ops  *ops = qdisc->ops;
-
-#ifdef CONFIG_NET_ESTIMATOR
-	gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
-#endif
-	write_lock(&qdisc_tree_lock);
-	if (ops->reset)
-		ops->reset(qdisc);
-	if (ops->destroy)
-		ops->destroy(qdisc);
-	write_unlock(&qdisc_tree_lock);
-	module_put(ops->owner);
-
-	dev_put(qdisc->dev);
 	kfree((char *) qdisc - qdisc->padded);
 }
 
@@ -504,32 +489,23 @@ #endif
 
 void qdisc_destroy(struct Qdisc *qdisc)
 {
-	struct list_head cql = LIST_HEAD_INIT(cql);
-	struct Qdisc *cq, *q, *n;
+	struct Qdisc_ops  *ops = qdisc->ops;
 
 	if (qdisc->flags & TCQ_F_BUILTIN ||
-		!atomic_dec_and_test(&qdisc->refcnt))
+	    !atomic_dec_and_test(&qdisc->refcnt))
 		return;
 
-	if (!list_empty(&qdisc->list)) {
-		if (qdisc->ops->cl_ops == NULL)
-			list_del(&qdisc->list);
-		else
-			list_move(&qdisc->list, &cql);
-	}
-
-	/* unlink inner qdiscs from dev->qdisc_list immediately */
-	list_for_each_entry(cq, &cql, list)
-		list_for_each_entry_safe(q, n, &qdisc->dev->qdisc_list, list)
-			if (TC_H_MAJ(q->parent) == TC_H_MAJ(cq->handle)) {
-				if (q->ops->cl_ops == NULL)
-					list_del_init(&q->list);
-				else
-					list_move_tail(&q->list, &cql);
-			}
-	list_for_each_entry_safe(cq, n, &cql, list)
-		list_del_init(&cq->list);
+	list_del(&qdisc->list);
+#ifdef CONFIG_NET_ESTIMATOR
+	gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
+#endif
+	if (ops->reset)
+		ops->reset(qdisc);
+	if (ops->destroy)
+		ops->destroy(qdisc);
 
+	module_put(ops->owner);
+	dev_put(qdisc->dev);
 	call_rcu(&qdisc->q_rcu, __qdisc_destroy);
 }
 
@@ -549,15 +525,15 @@ void dev_activate(struct net_device *dev
 				printk(KERN_INFO "%s: activation failed\n", dev->name);
 				return;
 			}
-			write_lock_bh(&qdisc_tree_lock);
+			write_lock(&qdisc_tree_lock);
 			list_add_tail(&qdisc->list, &dev->qdisc_list);
-			write_unlock_bh(&qdisc_tree_lock);
+			write_unlock(&qdisc_tree_lock);
 		} else {
 			qdisc =  &noqueue_qdisc;
 		}
-		write_lock_bh(&qdisc_tree_lock);
+		write_lock(&qdisc_tree_lock);
 		dev->qdisc_sleeping = qdisc;
-		write_unlock_bh(&qdisc_tree_lock);
+		write_unlock(&qdisc_tree_lock);
 	}
 
 	if (!netif_carrier_ok(dev))


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-27 10:14           ` Patrick McHardy
@ 2006-09-27 14:41             ` Ismail Donmez
  0 siblings, 0 replies; 21+ messages in thread
From: Ismail Donmez @ 2006-09-27 14:41 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Dave Jones, hadi, Jarek Poplawski, netdev, davem

27 Eyl 2006 Çar 13:14 tarihinde şunları yazmıştınız:
> Dave Jones wrote:
> > With this patch, I get no lockdep warnings, but the machine locks up
> > completely. I hooked up a serial console, and found this..
> >
> >
> > u32 classifier
> >     Performance counters on
> >     input device check on
> >     Actions configured
> > BUG: warning at net/sched/sch_htb.c:395/htb_safe_rb_erase()
>
> I can't reproduce this, could you send me your tc script please?
> Did you do anything to trigger this?

Using sabishape[1] with latest linux-2.6 git tree I can reproduce this. Just 
running sabishape -i eth1 -u 50 -d 240 freezes whole box.

[1] http://www.sabi.co.uk/Proj/Small/sabishape-2.1.tar.gz

Regards,
ismail

-- 
They that can give up essential liberty to obtain a little temporary safety 
deserve neither liberty nor safety.
-- Benjamin Franklin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-27 12:07           ` Patrick McHardy
@ 2006-09-27 17:26             ` Ismail Donmez
  2006-09-27 17:33             ` Dave Jones
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: Ismail Donmez @ 2006-09-27 17:26 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Dave Jones, hadi, Jarek Poplawski, netdev, davem

27 Eyl 2006 Çar 15:07 tarihinde, Patrick McHardy şunları yazmıştı: 
> Dave Jones wrote:
> > With this patch, I get no lockdep warnings, but the machine locks up
> > completely. I hooked up a serial console, and found this..
> >
> >
> > u32 classifier
> >     Performance counters on
> >     input device check on
> >     Actions configured
> > BUG: warning at net/sched/sch_htb.c:395/htb_safe_rb_erase()
> >
> > Call Trace:
> >  [<ffffffff8026f79b>] show_trace+0xae/0x336
> >  [<ffffffff8026fa38>] dump_stack+0x15/0x17
> >  [<ffffffff8860a171>] :sch_htb:htb_safe_rb_erase+0x3b/0x55
>
> I found the reason for this, it was an unrelated bug. I've attached
> the latest version of the locking fixes and the fix for the HTB bug.
> Can you please try again?

This patch (01.diff) fixed the hard lockup for me, thanks.

/ismail

-- 
They that can give up essential liberty to obtain a little temporary safety 
deserve neither liberty nor safety.
-- Benjamin Franklin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-27 12:07           ` Patrick McHardy
  2006-09-27 17:26             ` Ismail Donmez
@ 2006-09-27 17:33             ` Dave Jones
  2006-09-27 23:53             ` David Miller
  2006-09-28  8:17             ` Jarek Poplawski
  3 siblings, 0 replies; 21+ messages in thread
From: Dave Jones @ 2006-09-27 17:33 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: hadi, Jarek Poplawski, netdev, davem

On Wed, Sep 27, 2006 at 02:07:04PM +0200, Patrick McHardy wrote:
 > Dave Jones wrote:
 > > 
 > > Call Trace:
 > >  [<ffffffff8026f79b>] show_trace+0xae/0x336
 > >  [<ffffffff8026fa38>] dump_stack+0x15/0x17
 > >  [<ffffffff8860a171>] :sch_htb:htb_safe_rb_erase+0x3b/0x55
 > 
 > I found the reason for this, it was an unrelated bug. I've attached
 > the latest version of the locking fixes and the fix for the HTB bug.
 > Can you please try again?

Yep, that does the trick for me.

Good job.

	Dave

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-27 12:07           ` Patrick McHardy
  2006-09-27 17:26             ` Ismail Donmez
  2006-09-27 17:33             ` Dave Jones
@ 2006-09-27 23:53             ` David Miller
  2006-09-28  9:07               ` Jarek Poplawski
  2006-09-28  8:17             ` Jarek Poplawski
  3 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2006-09-27 23:53 UTC (permalink / raw)
  To: kaber; +Cc: davej, hadi, jarkao2, netdev

From: Patrick McHardy <kaber@trash.net>
Date: Wed, 27 Sep 2006 14:07:04 +0200

> Dave Jones wrote:
> > With this patch, I get no lockdep warnings, but the machine locks up completely.
> > I hooked up a serial console, and found this..
> > 
> > 
> > u32 classifier
> >     Performance counters on
> >     input device check on 
> >     Actions configured 
> > BUG: warning at net/sched/sch_htb.c:395/htb_safe_rb_erase()
> > 
> > Call Trace:
> >  [<ffffffff8026f79b>] show_trace+0xae/0x336
> >  [<ffffffff8026fa38>] dump_stack+0x15/0x17
> >  [<ffffffff8860a171>] :sch_htb:htb_safe_rb_erase+0x3b/0x55
> 
> I found the reason for this, it was an unrelated bug. I've attached
> the latest version of the locking fixes and the fix for the HTB bug.
> Can you please try again?

Although the HTB bug is post-2.6.18, the other issue has been
around for a long time.

Thus I'll need to submit the second patch to -stable, but I want
it to cook in Linus's tree for a few days before I do that.

Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-27 12:07           ` Patrick McHardy
                               ` (2 preceding siblings ...)
  2006-09-27 23:53             ` David Miller
@ 2006-09-28  8:17             ` Jarek Poplawski
  3 siblings, 0 replies; 21+ messages in thread
From: Jarek Poplawski @ 2006-09-28  8:17 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Dave Jones, hadi, netdev, davem

On Wed, Sep 27, 2006 at 02:07:04PM +0200, Patrick McHardy wrote:
> Dave Jones wrote:
> > With this patch, I get no lockdep warnings, but the machine locks up completely.
> > I hooked up a serial console, and found this..
> > 
> > 
> > u32 classifier
> >     Performance counters on
> >     input device check on 
> >     Actions configured 
> > BUG: warning at net/sched/sch_htb.c:395/htb_safe_rb_erase()
> > 
> > Call Trace:
> >  [<ffffffff8026f79b>] show_trace+0xae/0x336
> >  [<ffffffff8026fa38>] dump_stack+0x15/0x17
> >  [<ffffffff8860a171>] :sch_htb:htb_safe_rb_erase+0x3b/0x55
> 
> I found the reason for this, it was an unrelated bug. I've attached
> the latest version of the locking fixes and the fix for the HTB bug.

Congratulations! (But I think David Jones could have saved some brain
cycles applying fixes to the same version where the bug originated). 

...
> [NET_SCHED]: Fix fallout from dev->qdisc RCU change

Sorry again but I can't abstain from some doubts: 

...
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 14de297..4d891be 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1480,14 +1480,16 @@ #endif
>  	if (q->enqueue) {
>  		/* Grab device queue */
>  		spin_lock(&dev->queue_lock);
> +		q = dev->qdisc;

I don't get it. If it is some anti-race step according to
rcu rules it should be again:
q = rcu_dereference(dev->qdisc);

But I don't know which of the attached lockups would be
fixed by this. 
And by the way - a few lines above is:
rcu_read_lock_bh();
which according to the rules should be
rcu_read_lock();
(or call_rcu should be changed to call_rcu_bh).

> +		if (q->enqueue) {
> +			rc = q->enqueue(skb, q);
> +			qdisc_run(dev);
> +			spin_unlock(&dev->queue_lock);
>  
> -		rc = q->enqueue(skb, q);
> -
> -		qdisc_run(dev);
> -
> +			rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
> +			goto out;
> +		}
>  		spin_unlock(&dev->queue_lock);
> -		rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
> -		goto out;
>  	}

By the way: rcu_read_unlock could be done here instead
at the very end. 

> @@ -504,32 +489,23 @@ #endif
>  
>  void qdisc_destroy(struct Qdisc *qdisc)
>  {
> -	struct list_head cql = LIST_HEAD_INIT(cql);
> -	struct Qdisc *cq, *q, *n;
> +	struct Qdisc_ops  *ops = qdisc->ops;
>  
>  	if (qdisc->flags & TCQ_F_BUILTIN ||
> -		!atomic_dec_and_test(&qdisc->refcnt))
> +	    !atomic_dec_and_test(&qdisc->refcnt))
>  		return;
...
> +	list_del(&qdisc->list);
> +#ifdef CONFIG_NET_ESTIMATOR
> +	gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
> +#endif
> +	if (ops->reset)
> +		ops->reset(qdisc);
> +	if (ops->destroy)
> +		ops->destroy(qdisc);
>  
> +	module_put(ops->owner);
> +	dev_put(qdisc->dev);
>  	call_rcu(&qdisc->q_rcu, __qdisc_destroy);

This qdisc way of RCU looks very "special" to me.
Is this really doing anything here? There is no
pointers switching, everything is deleted in place, 
refcnt checked, no clean read_lock_rcu (without
spin_locks) anywhere - in my once more not very
humble opinion it is only very advanced method of
time wasting. 

Jarek P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-27 23:53             ` David Miller
@ 2006-09-28  9:07               ` Jarek Poplawski
  0 siblings, 0 replies; 21+ messages in thread
From: Jarek Poplawski @ 2006-09-28  9:07 UTC (permalink / raw)
  To: David Miller; +Cc: kaber, davej, hadi, netdev

On Wed, Sep 27, 2006 at 04:53:04PM -0700, David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Wed, 27 Sep 2006 14:07:04 +0200
...
> Although the HTB bug is post-2.6.18, the other issue has been
> around for a long time.
> 
> Thus I'll need to submit the second patch to -stable, but I want
> it to cook in Linus's tree for a few days before I do that.

I don't know which version of act_police.c is planned for
2.6.18 -stable but I think it would be nice to fix the
original bug of this thread in 2.6.17 etc.
So I remind the proposal of read_lock_bh in tcf_police_lookup. 

Jarek P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-27  8:54           ` Jarek Poplawski
  2006-09-27  9:57             ` Patrick McHardy
@ 2006-09-28 12:17             ` Patrick McHardy
  2006-09-28 13:13               ` Jarek Poplawski
  1 sibling, 1 reply; 21+ messages in thread
From: Patrick McHardy @ 2006-09-28 12:17 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Dave Jones, hadi, netdev, davem

[My mail provider is down, so responding "manually"]

Jarek Poplawski wrote:
> > [NET_SCHED]: Fix fallout from dev->qdisc RCU change
>
> Sorry again but I can't abstain from some doubts:
>
> ...
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 14de297..4d891be 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -1480,14 +1480,16 @@ #endif
> >  	if (q->enqueue) {
> >  		/* Grab device queue */
> >  		spin_lock(&dev->queue_lock);
> > +		q = dev->qdisc;
>
> I don't get it. If it is some anti-race step according to
> rcu rules it should be again:
> q = rcu_dereference(dev->qdisc);

At this point RCU protection is not needed anymore since we
have the lock. We simply want to avoid taking the lock for
devices that don't have a real qdisc attached (like loopback).
Thats what the test for q->enqueue is there for. RCU is only
needed to avoid races between testing q->enqueue and freeing
of the qdisc.

> But I don't know which of the attached lockups would be
> fixed by this.

None - the repeated check is needed because deconstruction
of the qdisc tree now happens immediately instead of in the
RCU callback, so stale data can't be tolerated.

> And by the way - a few lines above is:
> rcu_read_lock_bh();
> which according to the rules should be
> rcu_read_lock();
> (or call_rcu should be changed to call_rcu_bh).

Read up on how it got there or simply look at the comment directly
above it:

        /* Disable soft irqs for various locks below. Also
         * stops preemption for RCU.
         */

> > @@ -504,32 +489,23 @@ #endif
> >
> >  void qdisc_destroy(struct Qdisc *qdisc)
> >  {
> > -	struct list_head cql = LIST_HEAD_INIT(cql);
> > -	struct Qdisc *cq, *q, *n;
> > +	struct Qdisc_ops  *ops = qdisc->ops;
> >
> >  	if (qdisc->flags & TCQ_F_BUILTIN ||
> > -		!atomic_dec_and_test(&qdisc->refcnt))
> > +	    !atomic_dec_and_test(&qdisc->refcnt))
> >  		return;
> ...
> > +	list_del(&qdisc->list);
> > +#ifdef CONFIG_NET_ESTIMATOR
> > +	gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
> > +#endif
> > +	if (ops->reset)
> > +		ops->reset(qdisc);
> > +	if (ops->destroy)
> > +		ops->destroy(qdisc);
> >
> > +	module_put(ops->owner);
> > +	dev_put(qdisc->dev);
> >  	call_rcu(&qdisc->q_rcu, __qdisc_destroy);
>
>
> This qdisc way of RCU looks very "special" to me.
> Is this really doing anything here? There is no
> pointers switching, everything is deleted in place,
> refcnt checked, no clean read_lock_rcu (without
> spin_locks) anywhere - in my once more not very
> humble opinion it is only very advanced method of
> time wasting.

You don't seem to understand what is done here at all and
additionally haven't even read any of the comments explaining
what this is for. You are wasting time.

As I already explained, RCU is only needed for one thing,
and for nothing more: to make sure the q->enqueue check
in net/core/dev.c doesn't reference freed memory.
Therefore just the final free is done in the RCU callback.
We could also have used synchronize_net(), but it doesn't
really matter.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-28 12:17             ` Patrick McHardy
@ 2006-09-28 13:13               ` Jarek Poplawski
  2006-09-28 14:20                 ` Stephen Hemminger
  0 siblings, 1 reply; 21+ messages in thread
From: Jarek Poplawski @ 2006-09-28 13:13 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Dave Jones, hadi, netdev, davem

On Thu, Sep 28, 2006 at 02:17:51PM +0200, Patrick McHardy wrote:
> [My mail provider is down, so responding "manually"]
> 
> Jarek Poplawski wrote:
> > > [NET_SCHED]: Fix fallout from dev->qdisc RCU change
> >
> > Sorry again but I can't abstain from some doubts:
> >
> > ...
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 14de297..4d891be 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -1480,14 +1480,16 @@ #endif
> > >  	if (q->enqueue) {
> > >  		/* Grab device queue */
> > >  		spin_lock(&dev->queue_lock);
> > > +		q = dev->qdisc;
> >
> > I don't get it. If it is some anti-race step according to
> > rcu rules it should be again:
> > q = rcu_dereference(dev->qdisc);
> 
> At this point RCU protection is not needed anymore since we
> have the lock. We simply want to avoid taking the lock for
> devices that don't have a real qdisc attached (like loopback).
> Thats what the test for q->enqueue is there for. RCU is only
> needed to avoid races between testing q->enqueue and freeing
> of the qdisc.

But in Documentation/RCU/whatisRCU.txt I see this:

        /*
         * Return the value of field "a" of the current gbl_foo
         * structure.  Use rcu_read_lock() and rcu_read_unlock()
         * to ensure that the structure does not get deleted out
         * from under us, and use rcu_dereference() to ensure that
         * we see the initialized version of the structure (important
         * for DEC Alpha and for people reading the code).
         */
        int foo_get_a(void)
        {
                int retval;

                rcu_read_lock();
                retval = rcu_dereference(gbl_foo)->a;
                rcu_read_unlock();
                return retval;
        }

 
> > But I don't know which of the attached lockups would be
> > fixed by this.
> 
> None - the repeated check is needed because deconstruction
> of the qdisc tree now happens immediately instead of in the
> RCU callback, so stale data can't be tolerated.
> 
> > And by the way - a few lines above is:
> > rcu_read_lock_bh();
> > which according to the rules should be
> > rcu_read_lock();
> > (or call_rcu should be changed to call_rcu_bh).
> 
> Read up on how it got there or simply look at the comment directly
> above it:
> 
>         /* Disable soft irqs for various locks below. Also
>          * stops preemption for RCU.
>          */
> 

I've read it. And also this from include/linux/rcupdate.h:

/**
 * rcu_read_lock_bh - mark the beginning of a softirq-only RCU critical section
 *
 * This is equivalent of rcu_read_lock(), but to be used when updates
 * are being done using call_rcu_bh(). Since call_rcu_bh() callbacks
 * consider completion of a softirq handler to be a quiescent state,
 * a process in RCU read-side critical section must be protected by
 * disabling softirqs. Read-side critical sections in interrupt context
 * can use just rcu_read_lock().
 *
 */
#define rcu_read_lock_bh() \


> > > @@ -504,32 +489,23 @@ #endif
> > >
> > >  void qdisc_destroy(struct Qdisc *qdisc)
> > >  {
> > > -	struct list_head cql = LIST_HEAD_INIT(cql);
> > > -	struct Qdisc *cq, *q, *n;
> > > +	struct Qdisc_ops  *ops = qdisc->ops;
> > >
> > >  	if (qdisc->flags & TCQ_F_BUILTIN ||
> > > -		!atomic_dec_and_test(&qdisc->refcnt))
> > > +	    !atomic_dec_and_test(&qdisc->refcnt))
> > >  		return;
> > ...
> > > +	list_del(&qdisc->list);
> > > +#ifdef CONFIG_NET_ESTIMATOR
> > > +	gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
> > > +#endif
> > > +	if (ops->reset)
> > > +		ops->reset(qdisc);
> > > +	if (ops->destroy)
> > > +		ops->destroy(qdisc);
> > >
> > > +	module_put(ops->owner);
> > > +	dev_put(qdisc->dev);
> > >  	call_rcu(&qdisc->q_rcu, __qdisc_destroy);
> >
> >
> > This qdisc way of RCU looks very "special" to me.
> > Is this really doing anything here? There is no
> > pointers switching, everything is deleted in place,
> > refcnt checked, no clean read_lock_rcu (without
> > spin_locks) anywhere - in my once more not very
> > humble opinion it is only very advanced method of
> > time wasting.
> 
> You don't seem to understand what is done here at all and
> additionally haven't even read any of the comments explaining
> what this is for. You are wasting time.

I've only written my personal opinion and a word "special"
could suggest it is hard to comprehend for me. I didn't
intend to offend anybody so I'm very sorry.  
 
> As I already explained, RCU is only needed for one thing,
> and for nothing more: to make sure the q->enqueue check
> in net/core/dev.c doesn't reference freed memory.
> Therefore just the final free is done in the RCU callback.
> We could also have used synchronize_net(), but it doesn't
> really matter.

Probably my problem is I didn't see anything like that
in docs. Anyway thank you very much for explanation.

Jarek P.  


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-28 13:13               ` Jarek Poplawski
@ 2006-09-28 14:20                 ` Stephen Hemminger
  2006-09-29  6:28                   ` Jarek Poplawski
  0 siblings, 1 reply; 21+ messages in thread
From: Stephen Hemminger @ 2006-09-28 14:20 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Patrick McHardy, Dave Jones, hadi, netdev, davem

On Thu, 28 Sep 2006 15:13:01 +0200
Jarek Poplawski <jarkao2@o2.pl> wrote:

> On Thu, Sep 28, 2006 at 02:17:51PM +0200, Patrick McHardy wrote:
> > [My mail provider is down, so responding "manually"]
> > 
> > Jarek Poplawski wrote:
> > > > [NET_SCHED]: Fix fallout from dev->qdisc RCU change
> > >
> > > Sorry again but I can't abstain from some doubts:
> > >
> > > ...
> > > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > > index 14de297..4d891be 100644
> > > > --- a/net/core/dev.c
> > > > +++ b/net/core/dev.c
> > > > @@ -1480,14 +1480,16 @@ #endif
> > > >  	if (q->enqueue) {
> > > >  		/* Grab device queue */
> > > >  		spin_lock(&dev->queue_lock);
> > > > +		q = dev->qdisc;
> > >
> > > I don't get it. If it is some anti-race step according to
> > > rcu rules it should be again:
> > > q = rcu_dereference(dev->qdisc);
> > 
> > At this point RCU protection is not needed anymore since we
> > have the lock. We simply want to avoid taking the lock for
> > devices that don't have a real qdisc attached (like loopback).
> > Thats what the test for q->enqueue is there for. RCU is only
> > needed to avoid races between testing q->enqueue and freeing
> > of the qdisc.
> 
> But in Documentation/RCU/whatisRCU.txt I see this:
> 
>         /*
>          * Return the value of field "a" of the current gbl_foo
>          * structure.  Use rcu_read_lock() and rcu_read_unlock()
>          * to ensure that the structure does not get deleted out
>          * from under us, and use rcu_dereference() to ensure that
>          * we see the initialized version of the structure (important
>          * for DEC Alpha and for people reading the code).
>          */
>         int foo_get_a(void)
>         {
>                 int retval;
> 
>                 rcu_read_lock();
>                 retval = rcu_dereference(gbl_foo)->a;
>                 rcu_read_unlock();
>                 return retval;
>         }
> 

The example uses rcu_read_lock() which is a weaker form of protection
than a real lock, so the rcu_dereference() is needed to do memory barriers.

In the qdisc case we have the proper spin_lock() so no additional
barrier is needed.

-- 
Stephen Hemminger <shemminger@osdl.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: tc related lockdep warning.
  2006-09-28 14:20                 ` Stephen Hemminger
@ 2006-09-29  6:28                   ` Jarek Poplawski
  0 siblings, 0 replies; 21+ messages in thread
From: Jarek Poplawski @ 2006-09-29  6:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Patrick McHardy, Dave Jones, hadi, netdev, davem

On Thu, Sep 28, 2006 at 07:20:00AM -0700, Stephen Hemminger wrote:
> On Thu, 28 Sep 2006 15:13:01 +0200
> Jarek Poplawski <jarkao2@o2.pl> wrote:
> 
> > On Thu, Sep 28, 2006 at 02:17:51PM +0200, Patrick McHardy wrote:
> > > [My mail provider is down, so responding "manually"]
> > > 
> > > Jarek Poplawski wrote:
> > > > > [NET_SCHED]: Fix fallout from dev->qdisc RCU change
> > > >
> > > > Sorry again but I can't abstain from some doubts:
> > > >
> > > > ...
> > > > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > > > index 14de297..4d891be 100644
> > > > > --- a/net/core/dev.c
> > > > > +++ b/net/core/dev.c
> > > > > @@ -1480,14 +1480,16 @@ #endif
> > > > >  	if (q->enqueue) {
> > > > >  		/* Grab device queue */
> > > > >  		spin_lock(&dev->queue_lock);
> > > > > +		q = dev->qdisc;
> > > >
> > > > I don't get it. If it is some anti-race step according to
> > > > rcu rules it should be again:
> > > > q = rcu_dereference(dev->qdisc);
> > > 
> > > At this point RCU protection is not needed anymore since we
> > > have the lock. We simply want to avoid taking the lock for
> > > devices that don't have a real qdisc attached (like loopback).
> > > Thats what the test for q->enqueue is there for. RCU is only
> > > needed to avoid races between testing q->enqueue and freeing
> > > of the qdisc.
> > 
> > But in Documentation/RCU/whatisRCU.txt I see this:
> > 
> >         /*
> >          * Return the value of field "a" of the current gbl_foo
> >          * structure.  Use rcu_read_lock() and rcu_read_unlock()
> >          * to ensure that the structure does not get deleted out
> >          * from under us, and use rcu_dereference() to ensure that
> >          * we see the initialized version of the structure (important
> >          * for DEC Alpha and for people reading the code).
> >          */
> >         int foo_get_a(void)
> >         {
> >                 int retval;
> > 
> >                 rcu_read_lock();
> >                 retval = rcu_dereference(gbl_foo)->a;
> >                 rcu_read_unlock();
> >                 return retval;
> >         }
> > 
> 
> The example uses rcu_read_lock() which is a weaker form of protection
> than a real lock, so the rcu_dereference() is needed to do memory barriers.

I don't question this and I know it is not required - I mean
what is in the parenthesis ("important ...") or here: 

>From Documentation/RCU/checklist.txt:

                The rcu_dereference() primitive is used by the various
                "_rcu()" list-traversal primitives, such as the
                list_for_each_entry_rcu().  Note that it is perfectly
                legal (if redundant) for update-side code to use
                rcu_dereference() and the "_rcu()" list-traversal
                primitives.  This is particularly useful in code
                that is common to readers and updaters.


> In the qdisc case we have the proper spin_lock() so no additional
> barrier is needed.

So why isn't it done in the code instead of comments?: 

	if (q->enqueue) {

                rcu_read_unlock();

  		/* Grab device queue */
  		spin_lock(&dev->queue_lock);
		q = dev->qdisc;
	...
	}
        rcu_read_unlock();

I mean the consequence.

Another thing is why? Now it looks RCU is good for the first test
but we don't believe it entirely (or assume the pointer could change
at this particular moment) so take this pointer again with
a real lock. I think RCU assures the consistency within it's block 
and we should only test for NULL pointer.


Jarek P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2006-09-29  6:23 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-09-24 21:29 tc related lockdep warning Dave Jones
2006-09-25 12:43 ` Jarek Poplawski
2006-09-25 12:47   ` jamal
2006-09-25 13:05     ` Jarek Poplawski
2006-09-25 13:29     ` Patrick McHardy
2006-09-26 16:15       ` Patrick McHardy
2006-09-26 21:20         ` Dave Jones
2006-09-27  8:54           ` Jarek Poplawski
2006-09-27  9:57             ` Patrick McHardy
2006-09-28 12:17             ` Patrick McHardy
2006-09-28 13:13               ` Jarek Poplawski
2006-09-28 14:20                 ` Stephen Hemminger
2006-09-29  6:28                   ` Jarek Poplawski
2006-09-27 10:14           ` Patrick McHardy
2006-09-27 14:41             ` Ismail Donmez
2006-09-27 12:07           ` Patrick McHardy
2006-09-27 17:26             ` Ismail Donmez
2006-09-27 17:33             ` Dave Jones
2006-09-27 23:53             ` David Miller
2006-09-28  9:07               ` Jarek Poplawski
2006-09-28  8:17             ` Jarek Poplawski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.