linux-kernel.vger.kernel.org archive mirror
* [BUG] fib_tries related Oops in 2.6.30
@ 2009-06-10 16:05 Yan Zheng
  2009-06-11 14:39 ` Jarek Poplawski
  0 siblings, 1 reply; 10+ messages in thread
From: Yan Zheng @ 2009-06-10 16:05 UTC (permalink / raw)
  To: linux-kernel, netdev

Hello,

I pulled linux-2.6.30 from the linus-2.6 git tree and got the following
oops immediately after boot.

# uname -a
Linux zhyan-cn 2.6.30 #1 SMP PREEMPT Wed Jun 10 23:37:22 CST 2009 i686
i686 i386 GNU/Linux

---
BUG: sleeping function called from invalid context at
/mnt/sda7/linux-2.6/mm/slub.c:1598
in_atomic(): 1, irqs_disabled(): 0, pid: 2286, name: NetworkManager
Pid: 2286, comm: NetworkManager Not tainted 2.6.30 #1
Call Trace:
 [<c042568c>] __might_sleep+0xfc/0x103
 [<c049bb79>] __kmalloc+0x7c/0x132
 [<c0697be5>] ? tnode_new+0x27/0x69
 [<c0697be5>] tnode_new+0x27/0x69
 [<c0697d36>] resize+0x94/0x666
 [<c049ac13>] ? __slab_alloc+0xc2/0x4b3
 [<c0697be5>] ? tnode_new+0x27/0x69
 [<c049bc23>] ? __kmalloc+0x126/0x132
 [<c0698384>] trie_rebalance+0x7c/0xde
 [<c0698d77>] fn_trie_insert+0x644/0x6d6
 [<c0427da7>] ? try_to_wake_up+0x2c4/0x2ce
 [<c0427dbc>] ? default_wake_function+0xb/0xd
 [<c069402e>] fib_magic+0x99/0xa8
 [<c069412c>] fib_add_ifaddr+0xef/0x111
 [<c0694178>] fib_inetaddr_event+0x2a/0x1a9
 [<c06cd5a9>] notifier_call_chain+0x2b/0x4a
 [<c0444189>] __blocking_notifier_call_chain+0x37/0x4c
 [<c04441aa>] blocking_notifier_call_chain+0xc/0xe
 [<c068d8f4>] __inet_insert_ifa+0xf9/0x104
 [<c068eb28>] inet_rtm_newaddr+0x16f/0x177
 [<c068e9b9>] ? inet_rtm_newaddr+0x0/0x177
 [<c065218c>] rtnetlink_rcv_msg+0x196/0x1b0
 [<c0651ff6>] ? rtnetlink_rcv_msg+0x0/0x1b0
 [<c065fbac>] netlink_rcv_skb+0x30/0x78
 [<c0651fee>] rtnetlink_rcv+0x1c/0x24
 [<c065f78e>] netlink_unicast+0xee/0x144
 [<c065fa27>] netlink_sendmsg+0x243/0x250
 [<c063d1f4>] __sock_sendmsg+0x45/0x4e
 [<c063d978>] sock_sendmsg+0xb8/0xce
 [<c044072b>] ? autoremove_wake_function+0x0/0x33
 [<c044072b>] ? autoremove_wake_function+0x0/0x33
 [<c044072b>] ? autoremove_wake_function+0x0/0x33
 [<c0523aa7>] ? copy_from_user+0x34/0x11b
 [<c0644ba4>] ? verify_iovec+0x40/0x70
 [<c063dacd>] sys_sendmsg+0x13f/0x192
 [<c063e67a>] ? sys_recvmsg+0x16d/0x17a
 [<c041f6b4>] ? kunmap_atomic+0x8b/0xa3
 [<c0489a88>] ? do_wp_page+0x515/0x5ac
 [<c044072b>] ? autoremove_wake_function+0x0/0x33
 [<c041af58>] ? paravirt_get_lazy_mode+0xe/0x1b
 [<c041b128>] ? arch_flush_lazy_mmu_mode+0x47/0x5b
 [<c041f799>] ? kmap_atomic_prot+0xcd/0xeb
 [<c048af8c>] ? handle_mm_fault+0x58d/0x5e8
 [<c063eb09>] sys_socketcall+0x153/0x183
 [<c0403298>] sysenter_do_call+0x12/0x2d

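The trace above reaches __kmalloc() through tnode_new() and resize()
from trie_rebalance(), which in 2.6.30 brackets its work with
preempt_disable()/preempt_enable() (visible in the patches later in
this thread). What follows is a minimal userspace sketch of why that
combination trips the check, assuming a toy preempt counter. Every
toy_* name is hypothetical and only loosely models the kernel's
preempt count and __might_sleep(); it is not the kernel code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for the per-task preempt counter (hypothetical). */
static int toy_preempt_count;
/* Number of "sleeping function called from invalid context" reports. */
static int toy_warnings;

static void toy_preempt_disable(void) { toy_preempt_count++; }
static void toy_preempt_enable(void)  { toy_preempt_count--; }
static bool toy_in_atomic(void)       { return toy_preempt_count != 0; }

/* Loose analogue of __might_sleep(): report when called while atomic. */
static void toy_might_sleep(void)
{
	if (toy_in_atomic())
		toy_warnings++;
}

/* Loose analogue of a GFP_KERNEL allocation: it may sleep, so it checks. */
static void *toy_alloc_may_sleep(size_t size)
{
	toy_might_sleep();
	return calloc(1, size);
}

/* Shape of the reported path: allocate inside a preempt-disabled region. */
static int toy_rebalance_atomic(void)
{
	toy_warnings = 0;
	toy_preempt_disable();
	free(toy_alloc_may_sleep(64));
	toy_preempt_enable();
	return toy_warnings;
}

/* Same allocation with preemption left enabled: nothing is reported. */
static int toy_rebalance_preemptible(void)
{
	toy_warnings = 0;
	free(toy_alloc_may_sleep(64));
	return toy_warnings;
}
```

In this model the report depends only on the counter being nonzero at
allocation time, which matches the "in_atomic(): 1" field in the log.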

* Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-10 16:05 [BUG] fib_tries related Oops in 2.6.30 Yan Zheng
@ 2009-06-11 14:39 ` Jarek Poplawski
  2009-06-12  7:25   ` Jarek Poplawski
  0 siblings, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2009-06-11 14:39 UTC (permalink / raw)
  To: Yan Zheng; +Cc: linux-kernel, netdev, Robert Olsson

Cc Robert Olsson.

Jarek P.

Yan Zheng wrote, On 06/10/2009 06:05 PM:

> Hello,
> 
> I pull linux-2.6.30 from linus-2.6 git tree. I got following oops
> immediately after boot.
> 
> # uname -a
> Linux zhyan-cn 2.6.30 #1 SMP PREEMPT Wed Jun 10 23:37:22 CST 2009 i686
> i686 i386 GNU/Linux
> 
> ---
> BUG: sleeping function called from invalid context at
> /mnt/sda7/linux-2.6/mm/slub.c:1598
> in_atomic(): 1, irqs_disabled(): 0, pid: 2286, name: NetworkManager
> Pid: 2286, comm: NetworkManager Not tainted 2.6.30 #1
> Call Trace:
>  [<c042568c>] __might_sleep+0xfc/0x103
>  [<c049bb79>] __kmalloc+0x7c/0x132
>  [<c0697be5>] ? tnode_new+0x27/0x69
>  [<c0697be5>] tnode_new+0x27/0x69
>  [<c0697d36>] resize+0x94/0x666
>  [<c049ac13>] ? __slab_alloc+0xc2/0x4b3
>  [<c0697be5>] ? tnode_new+0x27/0x69
>  [<c049bc23>] ? __kmalloc+0x126/0x132
>  [<c0698384>] trie_rebalance+0x7c/0xde
>  [<c0698d77>] fn_trie_insert+0x644/0x6d6
>  [<c0427da7>] ? try_to_wake_up+0x2c4/0x2ce
>  [<c0427dbc>] ? default_wake_function+0xb/0xd
>  [<c069402e>] fib_magic+0x99/0xa8
>  [<c069412c>] fib_add_ifaddr+0xef/0x111
>  [<c0694178>] fib_inetaddr_event+0x2a/0x1a9
>  [<c06cd5a9>] notifier_call_chain+0x2b/0x4a
>  [<c0444189>] __blocking_notifier_call_chain+0x37/0x4c
>  [<c04441aa>] blocking_notifier_call_chain+0xc/0xe
>  [<c068d8f4>] __inet_insert_ifa+0xf9/0x104
>  [<c068eb28>] inet_rtm_newaddr+0x16f/0x177
>  [<c068e9b9>] ? inet_rtm_newaddr+0x0/0x177
>  [<c065218c>] rtnetlink_rcv_msg+0x196/0x1b0
>  [<c0651ff6>] ? rtnetlink_rcv_msg+0x0/0x1b0
>  [<c065fbac>] netlink_rcv_skb+0x30/0x78
>  [<c0651fee>] rtnetlink_rcv+0x1c/0x24
>  [<c065f78e>] netlink_unicast+0xee/0x144
>  [<c065fa27>] netlink_sendmsg+0x243/0x250
>  [<c063d1f4>] __sock_sendmsg+0x45/0x4e
>  [<c063d978>] sock_sendmsg+0xb8/0xce
>  [<c044072b>] ? autoremove_wake_function+0x0/0x33
>  [<c044072b>] ? autoremove_wake_function+0x0/0x33
>  [<c044072b>] ? autoremove_wake_function+0x0/0x33
>  [<c0523aa7>] ? copy_from_user+0x34/0x11b
>  [<c0644ba4>] ? verify_iovec+0x40/0x70
>  [<c063dacd>] sys_sendmsg+0x13f/0x192
>  [<c063e67a>] ? sys_recvmsg+0x16d/0x17a
>  [<c041f6b4>] ? kunmap_atomic+0x8b/0xa3
>  [<c0489a88>] ? do_wp_page+0x515/0x5ac
>  [<c044072b>] ? autoremove_wake_function+0x0/0x33
>  [<c041af58>] ? paravirt_get_lazy_mode+0xe/0x1b
>  [<c041b128>] ? arch_flush_lazy_mmu_mode+0x47/0x5b
>  [<c041f799>] ? kmap_atomic_prot+0xcd/0xeb
>  [<c048af8c>] ? handle_mm_fault+0x58d/0x5e8
>  [<c063eb09>] sys_socketcall+0x153/0x183
>  [<c0403298>] sysenter_do_call+0x12/0x2d


* Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-11 14:39 ` Jarek Poplawski
@ 2009-06-12  7:25   ` Jarek Poplawski
  2009-06-15  6:53     ` [PATCH] " Jarek Poplawski
  0 siblings, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2009-06-12  7:25 UTC (permalink / raw)
  To: Robert Olsson; +Cc: Yan Zheng, linux-kernel, netdev

Jarek Poplawski wrote, On 06/11/2009 04:39 PM:

> Cc Robert Olsson.
> 
> Jarek P.
> 
> Yan Zheng wrote, On 06/10/2009 06:05 PM:
> 
>> Hello,
>>
>> I pull linux-2.6.30 from linus-2.6 git tree. I got following oops
>> immediately after boot.
>>
>> # uname -a
>> Linux zhyan-cn 2.6.30 #1 SMP PREEMPT Wed Jun 10 23:37:22 CST 2009 i686
>> i686 i386 GNU/Linux
>>
>> ---
>> BUG: sleeping function called from invalid context at
...

Robert, I'm probably missing something, but since I don't understand
this last patch with preempt_disable(), I looked at this code a bit.
The parent update happening after what is IMHO a possible child
destruction looks quite suspicious to me. Could you check whether this
patch changes anything with the previous oops? (It's mainly to test
the idea, not to fix it optimally.)

Thanks,
Jarek P.
---

 net/ipv4/fib_trie.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 538d2a9..565fc1d 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -989,16 +989,17 @@ static struct node *trie_rebalance(struct trie *t, struct tnode *tn)
 	t_key cindex, key;
 	struct tnode *tp;
 
-	preempt_disable();
 	key = tn->key;
 
 	while (tn != NULL && (tp = node_parent((struct node *)tn)) != NULL) {
 		cindex = tkey_extract_bits(key, tp->pos, tp->bits);
 		wasfull = tnode_full(tp, tnode_get_child(tp, cindex));
+		tnode_put_child_reorg((struct tnode *)tp, cindex, NULL,
+				      wasfull);
 		tn = (struct tnode *) resize(t, (struct tnode *)tn);
 
 		tnode_put_child_reorg((struct tnode *)tp, cindex,
-				      (struct node *)tn, wasfull);
+				      (struct node *)tn, -1);
 
 		tp = node_parent((struct node *) tn);
 		if (!tp)
@@ -1010,7 +1011,6 @@ static struct node *trie_rebalance(struct trie *t, struct tnode *tn)
 	if (IS_TNODE(tn))
 		tn = (struct tnode *)resize(t, (struct tnode *)tn);
 
-	preempt_enable();
 	return (struct node *)tn;
 }
 


* [PATCH] Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-12  7:25   ` Jarek Poplawski
@ 2009-06-15  6:53     ` Jarek Poplawski
  2009-06-15  9:32       ` David Miller
                         ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jarek Poplawski @ 2009-06-15  6:53 UTC (permalink / raw)
  To: David Miller; +Cc: Robert Olsson, Yan Zheng, linux-kernel, netdev

On 12-06-2009 09:25, Jarek Poplawski wrote:
> Jarek Poplawski wrote, On 06/11/2009 04:39 PM:
> 
>> Cc Robert Olsson.
>>
>> Jarek P.
>>
>> Yan Zheng wrote, On 06/10/2009 06:05 PM:
>>
>>> Hello,
>>>
>>> I pull linux-2.6.30 from linus-2.6 git tree. I got following oops
>>> immediately after boot.
>>>
>>> # uname -a
>>> Linux zhyan-cn 2.6.30 #1 SMP PREEMPT Wed Jun 10 23:37:22 CST 2009 i686
>>> i686 i386 GNU/Linux
>>>
>>> ---
>>> BUG: sleeping function called from invalid context at
> ...
> 
> Robert, probably I miss something, but since I don't understand this
> last patch with preempt_disable(), I've looked a bit at this place and
> found this parent update after IMHO possible child destruction quite
> suspicious, so I wonder if you could check if this patch could change
> anything with previous oops. (It's mainly to test the idea, not to
> optimally fix it.)

Since I'm not sure Robert is working on this, here is a patch which I
think fixes this issue more properly. Alas, until somebody tests it,
I can recommend it only for net-next.

Jarek P.
------------------------->
ipv4: Fix fib_trie rebalancing

During trie_rebalance(), the functions resize(), inflate() and halve()
RCU-free tnodes before updating their parents. This relies on RCU
delaying the real destruction, but RCU readers that start after
call_rcu() and before the parent update could access freed memory.

This is currently prevented with preempt_disable() on the update side,
but that is not safe, except maybe with classic RCU, and it conflicts
with the GFP_KERNEL memory allocations made from these functions.

This patch explicitly delays freeing of tnodes by adding them to a
list, which is flushed after the update is finished.

Reported-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 net/ipv4/fib_trie.c |   47 +++++++++++++++++++++++++++++++++++++----------
 1 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 538d2a9..d1a39b1 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -123,6 +123,7 @@ struct tnode {
 	union {
 		struct rcu_head rcu;
 		struct work_struct work;
+		struct tnode *tnode_free;
 	};
 	struct node *child[0];
 };
@@ -161,6 +162,8 @@ static void tnode_put_child_reorg(struct tnode *tn, int i, struct node *n,
 static struct node *resize(struct trie *t, struct tnode *tn);
 static struct tnode *inflate(struct trie *t, struct tnode *tn);
 static struct tnode *halve(struct trie *t, struct tnode *tn);
+/* tnodes to free after resize(); protected by RTNL */
+static struct tnode *tnode_free_head;
 
 static struct kmem_cache *fn_alias_kmem __read_mostly;
 static struct kmem_cache *trie_leaf_kmem __read_mostly;
@@ -385,6 +388,29 @@ static inline void tnode_free(struct tnode *tn)
 		call_rcu(&tn->rcu, __tnode_free_rcu);
 }
 
+static void tnode_free_safe(struct tnode *tn)
+{
+	BUG_ON(IS_LEAF(tn));
+
+	if (node_parent((struct node *) tn)) {
+		tn->tnode_free = tnode_free_head;
+		tnode_free_head = tn;
+	} else {
+		tnode_free(tn);
+	}
+}
+
+static void tnode_free_flush(void)
+{
+	struct tnode *tn;
+
+	while ((tn = tnode_free_head)) {
+		tnode_free_head = tn->tnode_free;
+		tn->tnode_free = NULL;
+		tnode_free(tn);
+	}
+}
+
 static struct leaf *leaf_new(void)
 {
 	struct leaf *l = kmem_cache_alloc(trie_leaf_kmem, GFP_KERNEL);
@@ -495,7 +521,7 @@ static struct node *resize(struct trie *t, struct tnode *tn)
 
 	/* No children */
 	if (tn->empty_children == tnode_child_length(tn)) {
-		tnode_free(tn);
+		tnode_free_safe(tn);
 		return NULL;
 	}
 	/* One child */
@@ -509,7 +535,7 @@ static struct node *resize(struct trie *t, struct tnode *tn)
 
 			/* compress one level */
 			node_set_parent(n, NULL);
-			tnode_free(tn);
+			tnode_free_safe(tn);
 			return n;
 		}
 	/*
@@ -670,7 +696,7 @@ static struct node *resize(struct trie *t, struct tnode *tn)
 			/* compress one level */
 
 			node_set_parent(n, NULL);
-			tnode_free(tn);
+			tnode_free_safe(tn);
 			return n;
 		}
 
@@ -756,7 +782,7 @@ static struct tnode *inflate(struct trie *t, struct tnode *tn)
 			put_child(t, tn, 2*i, inode->child[0]);
 			put_child(t, tn, 2*i+1, inode->child[1]);
 
-			tnode_free(inode);
+			tnode_free_safe(inode);
 			continue;
 		}
 
@@ -801,9 +827,9 @@ static struct tnode *inflate(struct trie *t, struct tnode *tn)
 		put_child(t, tn, 2*i, resize(t, left));
 		put_child(t, tn, 2*i+1, resize(t, right));
 
-		tnode_free(inode);
+		tnode_free_safe(inode);
 	}
-	tnode_free(oldtnode);
+	tnode_free_safe(oldtnode);
 	return tn;
 nomem:
 	{
@@ -885,7 +911,7 @@ static struct tnode *halve(struct trie *t, struct tnode *tn)
 		put_child(t, newBinNode, 1, right);
 		put_child(t, tn, i/2, resize(t, newBinNode));
 	}
-	tnode_free(oldtnode);
+	tnode_free_safe(oldtnode);
 	return tn;
 nomem:
 	{
@@ -989,7 +1015,6 @@ static struct node *trie_rebalance(struct trie *t, struct tnode *tn)
 	t_key cindex, key;
 	struct tnode *tp;
 
-	preempt_disable();
 	key = tn->key;
 
 	while (tn != NULL && (tp = node_parent((struct node *)tn)) != NULL) {
@@ -1001,16 +1026,18 @@ static struct node *trie_rebalance(struct trie *t, struct tnode *tn)
 				      (struct node *)tn, wasfull);
 
 		tp = node_parent((struct node *) tn);
+		tnode_free_flush();
 		if (!tp)
 			break;
 		tn = tp;
 	}
 
 	/* Handle last (top) tnode */
-	if (IS_TNODE(tn))
+	if (IS_TNODE(tn)) {
 		tn = (struct tnode *)resize(t, (struct tnode *)tn);
+		tnode_free_flush();
+	}
 
-	preempt_enable();
 	return (struct node *)tn;
 }
 

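The deferred-free scheme described in the commit message above can be
sketched in userspace C roughly as follows. This is only an
illustration under stated assumptions: the toy_* names are
hypothetical, plain free() stands in for the call_rcu()-based
tnode_free(), a single pointer slot stands in for the parent's child
array, and the sketch leaves out the node_parent() check that this
version of tnode_free_safe() performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_tnode {
	int key;
	struct toy_tnode *free_next;	/* plays the role of tn->tnode_free */
};

/* Nodes waiting to be freed; the patch protects its list with RTNL. */
static struct toy_tnode *toy_free_head;
static int toy_freed;			/* nodes actually released so far */

static struct toy_tnode *toy_tnode_new(int key)
{
	struct toy_tnode *tn = calloc(1, sizeof(*tn));

	if (tn)
		tn->key = key;
	return tn;
}

/* Stand-in for tnode_free(); the kernel defers this through call_rcu(). */
static void toy_tnode_free(struct toy_tnode *tn)
{
	free(tn);
	toy_freed++;
}

/* Queue a node instead of freeing it right away. */
static void toy_free_safe(struct toy_tnode *tn)
{
	assert(tn != NULL);		/* rough analogue of the BUG_ON() */
	tn->free_next = toy_free_head;
	toy_free_head = tn;
}

/* Release everything queued; returns how many nodes were released. */
static int toy_free_flush(void)
{
	struct toy_tnode *tn;
	int n = 0;

	while ((tn = toy_free_head) != NULL) {
		toy_free_head = tn->free_next;
		tn->free_next = NULL;
		toy_tnode_free(tn);
		n++;
	}
	return n;
}

/* Replace *slot: the old node is queued and released only after the update. */
static int toy_replace_child(struct toy_tnode **slot, int new_key)
{
	struct toy_tnode *old = *slot;
	struct toy_tnode *new_node = toy_tnode_new(new_key);

	if (!new_node)
		return -1;
	if (old)
		toy_free_safe(old);	/* queued, memory still valid */
	*slot = new_node;		/* the "parent update" */
	return toy_free_flush();	/* only now release the old node */
}
```

The point of the ordering in toy_replace_child() is that nothing
reachable through the slot is released before the slot has been
switched to the replacement.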

* Re: [PATCH] Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-15  6:53     ` [PATCH] " Jarek Poplawski
@ 2009-06-15  9:32       ` David Miller
  2009-06-15 15:25       ` [PATCH 2/1] " Jarek Poplawski
  2009-06-15 16:08       ` [PATCH 2/1 v2] " Jarek Poplawski
  2 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2009-06-15  9:32 UTC (permalink / raw)
  To: jarkao2; +Cc: robert.olsson, zheng.yan, linux-kernel, netdev

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Mon, 15 Jun 2009 06:53:33 +0000

> ipv4: Fix fib_trie rebalancing
> 
> While doing trie_rebalance(): resize(), inflate(), halve() RCU free
> tnodes before updating their parents. It depends on RCU delaying the
> real destruction, but if RCU readers start after call_rcu() and before
> parent update they could access freed memory.
> 
> It is currently prevented with preempt_disable() on the update side,
> but it's not safe, except maybe classic RCU, plus it conflicts with
> memory allocations with GFP_KERNEL flag used from these functions.
> 
> This patch explicitly delays freeing of tnodes by adding them to the
> list, which is flushed after the update is finished.
> 
> Reported-by: Yan Zheng <zheng.yan@oracle.com>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

Applied to net-next-2.6


* [PATCH 2/1] Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-15  6:53     ` [PATCH] " Jarek Poplawski
  2009-06-15  9:32       ` David Miller
@ 2009-06-15 15:25       ` Jarek Poplawski
  2009-06-15 16:08       ` [PATCH 2/1 v2] " Jarek Poplawski
  2 siblings, 0 replies; 10+ messages in thread
From: Jarek Poplawski @ 2009-06-15 15:25 UTC (permalink / raw)
  To: David Miller; +Cc: Robert Olsson, Yan Zheng, linux-kernel, netdev

ipv4: Fix fib_trie rebalancing, part 2

My previous patch, which explicitly delays freeing of tnodes by adding
them to the list to flush them after the update is finished, isn't
strict enough. It treats exceptionally tnodes without parent, assuming
they are newly created, so "invisible" for the read side yet. But the
top tnode doesn't have parent as well, so we have to exclude all
exceptions (at least until a better way is found).

Reported-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 net/ipv4/fib_trie.c |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index d1a39b1..64395b0 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -391,13 +391,8 @@ static inline void tnode_free(struct tnode *tn)
 static void tnode_free_safe(struct tnode *tn)
 {
 	BUG_ON(IS_LEAF(tn));
-
-	if (node_parent((struct node *) tn)) {
-		tn->tnode_free = tnode_free_head;
-		tnode_free_head = tn;
-	} else {
-		tnode_free(tn);
-	}
+	tn->tnode_free = tnode_free_head;
+	tnode_free_head = tn;
 }
 
 static void tnode_free_flush(void)


* [PATCH 2/1 v2] Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-15  6:53     ` [PATCH] " Jarek Poplawski
  2009-06-15  9:32       ` David Miller
  2009-06-15 15:25       ` [PATCH 2/1] " Jarek Poplawski
@ 2009-06-15 16:08       ` Jarek Poplawski
  2009-06-18  1:56         ` David Miller
  2 siblings, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2009-06-15 16:08 UTC (permalink / raw)
  To: David Miller; +Cc: Robert Olsson, Yan Zheng, linux-kernel, netdev

Alas, this top tnode needs even more.

Sorry/thanks,
Jarek P.
-------------------> take 2
ipv4: Fix fib_trie rebalancing, part 2

My previous patch, which explicitly delays freeing of tnodes by adding
them to the list to flush them after the update is finished, isn't
strict enough. It treats exceptionally tnodes without parent, assuming
they are newly created, so "invisible" for the read side yet.

But the top tnode doesn't have parent as well, so we have to exclude
all exceptions (at least until a better way is found). Additionally we
need to move rcu assignment of this node before flushing, so the
return type of the trie_rebalance() function is changed.

Reported-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 net/ipv4/fib_trie.c |   23 ++++++++++-------------
 1 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index d1a39b1..6188043 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -391,13 +391,8 @@ static inline void tnode_free(struct tnode *tn)
 static void tnode_free_safe(struct tnode *tn)
 {
 	BUG_ON(IS_LEAF(tn));
-
-	if (node_parent((struct node *) tn)) {
-		tn->tnode_free = tnode_free_head;
-		tnode_free_head = tn;
-	} else {
-		tnode_free(tn);
-	}
+	tn->tnode_free = tnode_free_head;
+	tnode_free_head = tn;
 }
 
 static void tnode_free_flush(void)
@@ -1009,7 +1004,7 @@ fib_find_node(struct trie *t, u32 key)
 	return NULL;
 }
 
-static struct node *trie_rebalance(struct trie *t, struct tnode *tn)
+static void trie_rebalance(struct trie *t, struct tnode *tn)
 {
 	int wasfull;
 	t_key cindex, key;
@@ -1033,12 +1028,14 @@ static struct node *trie_rebalance(struct trie *t, struct tnode *tn)
 	}
 
 	/* Handle last (top) tnode */
-	if (IS_TNODE(tn)) {
+	if (IS_TNODE(tn))
 		tn = (struct tnode *)resize(t, (struct tnode *)tn);
+
+	rcu_assign_pointer(t->trie, (struct node *)tn);
+	if (IS_TNODE(tn))
 		tnode_free_flush();
-	}
 
-	return (struct node *)tn;
+	return;
 }
 
 /* only used from updater-side */
@@ -1186,7 +1183,7 @@ static struct list_head *fib_insert_node(struct trie *t, u32 key, int plen)
 
 	/* Rebalance the trie */
 
-	rcu_assign_pointer(t->trie, trie_rebalance(t, tp));
+	trie_rebalance(t, tp);
 done:
 	return fa_head;
 }
@@ -1605,7 +1602,7 @@ static void trie_leaf_remove(struct trie *t, struct leaf *l)
 	if (tp) {
 		t_key cindex = tkey_extract_bits(l->key, tp->pos, tp->bits);
 		put_child(t, (struct tnode *)tp, cindex, NULL);
-		rcu_assign_pointer(t->trie, trie_rebalance(t, tp));
+		trie_rebalance(t, tp);
 	} else
 		rcu_assign_pointer(t->trie, NULL);
 

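The ordering requirement stated in this commit message (publish the
top node first, flush afterwards) can be illustrated with a small
userspace sketch. It assumes C11 atomics as a stand-in for
rcu_assign_pointer() and a toy flush that merely marks nodes as freed
so the effect can be observed; in the kernel the flush hands nodes to
call_rcu() instead, so the sketch exaggerates the window. All toy_*
names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

struct toy_node {
	int freed;			/* set by the toy flush */
	struct toy_node *free_next;	/* link in the pending list */
};

struct toy_trie {
	_Atomic(struct toy_node *) root;	/* plays the role of t->trie */
};

static struct toy_node *toy_pending;

static void toy_defer_free(struct toy_node *n)
{
	n->free_next = toy_pending;
	toy_pending = n;
}

/* Toy flush: marks nodes instead of freeing them, so tests can look. */
static void toy_flush(void)
{
	struct toy_node *n;

	while ((n = toy_pending) != NULL) {
		toy_pending = n->free_next;
		n->free_next = NULL;
		n->freed = 1;
	}
}

/* What a reader starting right now would find at the root. */
static int toy_reader_sees_freed(struct toy_trie *t)
{
	struct toy_node *n = atomic_load_explicit(&t->root,
						  memory_order_acquire);

	return n != NULL && n->freed;
}

/*
 * Swap the root. Returns 1 if a reader starting between the two steps
 * (or right after them) would have found a freed root, 0 otherwise.
 */
static int toy_swap_root(struct toy_trie *t, struct toy_node *new_root,
			 int publish_first)
{
	struct toy_node *old = atomic_load_explicit(&t->root,
						    memory_order_relaxed);
	int exposed;

	if (old)
		toy_defer_free(old);

	if (publish_first) {
		/* Order used by this patch: publish, then flush. */
		atomic_store_explicit(&t->root, new_root,
				      memory_order_release);
		exposed = toy_reader_sees_freed(t);
		toy_flush();
		exposed |= toy_reader_sees_freed(t);
	} else {
		/* Flushing first leaves the old root published. */
		toy_flush();
		exposed = toy_reader_sees_freed(t);
		atomic_store_explicit(&t->root, new_root,
				      memory_order_release);
	}
	return exposed;
}
```

With publish_first set, a new reader can only ever find the
replacement root; with it cleared, the old root is still published at
the moment it is marked freed.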

* Re: [PATCH 2/1 v2] Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-15 16:08       ` [PATCH 2/1 v2] " Jarek Poplawski
@ 2009-06-18  1:56         ` David Miller
  2009-06-18  7:23           ` [PATCH 3/1] " Jarek Poplawski
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2009-06-18  1:56 UTC (permalink / raw)
  To: jarkao2; +Cc: robert.olsson, zheng.yan, linux-kernel, netdev

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Mon, 15 Jun 2009 18:08:01 +0200

> ipv4: Fix fib_trie rebalancing, part 2
> 
> My previous patch, which explicitly delays freeing of tnodes by adding
> them to the list to flush them after the update is finished, isn't
> strict enough. It treats exceptionally tnodes without parent, assuming
> they are newly created, so "invisible" for the read side yet.
> 
> But the top tnode doesn't have parent as well, so we have to exclude
> all exceptions (at least until a better way is found). Additionally we
> need to move rcu assignment of this node before flushing, so the
> return type of the trie_rebalance() function is changed.
> 
> Reported-by: Yan Zheng <zheng.yan@oracle.com>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

Applied, thanks a lot Jarek.


* [PATCH 3/1] Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-18  1:56         ` David Miller
@ 2009-06-18  7:23           ` Jarek Poplawski
  2009-06-18  7:26             ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2009-06-18  7:23 UTC (permalink / raw)
  To: David Miller; +Cc: robert.olsson, zheng.yan, linux-kernel, netdev

On Wed, Jun 17, 2009 at 06:56:58PM -0700, David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Mon, 15 Jun 2009 18:08:01 +0200
> 
> > ipv4: Fix fib_trie rebalancing, part 2
> > 
> > My previous patch, which explicitly delays freeing of tnodes by adding
> > them to the list to flush them after the update is finished, isn't
> > strict enough. It treats exceptionally tnodes without parent, assuming
> > they are newly created, so "invisible" for the read side yet.
> > 
> > But the top tnode doesn't have parent as well, so we have to exclude
> > all exceptions (at least until a better way is found). Additionally we
> > need to move rcu assignment of this node before flushing, so the
> > return type of the trie_rebalance() function is changed.
> > 
> > Reported-by: Yan Zheng <zheng.yan@oracle.com>
> > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> 
> Applied, thanks a lot Jarek.

Not at all, really :-( I definitely need more time to find out what's
going on here...

Sorry/thanks x2,
Jarek P.
------------------>
ipv4: Fix fib_trie rebalancing, part 3

Alas, my fix, part 2, has one "if" too many again... (We can't repeat
the same test because tn has been reassigned by resize() in between.)

Reported-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

diff -Nurp a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
--- a/net/ipv4/fib_trie.c	2009-06-18 06:53:24.000000000 +0000
+++ b/net/ipv4/fib_trie.c	2009-06-18 06:58:00.000000000 +0000
@@ -1032,8 +1032,7 @@ static void trie_rebalance(struct trie *
 		tn = (struct tnode *)resize(t, (struct tnode *)tn);
 
 	rcu_assign_pointer(t->trie, (struct node *)tn);
-	if (IS_TNODE(tn))
-		tnode_free_flush();
+	tnode_free_flush();
 
 	return;
 }

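As I read the part 3 fix, the problem is that resize() queues the old
top tnode and may return something that is not a tnode (NULL for "no
children", or the single remaining child after "compress one level"),
so testing the returned tn can skip the flush. A minimal sketch of
that reasoning, with hypothetical toy_* names and a plain counter in
place of the real pending list:

```c
/* What the toy resize() returns in place of the old top tnode. */
enum toy_kind { TOY_NONE, TOY_LEAF, TOY_TNODE };

/* Number of tnodes queued for freeing and not yet flushed. */
static int toy_pending;

/* Toy resize(): queues the old tnode, then returns its replacement. */
static enum toy_kind toy_resize(enum toy_kind replacement)
{
	toy_pending++;
	return replacement;
}

static void toy_flush(void)
{
	toy_pending = 0;
}

/* Part 2 shape: flush only when the *new* tn is a tnode. */
static int toy_finish_part2(enum toy_kind replacement)
{
	enum toy_kind tn;

	toy_pending = 0;
	tn = toy_resize(replacement);
	if (tn == TOY_TNODE)
		toy_flush();
	return toy_pending;	/* nodes left behind on the list */
}

/* Part 3 shape: flush unconditionally. */
static int toy_finish_part3(enum toy_kind replacement)
{
	toy_pending = 0;
	(void)toy_resize(replacement);
	toy_flush();
	return toy_pending;
}
```

In this model the conditional variant leaves one queued node behind
whenever the replacement is a leaf or nothing at all, while the
unconditional variant never does.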

* Re: [PATCH 3/1] Re: [BUG] fib_tries related Oops in 2.6.30
  2009-06-18  7:23           ` [PATCH 3/1] " Jarek Poplawski
@ 2009-06-18  7:26             ` David Miller
  0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2009-06-18  7:26 UTC (permalink / raw)
  To: jarkao2; +Cc: robert.olsson, zheng.yan, linux-kernel, netdev

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Thu, 18 Jun 2009 07:23:00 +0000

> ipv4: Fix fib_trie rebalancing, part 3
> 
> Alas my fix, part 2, has one if too much again... (We can't repeat the
> same test because tn is different.)
> 
> Reported-by: Yan Zheng <zheng.yan@oracle.com>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

Since I didn't push patch #2 out yet I'll combine it with
this one.

