* [PATCH net] ipv6: remove null_entry before adding default route
@ 2018-01-06 1:38 Wei Wang
2018-01-06 7:42 ` Martin KaFai Lau
0 siblings, 1 reply; 5+ messages in thread
From: Wei Wang @ 2018-01-06 1:38 UTC (permalink / raw)
To: David Miller, netdev; +Cc: Martin KaFai Lau, Eric Dumazet, Wei Wang
From: Wei Wang <weiwan@google.com>
In the current code, when creating a new fib6 table, tb6_root.leaf gets
initialized to net->ipv6.ip6_null_entry.
If a default route is being added with rt->rt6i_metric = 0xffffffff,
fib6_add() will add this route after net->ipv6.ip6_null_entry. As
null_entry is shared, it could cause problem.
In order to fix it, set fn->leaf to NULL before calling
fib6_add_rt2node() when trying to add the first default route.
And reset fn->leaf to null_entry when adding fails or when deleting the
last default route.
syzkaller reported the following issue which is fixed by this commit:
=============================
WARNING: suspicious RCU usage
4.15.0-rc5+ #171 Not tainted
-----------------------------
net/ipv6/ip6_fib.c:1702 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
4 locks held by swapper/0/0:
#0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1310
#1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] fib6_run_gc+0x9d/0x3c0 net/ipv6/ip6_fib.c:2007
#2: (rcu_read_lock){....}, at: [<0000000091db762d>] __fib6_clean_all+0x0/0x3a0 net/ipv6/ip6_fib.c:1560
#3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] __fib6_clean_all+0x1d0/0x3a0 net/ipv6/ip6_fib.c:1948
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #171
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
fib6_del+0xcaa/0x11b0 net/ipv6/ip6_fib.c:1701
fib6_clean_node+0x3aa/0x4f0 net/ipv6/ip6_fib.c:1892
fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1815
fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1863
fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1933
__fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1949
fib6_clean_all net/ipv6/ip6_fib.c:1960 [inline]
fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2016
fib6_gc_timer_cb+0x20/0x30 net/ipv6/ip6_fib.c:2033
call_timer_fn+0x228/0x820 kernel/time/timer.c:1320
expire_timers kernel/time/timer.c:1357 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1660
run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:540 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904
</IRQ>
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
Signed-off-by: Wei Wang <weiwan@google.com>
---
net/ipv6/ip6_fib.c | 45 +++++++++++++++++++++++++++++++++++----------
1 file changed, 35 insertions(+), 10 deletions(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index d11a5578e4f8..37cb4ad1ea29 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -640,6 +640,11 @@ static struct fib6_node *fib6_add_1(struct net *net,
if (!(fn->fn_flags & RTN_RTINFO)) {
RCU_INIT_POINTER(fn->leaf, NULL);
rt6_release(leaf);
+ /* remove null_entry in the root node */
+ } else if (fn->fn_flags & RTN_TL_ROOT &&
+ rcu_access_pointer(fn->leaf) ==
+ net->ipv6.ip6_null_entry) {
+ RCU_INIT_POINTER(fn->leaf, NULL);
}
return fn;
@@ -1270,14 +1275,27 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
return err;
failure:
- /* fn->leaf could be NULL if fn is an intermediate node and we
- * failed to add the new route to it in both subtree creation
- * failure and fib6_add_rt2node() failure case.
- * In both cases, fib6_repair_tree() should be called to fix
+ /* fn->leaf could be NULL if:
+ * 1. fn is the root node in the table and we fail to add the default
+ * route to it.
+ * In this case, we put fn->leaf back to net->ipv6.ip6_null_entry as
+ * the way the table was created.
+ * 2. fn is an intermediate node and we failed to add the new
+ * route to it in both subtree creation failure and fib6_add_rt2node()
+ * failure case.
+ * In this case, fib6_repair_tree() should be called to fix
* fn->leaf.
*/
- if (fn && !(fn->fn_flags & (RTN_RTINFO|RTN_ROOT)))
- fib6_repair_tree(info->nl_net, table, fn);
+ if (fn) {
+ if (fn->fn_flags & RTN_TL_ROOT) {
+ if (!rcu_access_pointer(fn->leaf)) {
+ rcu_assign_pointer(fn->leaf,
+ info->nl_net->ipv6.ip6_null_entry);
+ }
+ } else if (!(fn->fn_flags & (RTN_RTINFO|RTN_ROOT))) {
+ fib6_repair_tree(info->nl_net, table, fn);
+ }
+ }
/* Always release dst as dst->__refcnt is guaranteed
* to be taken before entering this function
*/
@@ -1685,11 +1703,18 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
}
read_unlock(&net->ipv6.fib6_walker_lock);
- /* If it was last route, expunge its radix tree node */
+ /* If it was last route:
+ * 1. For root node, put back null_entry as how the table was created.
+ * 2. For other nodes, expunge its radix tree node.
+ */
if (!rcu_access_pointer(fn->leaf)) {
- fn->fn_flags &= ~RTN_RTINFO;
- net->ipv6.rt6_stats->fib_route_nodes--;
- fn = fib6_repair_tree(net, table, fn);
+ if (fn->fn_flags & RTN_TL_ROOT) {
+ rcu_assign_pointer(fn->leaf, net->ipv6.ip6_null_entry);
+ } else {
+ fn->fn_flags &= ~RTN_RTINFO;
+ net->ipv6.rt6_stats->fib_route_nodes--;
+ fn = fib6_repair_tree(net, table, fn);
+ }
}
fib6_purge_rt(rt, fn, net);
--
2.16.0.rc0.223.g4a4ac83678-goog
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH net] ipv6: remove null_entry before adding default route
2018-01-06 1:38 [PATCH net] ipv6: remove null_entry before adding default route Wei Wang
@ 2018-01-06 7:42 ` Martin KaFai Lau
2018-01-07 1:41 ` Wei Wang
0 siblings, 1 reply; 5+ messages in thread
From: Martin KaFai Lau @ 2018-01-06 7:42 UTC (permalink / raw)
To: Wei Wang; +Cc: David Miller, netdev, Eric Dumazet
On Fri, Jan 05, 2018 at 05:38:35PM -0800, Wei Wang wrote:
> From: Wei Wang <weiwan@google.com>
>
> In the current code, when creating a new fib6 table, tb6_root.leaf gets
> initialized to net->ipv6.ip6_null_entry.
> If a default route is being added with rt->rt6i_metric = 0xffffffff,
> fib6_add() will add this route after net->ipv6.ip6_null_entry. As
> null_entry is shared, it could cause problem.
>
> In order to fix it, set fn->leaf to NULL before calling
> fib6_add_rt2node() when trying to add the first default route.
> And reset fn->leaf to null_entry when adding fails or when deleting the
> last default route.
>
> syzkaller reported the following issue which is fixed by this commit:
> =============================
> WARNING: suspicious RCU usage
> 4.15.0-rc5+ #171 Not tainted
> -----------------------------
> net/ipv6/ip6_fib.c:1702 suspicious rcu_dereference_protected() usage!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 4 locks held by swapper/0/0:
> #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
> #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1310
> #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
> #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] fib6_run_gc+0x9d/0x3c0 net/ipv6/ip6_fib.c:2007
> #2: (rcu_read_lock){....}, at: [<0000000091db762d>] __fib6_clean_all+0x0/0x3a0 net/ipv6/ip6_fib.c:1560
> #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] spin_lock_bh include/linux/spinlock.h:315 [inline]
> #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] __fib6_clean_all+0x1d0/0x3a0 net/ipv6/ip6_fib.c:1948
>
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #171
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> <IRQ>
> __dump_stack lib/dump_stack.c:17 [inline]
> dump_stack+0x194/0x257 lib/dump_stack.c:53
> lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
> fib6_del+0xcaa/0x11b0 net/ipv6/ip6_fib.c:1701
> fib6_clean_node+0x3aa/0x4f0 net/ipv6/ip6_fib.c:1892
> fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1815
> fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1863
> fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1933
> __fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1949
> fib6_clean_all net/ipv6/ip6_fib.c:1960 [inline]
> fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2016
> fib6_gc_timer_cb+0x20/0x30 net/ipv6/ip6_fib.c:2033
> call_timer_fn+0x228/0x820 kernel/time/timer.c:1320
> expire_timers kernel/time/timer.c:1357 [inline]
> __run_timers+0x7ee/0xb70 kernel/time/timer.c:1660
> run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
> __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
> invoke_softirq kernel/softirq.c:365 [inline]
> irq_exit+0x1cc/0x200 kernel/softirq.c:405
> exiting_irq arch/x86/include/asm/apic.h:540 [inline]
> smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
> apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904
> </IRQ>
>
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
> Signed-off-by: Wei Wang <weiwan@google.com>
> ---
> net/ipv6/ip6_fib.c | 45 +++++++++++++++++++++++++++++++++++----------
> 1 file changed, 35 insertions(+), 10 deletions(-)
>
> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> index d11a5578e4f8..37cb4ad1ea29 100644
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -640,6 +640,11 @@ static struct fib6_node *fib6_add_1(struct net *net,
> if (!(fn->fn_flags & RTN_RTINFO)) {
> RCU_INIT_POINTER(fn->leaf, NULL);
> rt6_release(leaf);
> + /* remove null_entry in the root node */
> + } else if (fn->fn_flags & RTN_TL_ROOT &&
> + rcu_access_pointer(fn->leaf) ==
> + net->ipv6.ip6_null_entry) {
> + RCU_INIT_POINTER(fn->leaf, NULL);
It seems the reader side could see tb6_root.leaf == NULL after
this change and I think it should be fine? If it is, instead
of switching betwen NULL and ip6_null_entry, would it be simpler
to always set tb6_root.leaf to NULL until a legit default route
is added?
> }
>
> return fn;
> @@ -1270,14 +1275,27 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
> return err;
>
> failure:
> - /* fn->leaf could be NULL if fn is an intermediate node and we
> - * failed to add the new route to it in both subtree creation
> - * failure and fib6_add_rt2node() failure case.
> - * In both cases, fib6_repair_tree() should be called to fix
> + /* fn->leaf could be NULL if:
> + * 1. fn is the root node in the table and we fail to add the default
> + * route to it.
> + * In this case, we put fn->leaf back to net->ipv6.ip6_null_entry as
> + * the way the table was created.
> + * 2. fn is an intermediate node and we failed to add the new
> + * route to it in both subtree creation failure and fib6_add_rt2node()
> + * failure case.
> + * In this case, fib6_repair_tree() should be called to fix
> * fn->leaf.
> */
> - if (fn && !(fn->fn_flags & (RTN_RTINFO|RTN_ROOT)))
> - fib6_repair_tree(info->nl_net, table, fn);
> + if (fn) {
> + if (fn->fn_flags & RTN_TL_ROOT) {
> + if (!rcu_access_pointer(fn->leaf)) {
> + rcu_assign_pointer(fn->leaf,
> + info->nl_net->ipv6.ip6_null_entry);
> + }
> + } else if (!(fn->fn_flags & (RTN_RTINFO|RTN_ROOT))) {
> + fib6_repair_tree(info->nl_net, table, fn);
> + }
> + }
> /* Always release dst as dst->__refcnt is guaranteed
> * to be taken before entering this function
> */
> @@ -1685,11 +1703,18 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
> }
> read_unlock(&net->ipv6.fib6_walker_lock);
>
> - /* If it was last route, expunge its radix tree node */
> + /* If it was last route:
> + * 1. For root node, put back null_entry as how the table was created.
> + * 2. For other nodes, expunge its radix tree node.
> + */
> if (!rcu_access_pointer(fn->leaf)) {
> - fn->fn_flags &= ~RTN_RTINFO;
> - net->ipv6.rt6_stats->fib_route_nodes--;
> - fn = fib6_repair_tree(net, table, fn);
> + if (fn->fn_flags & RTN_TL_ROOT) {
> + rcu_assign_pointer(fn->leaf, net->ipv6.ip6_null_entry);
> + } else {
> + fn->fn_flags &= ~RTN_RTINFO;
> + net->ipv6.rt6_stats->fib_route_nodes--;
> + fn = fib6_repair_tree(net, table, fn);
> + }
> }
>
> fib6_purge_rt(rt, fn, net);
> --
> 2.16.0.rc0.223.g4a4ac83678-goog
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] ipv6: remove null_entry before adding default route
2018-01-06 7:42 ` Martin KaFai Lau
@ 2018-01-07 1:41 ` Wei Wang
2018-01-07 6:16 ` Martin KaFai Lau
0 siblings, 1 reply; 5+ messages in thread
From: Wei Wang @ 2018-01-07 1:41 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: David Miller, Linux Kernel Network Developers, Eric Dumazet
On Fri, Jan 5, 2018 at 11:42 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> On Fri, Jan 05, 2018 at 05:38:35PM -0800, Wei Wang wrote:
>> From: Wei Wang <weiwan@google.com>
>>
>> In the current code, when creating a new fib6 table, tb6_root.leaf gets
>> initialized to net->ipv6.ip6_null_entry.
>> If a default route is being added with rt->rt6i_metric = 0xffffffff,
>> fib6_add() will add this route after net->ipv6.ip6_null_entry. As
>> null_entry is shared, it could cause problem.
>>
>> In order to fix it, set fn->leaf to NULL before calling
>> fib6_add_rt2node() when trying to add the first default route.
>> And reset fn->leaf to null_entry when adding fails or when deleting the
>> last default route.
>>
>> syzkaller reported the following issue which is fixed by this commit:
>> =============================
>> WARNING: suspicious RCU usage
>> 4.15.0-rc5+ #171 Not tainted
>> -----------------------------
>> net/ipv6/ip6_fib.c:1702 suspicious rcu_dereference_protected() usage!
>>
>> other info that might help us debug this:
>>
>> rcu_scheduler_active = 2, debug_locks = 1
>> 4 locks held by swapper/0/0:
>> #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
>> #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1310
>> #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
>> #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] fib6_run_gc+0x9d/0x3c0 net/ipv6/ip6_fib.c:2007
>> #2: (rcu_read_lock){....}, at: [<0000000091db762d>] __fib6_clean_all+0x0/0x3a0 net/ipv6/ip6_fib.c:1560
>> #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] spin_lock_bh include/linux/spinlock.h:315 [inline]
>> #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] __fib6_clean_all+0x1d0/0x3a0 net/ipv6/ip6_fib.c:1948
>>
>> stack backtrace:
>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #171
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>> <IRQ>
>> __dump_stack lib/dump_stack.c:17 [inline]
>> dump_stack+0x194/0x257 lib/dump_stack.c:53
>> lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
>> fib6_del+0xcaa/0x11b0 net/ipv6/ip6_fib.c:1701
>> fib6_clean_node+0x3aa/0x4f0 net/ipv6/ip6_fib.c:1892
>> fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1815
>> fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1863
>> fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1933
>> __fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1949
>> fib6_clean_all net/ipv6/ip6_fib.c:1960 [inline]
>> fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2016
>> fib6_gc_timer_cb+0x20/0x30 net/ipv6/ip6_fib.c:2033
>> call_timer_fn+0x228/0x820 kernel/time/timer.c:1320
>> expire_timers kernel/time/timer.c:1357 [inline]
>> __run_timers+0x7ee/0xb70 kernel/time/timer.c:1660
>> run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
>> __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
>> invoke_softirq kernel/softirq.c:365 [inline]
>> irq_exit+0x1cc/0x200 kernel/softirq.c:405
>> exiting_irq arch/x86/include/asm/apic.h:540 [inline]
>> smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
>> apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904
>> </IRQ>
>>
>> Reported-by: syzbot <syzkaller@googlegroups.com>
>> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
>> Signed-off-by: Wei Wang <weiwan@google.com>
>> ---
>> net/ipv6/ip6_fib.c | 45 +++++++++++++++++++++++++++++++++++----------
>> 1 file changed, 35 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
>> index d11a5578e4f8..37cb4ad1ea29 100644
>> --- a/net/ipv6/ip6_fib.c
>> +++ b/net/ipv6/ip6_fib.c
>> @@ -640,6 +640,11 @@ static struct fib6_node *fib6_add_1(struct net *net,
>> if (!(fn->fn_flags & RTN_RTINFO)) {
>> RCU_INIT_POINTER(fn->leaf, NULL);
>> rt6_release(leaf);
>> + /* remove null_entry in the root node */
>> + } else if (fn->fn_flags & RTN_TL_ROOT &&
>> + rcu_access_pointer(fn->leaf) ==
>> + net->ipv6.ip6_null_entry) {
>> + RCU_INIT_POINTER(fn->leaf, NULL);
> It seems the reader side could see tb6_root.leaf == NULL after
> this change and I think it should be fine? If it is, instead
> of switching betwen NULL and ip6_null_entry, would it be simpler
> to always set tb6_root.leaf to NULL until a legit default route
> is added?
>
Agree that reader side could see tb6_root.leaf == NULL and it should
be fine cause the lookup code path takes care of NULL case.
If we init tb6_root to NULL, it does seem simpler. The only place we
need to take care is that fib6_del() should not try to expunge fn if
is root node if it does not contain any route. But does it seem weird
that tb6_root.fn_flags is marked with RTN_RTINFO without containing
any route?
>> }
>>
>> return fn;
>> @@ -1270,14 +1275,27 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
>> return err;
>>
>> failure:
>> - /* fn->leaf could be NULL if fn is an intermediate node and we
>> - * failed to add the new route to it in both subtree creation
>> - * failure and fib6_add_rt2node() failure case.
>> - * In both cases, fib6_repair_tree() should be called to fix
>> + /* fn->leaf could be NULL if:
>> + * 1. fn is the root node in the table and we fail to add the default
>> + * route to it.
>> + * In this case, we put fn->leaf back to net->ipv6.ip6_null_entry as
>> + * the way the table was created.
>> + * 2. fn is an intermediate node and we failed to add the new
>> + * route to it in both subtree creation failure and fib6_add_rt2node()
>> + * failure case.
>> + * In this case, fib6_repair_tree() should be called to fix
>> * fn->leaf.
>> */
>> - if (fn && !(fn->fn_flags & (RTN_RTINFO|RTN_ROOT)))
>> - fib6_repair_tree(info->nl_net, table, fn);
>> + if (fn) {
>> + if (fn->fn_flags & RTN_TL_ROOT) {
>> + if (!rcu_access_pointer(fn->leaf)) {
>> + rcu_assign_pointer(fn->leaf,
>> + info->nl_net->ipv6.ip6_null_entry);
>> + }
>> + } else if (!(fn->fn_flags & (RTN_RTINFO|RTN_ROOT))) {
>> + fib6_repair_tree(info->nl_net, table, fn);
>> + }
>> + }
>> /* Always release dst as dst->__refcnt is guaranteed
>> * to be taken before entering this function
>> */
>> @@ -1685,11 +1703,18 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
>> }
>> read_unlock(&net->ipv6.fib6_walker_lock);
>>
>> - /* If it was last route, expunge its radix tree node */
>> + /* If it was last route:
>> + * 1. For root node, put back null_entry as how the table was created.
>> + * 2. For other nodes, expunge its radix tree node.
>> + */
>> if (!rcu_access_pointer(fn->leaf)) {
>> - fn->fn_flags &= ~RTN_RTINFO;
>> - net->ipv6.rt6_stats->fib_route_nodes--;
>> - fn = fib6_repair_tree(net, table, fn);
>> + if (fn->fn_flags & RTN_TL_ROOT) {
>> + rcu_assign_pointer(fn->leaf, net->ipv6.ip6_null_entry);
>> + } else {
>> + fn->fn_flags &= ~RTN_RTINFO;
>> + net->ipv6.rt6_stats->fib_route_nodes--;
>> + fn = fib6_repair_tree(net, table, fn);
>> + }
>> }
>>
>> fib6_purge_rt(rt, fn, net);
>> --
>> 2.16.0.rc0.223.g4a4ac83678-goog
>>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] ipv6: remove null_entry before adding default route
2018-01-07 1:41 ` Wei Wang
@ 2018-01-07 6:16 ` Martin KaFai Lau
2018-01-07 17:57 ` Wei Wang
0 siblings, 1 reply; 5+ messages in thread
From: Martin KaFai Lau @ 2018-01-07 6:16 UTC (permalink / raw)
To: Wei Wang; +Cc: David Miller, Linux Kernel Network Developers, Eric Dumazet
On Sat, Jan 06, 2018 at 05:41:28PM -0800, Wei Wang wrote:
> On Fri, Jan 5, 2018 at 11:42 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> > On Fri, Jan 05, 2018 at 05:38:35PM -0800, Wei Wang wrote:
> >> From: Wei Wang <weiwan@google.com>
> >>
> >> In the current code, when creating a new fib6 table, tb6_root.leaf gets
> >> initialized to net->ipv6.ip6_null_entry.
> >> If a default route is being added with rt->rt6i_metric = 0xffffffff,
> >> fib6_add() will add this route after net->ipv6.ip6_null_entry. As
> >> null_entry is shared, it could cause problem.
> >>
> >> In order to fix it, set fn->leaf to NULL before calling
> >> fib6_add_rt2node() when trying to add the first default route.
> >> And reset fn->leaf to null_entry when adding fails or when deleting the
> >> last default route.
> >>
> >> syzkaller reported the following issue which is fixed by this commit:
> >> =============================
> >> WARNING: suspicious RCU usage
> >> 4.15.0-rc5+ #171 Not tainted
> >> -----------------------------
> >> net/ipv6/ip6_fib.c:1702 suspicious rcu_dereference_protected() usage!
> >>
> >> other info that might help us debug this:
> >>
> >> rcu_scheduler_active = 2, debug_locks = 1
> >> 4 locks held by swapper/0/0:
> >> #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
> >> #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1310
> >> #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
> >> #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] fib6_run_gc+0x9d/0x3c0 net/ipv6/ip6_fib.c:2007
> >> #2: (rcu_read_lock){....}, at: [<0000000091db762d>] __fib6_clean_all+0x0/0x3a0 net/ipv6/ip6_fib.c:1560
> >> #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] spin_lock_bh include/linux/spinlock.h:315 [inline]
> >> #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] __fib6_clean_all+0x1d0/0x3a0 net/ipv6/ip6_fib.c:1948
> >>
> >> stack backtrace:
> >> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #171
> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >> Call Trace:
> >> <IRQ>
> >> __dump_stack lib/dump_stack.c:17 [inline]
> >> dump_stack+0x194/0x257 lib/dump_stack.c:53
> >> lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
> >> fib6_del+0xcaa/0x11b0 net/ipv6/ip6_fib.c:1701
> >> fib6_clean_node+0x3aa/0x4f0 net/ipv6/ip6_fib.c:1892
> >> fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1815
> >> fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1863
> >> fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1933
> >> __fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1949
> >> fib6_clean_all net/ipv6/ip6_fib.c:1960 [inline]
> >> fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2016
> >> fib6_gc_timer_cb+0x20/0x30 net/ipv6/ip6_fib.c:2033
> >> call_timer_fn+0x228/0x820 kernel/time/timer.c:1320
> >> expire_timers kernel/time/timer.c:1357 [inline]
> >> __run_timers+0x7ee/0xb70 kernel/time/timer.c:1660
> >> run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
> >> __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
> >> invoke_softirq kernel/softirq.c:365 [inline]
> >> irq_exit+0x1cc/0x200 kernel/softirq.c:405
> >> exiting_irq arch/x86/include/asm/apic.h:540 [inline]
> >> smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
> >> apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904
> >> </IRQ>
> >>
> >> Reported-by: syzbot <syzkaller@googlegroups.com>
> >> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
> >> Signed-off-by: Wei Wang <weiwan@google.com>
> >> ---
> >> net/ipv6/ip6_fib.c | 45 +++++++++++++++++++++++++++++++++++----------
> >> 1 file changed, 35 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> >> index d11a5578e4f8..37cb4ad1ea29 100644
> >> --- a/net/ipv6/ip6_fib.c
> >> +++ b/net/ipv6/ip6_fib.c
> >> @@ -640,6 +640,11 @@ static struct fib6_node *fib6_add_1(struct net *net,
> >> if (!(fn->fn_flags & RTN_RTINFO)) {
> >> RCU_INIT_POINTER(fn->leaf, NULL);
> >> rt6_release(leaf);
> >> + /* remove null_entry in the root node */
> >> + } else if (fn->fn_flags & RTN_TL_ROOT &&
> >> + rcu_access_pointer(fn->leaf) ==
> >> + net->ipv6.ip6_null_entry) {
> >> + RCU_INIT_POINTER(fn->leaf, NULL);
> > It seems the reader side could see tb6_root.leaf == NULL after
> > this change and I think it should be fine? If it is, instead
> > of switching betwen NULL and ip6_null_entry, would it be simpler
> > to always set tb6_root.leaf to NULL until a legit default route
> > is added?
> >
>
> Agree that reader side could see tb6_root.leaf == NULL and it should
> be fine cause the lookup code path takes care of NULL case.
> If we init tb6_root to NULL, it does seem simpler. The only place we
> need to take care is that fib6_del() should not try to expunge fn if
> is root node if it does not contain any route. But does it seem weird
> that tb6_root.fn_flags is marked with RTN_RTINFO without containing
> any route?
Good point on the writer's side. I think we also need to modify
fib6_add_1() to expect fn->leaf == NULL. Also, agree that
RTN_RTINFO needs to be set/clear accordingly. We also need to
make changes to the CONFIG_IPV6_SUBTREES......
It should be doable but requires more changes to break the writer's side
assumption. net-next may be a better fit for those changes.
For this patch, I have one inline comment below:
>
> >> }
> >>
> >> return fn;
> >> @@ -1270,14 +1275,27 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
> >> return err;
> >>
> >> failure:
> >> - /* fn->leaf could be NULL if fn is an intermediate node and we
> >> - * failed to add the new route to it in both subtree creation
> >> - * failure and fib6_add_rt2node() failure case.
> >> - * In both cases, fib6_repair_tree() should be called to fix
> >> + /* fn->leaf could be NULL if:
> >> + * 1. fn is the root node in the table and we fail to add the default
> >> + * route to it.
> >> + * In this case, we put fn->leaf back to net->ipv6.ip6_null_entry as
> >> + * the way the table was created.
> >> + * 2. fn is an intermediate node and we failed to add the new
> >> + * route to it in both subtree creation failure and fib6_add_rt2node()
> >> + * failure case.
> >> + * In this case, fib6_repair_tree() should be called to fix
> >> * fn->leaf.
> >> */
> >> - if (fn && !(fn->fn_flags & (RTN_RTINFO|RTN_ROOT)))
> >> - fib6_repair_tree(info->nl_net, table, fn);
> >> + if (fn) {
> >> + if (fn->fn_flags & RTN_TL_ROOT) {
> >> + if (!rcu_access_pointer(fn->leaf)) {
> >> + rcu_assign_pointer(fn->leaf,
> >> + info->nl_net->ipv6.ip6_null_entry);
> >> + }
> >> + } else if (!(fn->fn_flags & (RTN_RTINFO|RTN_ROOT))) {
> >> + fib6_repair_tree(info->nl_net, table, fn);
fib6_repair_tree() is responsible to ensure intermediate fn->leaf is
not NULL (or free the fn if possible). In this case, we want tb6_root.leaf
to be not NULL. Can we consolidate the tb6_root.leaf fixing codes to
fib6_repair_tree() also such that all tree repairing logic is
in one place?
> >> + }
> >> + }
> >> /* Always release dst as dst->__refcnt is guaranteed
> >> * to be taken before entering this function
> >> */
> >> @@ -1685,11 +1703,18 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
> >> }
> >> read_unlock(&net->ipv6.fib6_walker_lock);
> >>
> >> - /* If it was last route, expunge its radix tree node */
> >> + /* If it was last route:
> >> + * 1. For root node, put back null_entry as how the table was created.
> >> + * 2. For other nodes, expunge its radix tree node.
> >> + */
> >> if (!rcu_access_pointer(fn->leaf)) {
> >> - fn->fn_flags &= ~RTN_RTINFO;
> >> - net->ipv6.rt6_stats->fib_route_nodes--;
> >> - fn = fib6_repair_tree(net, table, fn);
> >> + if (fn->fn_flags & RTN_TL_ROOT) {
> >> + rcu_assign_pointer(fn->leaf, net->ipv6.ip6_null_entry);
> >> + } else {
> >> + fn->fn_flags &= ~RTN_RTINFO;
> >> + net->ipv6.rt6_stats->fib_route_nodes--;
> >> + fn = fib6_repair_tree(net, table, fn);
> >> + }
> >> }
> >>
> >> fib6_purge_rt(rt, fn, net);
> >> --
> >> 2.16.0.rc0.223.g4a4ac83678-goog
> >>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] ipv6: remove null_entry before adding default route
2018-01-07 6:16 ` Martin KaFai Lau
@ 2018-01-07 17:57 ` Wei Wang
0 siblings, 0 replies; 5+ messages in thread
From: Wei Wang @ 2018-01-07 17:57 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: David Miller, Linux Kernel Network Developers, Eric Dumazet
On Sat, Jan 6, 2018 at 10:16 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> On Sat, Jan 06, 2018 at 05:41:28PM -0800, Wei Wang wrote:
>> On Fri, Jan 5, 2018 at 11:42 PM, Martin KaFai Lau <kafai@fb.com> wrote:
>> > On Fri, Jan 05, 2018 at 05:38:35PM -0800, Wei Wang wrote:
>> >> From: Wei Wang <weiwan@google.com>
>> >>
>> >> In the current code, when creating a new fib6 table, tb6_root.leaf gets
>> >> initialized to net->ipv6.ip6_null_entry.
>> >> If a default route is being added with rt->rt6i_metric = 0xffffffff,
>> >> fib6_add() will add this route after net->ipv6.ip6_null_entry. As
>> >> null_entry is shared, it could cause problem.
>> >>
>> >> In order to fix it, set fn->leaf to NULL before calling
>> >> fib6_add_rt2node() when trying to add the first default route.
>> >> And reset fn->leaf to null_entry when adding fails or when deleting the
>> >> last default route.
>> >>
>> >> syzkaller reported the following issue which is fixed by this commit:
>> >> =============================
>> >> WARNING: suspicious RCU usage
>> >> 4.15.0-rc5+ #171 Not tainted
>> >> -----------------------------
>> >> net/ipv6/ip6_fib.c:1702 suspicious rcu_dereference_protected() usage!
>> >>
>> >> other info that might help us debug this:
>> >>
>> >> rcu_scheduler_active = 2, debug_locks = 1
>> >> 4 locks held by swapper/0/0:
>> >> #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
>> >> #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1310
>> >> #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
>> >> #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] fib6_run_gc+0x9d/0x3c0 net/ipv6/ip6_fib.c:2007
>> >> #2: (rcu_read_lock){....}, at: [<0000000091db762d>] __fib6_clean_all+0x0/0x3a0 net/ipv6/ip6_fib.c:1560
>> >> #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] spin_lock_bh include/linux/spinlock.h:315 [inline]
>> >> #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] __fib6_clean_all+0x1d0/0x3a0 net/ipv6/ip6_fib.c:1948
>> >>
>> >> stack backtrace:
>> >> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #171
>> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> >> Call Trace:
>> >> <IRQ>
>> >> __dump_stack lib/dump_stack.c:17 [inline]
>> >> dump_stack+0x194/0x257 lib/dump_stack.c:53
>> >> lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
>> >> fib6_del+0xcaa/0x11b0 net/ipv6/ip6_fib.c:1701
>> >> fib6_clean_node+0x3aa/0x4f0 net/ipv6/ip6_fib.c:1892
>> >> fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1815
>> >> fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1863
>> >> fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1933
>> >> __fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1949
>> >> fib6_clean_all net/ipv6/ip6_fib.c:1960 [inline]
>> >> fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2016
>> >> fib6_gc_timer_cb+0x20/0x30 net/ipv6/ip6_fib.c:2033
>> >> call_timer_fn+0x228/0x820 kernel/time/timer.c:1320
>> >> expire_timers kernel/time/timer.c:1357 [inline]
>> >> __run_timers+0x7ee/0xb70 kernel/time/timer.c:1660
>> >> run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
>> >> __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
>> >> invoke_softirq kernel/softirq.c:365 [inline]
>> >> irq_exit+0x1cc/0x200 kernel/softirq.c:405
>> >> exiting_irq arch/x86/include/asm/apic.h:540 [inline]
>> >> smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
>> >> apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904
>> >> </IRQ>
>> >>
>> >> Reported-by: syzbot <syzkaller@googlegroups.com>
>> >> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
>> >> Signed-off-by: Wei Wang <weiwan@google.com>
>> >> ---
>> >> net/ipv6/ip6_fib.c | 45 +++++++++++++++++++++++++++++++++++----------
>> >> 1 file changed, 35 insertions(+), 10 deletions(-)
>> >>
>> >> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
>> >> index d11a5578e4f8..37cb4ad1ea29 100644
>> >> --- a/net/ipv6/ip6_fib.c
>> >> +++ b/net/ipv6/ip6_fib.c
>> >> @@ -640,6 +640,11 @@ static struct fib6_node *fib6_add_1(struct net *net,
>> >> if (!(fn->fn_flags & RTN_RTINFO)) {
>> >> RCU_INIT_POINTER(fn->leaf, NULL);
>> >> rt6_release(leaf);
>> >> + /* remove null_entry in the root node */
>> >> + } else if (fn->fn_flags & RTN_TL_ROOT &&
>> >> + rcu_access_pointer(fn->leaf) ==
>> >> + net->ipv6.ip6_null_entry) {
>> >> + RCU_INIT_POINTER(fn->leaf, NULL);
>> > It seems the reader side could see tb6_root.leaf == NULL after
>> > this change and I think it should be fine? If it is, instead
>> > of switching betwen NULL and ip6_null_entry, would it be simpler
>> > to always set tb6_root.leaf to NULL until a legit default route
>> > is added?
>> >
>>
>> Agree that reader side could see tb6_root.leaf == NULL and it should
>> be fine cause the lookup code path takes care of NULL case.
>> If we init tb6_root to NULL, it does seem simpler. The only place we
>> need to take care is that fib6_del() should not try to expunge fn if
>> is root node if it does not contain any route. But does it seem weird
>> that tb6_root.fn_flags is marked with RTN_RTINFO without containing
>> any route?
> Good point on the writer's side. I think we also need to modify
> fib6_add_1() to expect fn->leaf == NULL. Also, agree that
> RTN_RTINFO needs to be set/clear accordingly. We also need to
> make changes to the CONFIG_IPV6_SUBTREES......
> It should be doable but requires more changes to break the writer's side
> assumption. net-next may be a better fit for those changes.
>
Understood. Yes. Agree. I will give it a try on net-next.
> For this patch, I have one inline comment below:
>
>>
>> >> }
>> >>
>> >> return fn;
>> >> @@ -1270,14 +1275,27 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
>> >> return err;
>> >>
>> >> failure:
>> >> - /* fn->leaf could be NULL if fn is an intermediate node and we
>> >> - * failed to add the new route to it in both subtree creation
>> >> - * failure and fib6_add_rt2node() failure case.
>> >> - * In both cases, fib6_repair_tree() should be called to fix
>> >> + /* fn->leaf could be NULL if:
>> >> + * 1. fn is the root node in the table and we fail to add the default
>> >> + * route to it.
>> >> + * In this case, we put fn->leaf back to net->ipv6.ip6_null_entry as
>> >> + * the way the table was created.
>> >> + * 2. fn is an intermediate node and we failed to add the new
>> >> + * route to it in both subtree creation failure and fib6_add_rt2node()
>> >> + * failure case.
>> >> + * In this case, fib6_repair_tree() should be called to fix
>> >> * fn->leaf.
>> >> */
>> >> - if (fn && !(fn->fn_flags & (RTN_RTINFO|RTN_ROOT)))
>> >> - fib6_repair_tree(info->nl_net, table, fn);
>> >> + if (fn) {
>> >> + if (fn->fn_flags & RTN_TL_ROOT) {
>> >> + if (!rcu_access_pointer(fn->leaf)) {
>> >> + rcu_assign_pointer(fn->leaf,
>> >> + info->nl_net->ipv6.ip6_null_entry);
>> >> + }
>> >> + } else if (!(fn->fn_flags & (RTN_RTINFO|RTN_ROOT))) {
>> >> + fib6_repair_tree(info->nl_net, table, fn);
> fib6_repair_tree() is responsible to ensure intermediate fn->leaf is
> not NULL (or free the fn if possible). In this case, we want tb6_root.leaf
> to be not NULL. Can we consolidate the tb6_root.leaf fixing codes to
> fib6_repair_tree() also such that all tree repairing logic is
> in one place?
>
Yes. Good idea. I will send out v2 for this.
>> >> + }
>> >> + }
>> >> /* Always release dst as dst->__refcnt is guaranteed
>> >> * to be taken before entering this function
>> >> */
>> >> @@ -1685,11 +1703,18 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
>> >> }
>> >> read_unlock(&net->ipv6.fib6_walker_lock);
>> >>
>> >> - /* If it was last route, expunge its radix tree node */
>> >> + /* If it was last route:
>> >> + * 1. For root node, put back null_entry as how the table was created.
>> >> + * 2. For other nodes, expunge its radix tree node.
>> >> + */
>> >> if (!rcu_access_pointer(fn->leaf)) {
>> >> - fn->fn_flags &= ~RTN_RTINFO;
>> >> - net->ipv6.rt6_stats->fib_route_nodes--;
>> >> - fn = fib6_repair_tree(net, table, fn);
>> >> + if (fn->fn_flags & RTN_TL_ROOT) {
>> >> + rcu_assign_pointer(fn->leaf, net->ipv6.ip6_null_entry);
>> >> + } else {
>> >> + fn->fn_flags &= ~RTN_RTINFO;
>> >> + net->ipv6.rt6_stats->fib_route_nodes--;
>> >> + fn = fib6_repair_tree(net, table, fn);
>> >> + }
>> >> }
>> >>
>> >> fib6_purge_rt(rt, fn, net);
>> >> --
>> >> 2.16.0.rc0.223.g4a4ac83678-goog
>> >>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2018-01-07 17:57 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-06 1:38 [PATCH net] ipv6: remove null_entry before adding default route Wei Wang
2018-01-06 7:42 ` Martin KaFai Lau
2018-01-07 1:41 ` Wei Wang
2018-01-07 6:16 ` Martin KaFai Lau
2018-01-07 17:57 ` Wei Wang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.