* [PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context
@ 2017-10-08 15:18 Ido Schimmel
2017-10-08 16:03 ` Eric Dumazet
0 siblings, 1 reply; 7+ messages in thread
From: Ido Schimmel @ 2017-10-08 15:18 UTC (permalink / raw)
To: netdev; +Cc: davem, weiwan, mlxsw, Ido Schimmel
Without the rwlock and with PREEMPT_RCU we're no longer guaranteed to be
in non-preemptible context when performing a route lookup, so use
raw_cpu_ptr() instead.
Takes care of the following splat:
[ 122.221814] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/2672
[ 122.221845] caller is debug_smp_processor_id+0x17/0x20
[ 122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 4.14.0-rc3-idosch-next-custom #639
[ 122.221880] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[ 122.221893] Call Trace:
[ 122.221919] dump_stack+0xb1/0x10c
[ 122.221946] ? _atomic_dec_and_lock+0x124/0x124
[ 122.221974] ? ___ratelimit+0xfe/0x240
[ 122.222020] check_preemption_disabled+0x173/0x1b0
[ 122.222060] debug_smp_processor_id+0x17/0x20
[ 122.222083] ip6_pol_route+0x1482/0x24a0
...
Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
net/ipv6/route.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 399d1bceec4a..579d4b73beb1 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1112,7 +1112,7 @@ static struct rt6_info *rt6_get_pcpu_route(struct rt6_info *rt)
{
struct rt6_info *pcpu_rt, **p;
- p = this_cpu_ptr(rt->rt6i_pcpu);
+ p = raw_cpu_ptr(rt->rt6i_pcpu);
pcpu_rt = *p;
if (pcpu_rt && ip6_hold_safe(NULL, &pcpu_rt, false))
@@ -1134,7 +1134,7 @@ static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt)
}
dst_hold(&pcpu_rt->dst);
- p = this_cpu_ptr(rt->rt6i_pcpu);
+ p = raw_cpu_ptr(rt->rt6i_pcpu);
prev = cmpxchg(p, NULL, pcpu_rt);
if (prev) {
/* If someone did it before us, return prev instead */
--
2.13.6
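
The hazard described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_this_cpu_ptr()`, `model_raw_cpu_ptr()`, `model_migrate()`, `current_cpu`, `preempt_count` and `debug_warnings` are hypothetical stand-ins invented here, not kernel APIs. The model assumes that the checked accessor complains when preemption is enabled (as the splat shows for a debug build), that the raw accessor skips the check, and that a task can only change CPU while it is preemptible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical stand-ins for kernel state; none of this is kernel API. */
static int current_cpu;     /* CPU the modelled task is running on */
static int preempt_count;   /* > 0 means preemption is disabled */
static int debug_warnings;  /* number of "preemptible context" complaints */

struct route { int id; };

/* One cached route pointer per CPU, like rt->rt6i_pcpu in the patch. */
static struct route *pcpu_slot[NR_CPUS];

/* Models raw_cpu_ptr(): returns this CPU's slot with no context check. */
static struct route **model_raw_cpu_ptr(void)
{
	return &pcpu_slot[current_cpu];
}

/* Models this_cpu_ptr() on a debug build: same slot, but it records a
 * complaint when called while the task could still be preempted. */
static struct route **model_this_cpu_ptr(void)
{
	if (preempt_count == 0)
		debug_warnings++;
	return &pcpu_slot[current_cpu];
}

/* Models the scheduler moving the task; refused while preemption is off. */
static bool model_migrate(int cpu)
{
	if (preempt_count > 0)
		return false;
	current_cpu = cpu;
	return true;
}
```

In this model, a pointer obtained while preemptible can go stale: after `model_migrate(1)` it still refers to CPU 0's slot. Switching to the raw accessor only silences the check; it does not pin the task to a CPU, which is what the reply below addresses by disabling BH.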
* Re: [PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context
2017-10-08 15:18 [PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context Ido Schimmel
@ 2017-10-08 16:03 ` Eric Dumazet
2017-10-08 16:54 ` Ido Schimmel
2017-10-09 4:07 ` [PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route() Eric Dumazet
0 siblings, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2017-10-08 16:03 UTC (permalink / raw)
To: Ido Schimmel; +Cc: netdev, davem, weiwan, mlxsw
On Sun, 2017-10-08 at 18:18 +0300, Ido Schimmel wrote:
> Without the rwlock and with PREEMPT_RCU we're no longer guaranteed to be
> in non-preemptible context when performing a route lookup, so use
> raw_cpu_ptr() instead.
>
> Takes care of the following splat:
> [ 122.221814] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/2672
> [ 122.221845] caller is debug_smp_processor_id+0x17/0x20
> [ 122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 4.14.0-rc3-idosch-next-custom #639
> [ 122.221880] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
> [ 122.221893] Call Trace:
> [ 122.221919] dump_stack+0xb1/0x10c
> [ 122.221946] ? _atomic_dec_and_lock+0x124/0x124
> [ 122.221974] ? ___ratelimit+0xfe/0x240
> [ 122.222020] check_preemption_disabled+0x173/0x1b0
> [ 122.222060] debug_smp_processor_id+0x17/0x20
> [ 122.222083] ip6_pol_route+0x1482/0x24a0
> ...
>
> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> ---
Thanks Ido for this patch.
IMO, we no longer need to play this read_lock() -> write_lock() game,
which existed because ip6_dst_gc() could be called from
rt6_make_pcpu_route().
So we might simplify things quite a bit by blocking BH (and thus
preventing preemption).
Something like:
net/ipv6/route.c | 26 ++++++--------------------
1 file changed, 6 insertions(+), 20 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 399d1bceec4a6e6736c367e706dd2acbd4093d58..606e80325b21c0e10a02e9c7d5b3fcfbfc26a003 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1136,15 +1136,7 @@ static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt)
dst_hold(&pcpu_rt->dst);
p = this_cpu_ptr(rt->rt6i_pcpu);
prev = cmpxchg(p, NULL, pcpu_rt);
- if (prev) {
- /* If someone did it before us, return prev instead */
- /* release refcnt taken by ip6_rt_pcpu_alloc() */
- dst_release_immediate(&pcpu_rt->dst);
- /* release refcnt taken by above dst_hold() */
- dst_release_immediate(&pcpu_rt->dst);
- dst_hold(&prev->dst);
- pcpu_rt = prev;
- }
+ BUG_ON(prev);
rt6_dst_from_metrics_check(pcpu_rt);
return pcpu_rt;
@@ -1739,31 +1731,25 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
struct rt6_info *pcpu_rt;
dst_use_noref(&rt->dst, jiffies);
+ local_bh_disable();
pcpu_rt = rt6_get_pcpu_route(rt);
- if (pcpu_rt) {
- rcu_read_unlock();
- } else {
+ if (!pcpu_rt) {
/* atomic_inc_not_zero() is needed when using rcu */
if (atomic_inc_not_zero(&rt->rt6i_ref)) {
- /* We have to do the read_unlock first
- * because rt6_make_pcpu_route() may trigger
- * ip6_dst_gc() which will take the write_lock.
- *
- * No dst_hold() on rt is needed because grabbing
+ /* No dst_hold() on rt is needed because grabbing
* rt->rt6i_ref makes sure rt can't be released.
*/
- rcu_read_unlock();
pcpu_rt = rt6_make_pcpu_route(rt);
rt6_release(rt);
} else {
/* rt is already removed from tree */
- rcu_read_unlock();
pcpu_rt = net->ipv6.ip6_null_entry;
dst_hold(&pcpu_rt->dst);
}
}
-
+ local_bh_enable();
+ rcu_read_unlock();
trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6);
return pcpu_rt;
}
* Re: [PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context
2017-10-08 16:03 ` Eric Dumazet
@ 2017-10-08 16:54 ` Ido Schimmel
2017-10-08 18:25 ` Eric Dumazet
2017-10-09 4:07 ` [PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route() Eric Dumazet
1 sibling, 1 reply; 7+ messages in thread
From: Ido Schimmel @ 2017-10-08 16:54 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Ido Schimmel, netdev, davem, weiwan, mlxsw
Hi Eric,
On Sun, Oct 08, 2017 at 09:03:53AM -0700, Eric Dumazet wrote:
> Thanks Ido for this patch.
>
> IMO, we no longer need to play this read_lock() -> write_lock() game,
> which existed because ip6_dst_gc() could be called from
> rt6_make_pcpu_route().
Right, because we can no longer deadlock as we could with the rwlock.
>
> So we might simplify things quite a bit by blocking BH (and thus
> preventing preemption).
>
> Something like:
>
> net/ipv6/route.c | 26 ++++++--------------------
> 1 file changed, 6 insertions(+), 20 deletions(-)
>
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 399d1bceec4a6e6736c367e706dd2acbd4093d58..606e80325b21c0e10a02e9c7d5b3fcfbfc26a003 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -1136,15 +1136,7 @@ static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt)
> dst_hold(&pcpu_rt->dst);
> p = this_cpu_ptr(rt->rt6i_pcpu);
> prev = cmpxchg(p, NULL, pcpu_rt);
> - if (prev) {
> - /* If someone did it before us, return prev instead */
> - /* release refcnt taken by ip6_rt_pcpu_alloc() */
> - dst_release_immediate(&pcpu_rt->dst);
> - /* release refcnt taken by above dst_hold() */
> - dst_release_immediate(&pcpu_rt->dst);
> - dst_hold(&prev->dst);
> - pcpu_rt = prev;
> - }
> + BUG_ON(prev);
Is this BUG_ON() now valid because of the local_bh_disable() in
ip6_pol_route()?
>
> rt6_dst_from_metrics_check(pcpu_rt);
> return pcpu_rt;
> @@ -1739,31 +1731,25 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
> struct rt6_info *pcpu_rt;
>
> dst_use_noref(&rt->dst, jiffies);
> + local_bh_disable();
> pcpu_rt = rt6_get_pcpu_route(rt);
>
> - if (pcpu_rt) {
> - rcu_read_unlock();
> - } else {
> + if (!pcpu_rt) {
> /* atomic_inc_not_zero() is needed when using rcu */
> if (atomic_inc_not_zero(&rt->rt6i_ref)) {
> - /* We have to do the read_unlock first
> - * because rt6_make_pcpu_route() may trigger
> - * ip6_dst_gc() which will take the write_lock.
> - *
> - * No dst_hold() on rt is needed because grabbing
> + /* No dst_hold() on rt is needed because grabbing
> * rt->rt6i_ref makes sure rt can't be released.
> */
> - rcu_read_unlock();
> pcpu_rt = rt6_make_pcpu_route(rt);
> rt6_release(rt);
> } else {
> /* rt is already removed from tree */
> - rcu_read_unlock();
> pcpu_rt = net->ipv6.ip6_null_entry;
> dst_hold(&pcpu_rt->dst);
> }
> }
> -
> + local_bh_enable();
> + rcu_read_unlock();
> trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6);
> return pcpu_rt;
> }
I replaced my patch with yours and I don't trigger the bug anymore. Feel
free to add my tag:
Tested-by: Ido Schimmel <idosch@mellanox.com>
Thanks!
* Re: [PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context
2017-10-08 16:54 ` Ido Schimmel
@ 2017-10-08 18:25 ` Eric Dumazet
0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2017-10-08 18:25 UTC (permalink / raw)
To: Ido Schimmel; +Cc: Ido Schimmel, netdev, davem, weiwan, mlxsw
On Sun, 2017-10-08 at 19:54 +0300, Ido Schimmel wrote:
> Hi Eric,
> > prev = cmpxchg(p, NULL, pcpu_rt);
> > - if (prev) {
> > - /* If someone did it before us, return prev instead */
> > - /* release refcnt taken by ip6_rt_pcpu_alloc() */
> > - dst_release_immediate(&pcpu_rt->dst);
> > - /* release refcnt taken by above dst_hold() */
> > - dst_release_immediate(&pcpu_rt->dst);
> > - dst_hold(&prev->dst);
> > - pcpu_rt = prev;
> > - }
> > + BUG_ON(prev);
>
> Is this BUG_ON() now valid because of the local_bh_disable() in
> ip6_pol_route()?
Yes. For this BUG_ON() to trigger, this code would have to be re-entered
from a hard IRQ, and that would be wrong of course.
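
The difference between the removed fallback and the new BUG_ON() can be sketched in userspace with C11 atomics. This is a minimal model, not the kernel code: `install_or_reuse()`, `install_exclusive()` and the plain `refcnt` field are hypothetical names used only for illustration, `atomic_compare_exchange_strong()` stands in for the kernel's cmpxchg(), and assert() stands in for BUG_ON(). The reference counting is deliberately simplified compared with the dst_hold()/dst_release_immediate() calls in the diff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified route object; refcnt is a plain int here. */
struct route { int refcnt; };

/* Old behaviour: tolerate losing the race. If the slot is already
 * populated, drop our candidate and return the existing entry. */
static struct route *install_or_reuse(_Atomic(struct route *) *slot,
				      struct route *mine)
{
	struct route *expected = NULL;

	if (atomic_compare_exchange_strong(slot, &expected, mine))
		return mine;		/* slot was empty, ours is installed */

	/* On failure, expected holds the entry that was already there. */
	mine->refcnt = 0;		/* release our unused candidate */
	expected->refcnt++;		/* take a reference on the winner */
	return expected;
}

/* New behaviour: the caller guarantees nobody else can populate this
 * slot in the meantime, so finding it non-empty is treated as a bug. */
static struct route *install_exclusive(_Atomic(struct route *) *slot,
				       struct route *mine)
{
	struct route *expected = NULL;
	bool ok = atomic_compare_exchange_strong(slot, &expected, mine);

	assert(ok);			/* models BUG_ON(prev) */
	(void)ok;
	return mine;
}
```

The exclusive variant is only safe when the lookup that found the slot empty and the install that fills it cannot be interleaved with another writer of the same per-CPU slot. Per the reply above, disabling BH around both steps is what provides that guarantee, leaving only re-entry from hard IRQ, which is not expected on this path.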
* [PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route()
2017-10-08 16:03 ` Eric Dumazet
2017-10-08 16:54 ` Ido Schimmel
@ 2017-10-09 4:07 ` Eric Dumazet
2017-10-09 4:09 ` David Miller
2017-10-09 17:06 ` Martin KaFai Lau
1 sibling, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2017-10-09 4:07 UTC (permalink / raw)
To: Ido Schimmel; +Cc: netdev, davem, weiwan, mlxsw
From: Eric Dumazet <edumazet@google.com>
Ido reported the following splat and provided a patch.
[ 122.221814] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/2672
[ 122.221845] caller is debug_smp_processor_id+0x17/0x20
[ 122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 4.14.0-rc3-idosch-next-custom #639
[ 122.221880] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[ 122.221893] Call Trace:
[ 122.221919] dump_stack+0xb1/0x10c
[ 122.221946] ? _atomic_dec_and_lock+0x124/0x124
[ 122.221974] ? ___ratelimit+0xfe/0x240
[ 122.222020] check_preemption_disabled+0x173/0x1b0
[ 122.222060] debug_smp_processor_id+0x17/0x20
[ 122.222083] ip6_pol_route+0x1482/0x24a0
...
I believe we can simplify this code path a bit, since we no longer
hold a read_lock that must be released to avoid a deadlock.
By disabling BH, we prevent code re-entry and make sure that
rt6_get_pcpu_route() and rt6_make_pcpu_route() run on the same CPU.
Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
---
net/ipv6/route.c | 26 ++++++--------------------
1 file changed, 6 insertions(+), 20 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 399d1bceec4a6e6736c367e706dd2acbd4093d58..606e80325b21c0e10a02e9c7d5b3fcfbfc26a003 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1136,15 +1136,7 @@ static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt)
dst_hold(&pcpu_rt->dst);
p = this_cpu_ptr(rt->rt6i_pcpu);
prev = cmpxchg(p, NULL, pcpu_rt);
- if (prev) {
- /* If someone did it before us, return prev instead */
- /* release refcnt taken by ip6_rt_pcpu_alloc() */
- dst_release_immediate(&pcpu_rt->dst);
- /* release refcnt taken by above dst_hold() */
- dst_release_immediate(&pcpu_rt->dst);
- dst_hold(&prev->dst);
- pcpu_rt = prev;
- }
+ BUG_ON(prev);
rt6_dst_from_metrics_check(pcpu_rt);
return pcpu_rt;
@@ -1739,31 +1731,25 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
struct rt6_info *pcpu_rt;
dst_use_noref(&rt->dst, jiffies);
+ local_bh_disable();
pcpu_rt = rt6_get_pcpu_route(rt);
- if (pcpu_rt) {
- rcu_read_unlock();
- } else {
+ if (!pcpu_rt) {
/* atomic_inc_not_zero() is needed when using rcu */
if (atomic_inc_not_zero(&rt->rt6i_ref)) {
- /* We have to do the read_unlock first
- * because rt6_make_pcpu_route() may trigger
- * ip6_dst_gc() which will take the write_lock.
- *
- * No dst_hold() on rt is needed because grabbing
+ /* No dst_hold() on rt is needed because grabbing
* rt->rt6i_ref makes sure rt can't be released.
*/
- rcu_read_unlock();
pcpu_rt = rt6_make_pcpu_route(rt);
rt6_release(rt);
} else {
/* rt is already removed from tree */
- rcu_read_unlock();
pcpu_rt = net->ipv6.ip6_null_entry;
dst_hold(&pcpu_rt->dst);
}
}
-
+ local_bh_enable();
+ rcu_read_unlock();
trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6);
return pcpu_rt;
}
* Re: [PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route()
2017-10-09 4:07 ` [PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route() Eric Dumazet
@ 2017-10-09 4:09 ` David Miller
2017-10-09 17:06 ` Martin KaFai Lau
1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2017-10-09 4:09 UTC (permalink / raw)
To: eric.dumazet; +Cc: idosch, netdev, weiwan, mlxsw
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 08 Oct 2017 21:07:18 -0700
> From: Eric Dumazet <edumazet@google.com>
>
> Ido reported the following splat and provided a patch.
>
> [ 122.221814] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/2672
> [ 122.221845] caller is debug_smp_processor_id+0x17/0x20
> [ 122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 4.14.0-rc3-idosch-next-custom #639
> [ 122.221880] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
> [ 122.221893] Call Trace:
> [ 122.221919] dump_stack+0xb1/0x10c
> [ 122.221946] ? _atomic_dec_and_lock+0x124/0x124
> [ 122.221974] ? ___ratelimit+0xfe/0x240
> [ 122.222020] check_preemption_disabled+0x173/0x1b0
> [ 122.222060] debug_smp_processor_id+0x17/0x20
> [ 122.222083] ip6_pol_route+0x1482/0x24a0
> ...
>
> I believe we can simplify this code path a bit, since we no longer
> hold a read_lock that must be released to avoid a deadlock.
>
> By disabling BH, we prevent code re-entry and make sure that
> rt6_get_pcpu_route() and rt6_make_pcpu_route() run on the same CPU.
>
> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
> Reported-by: Ido Schimmel <idosch@mellanox.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Tested-by: Ido Schimmel <idosch@mellanox.com>
Applied, thanks Eric.
* Re: [PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route()
2017-10-09 4:07 ` [PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route() Eric Dumazet
2017-10-09 4:09 ` David Miller
@ 2017-10-09 17:06 ` Martin KaFai Lau
1 sibling, 0 replies; 7+ messages in thread
From: Martin KaFai Lau @ 2017-10-09 17:06 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Ido Schimmel, netdev, davem, weiwan, mlxsw
On Mon, Oct 09, 2017 at 04:07:18AM +0000, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Ido reported the following splat and provided a patch.
>
> [ 122.221814] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/2672
> [ 122.221845] caller is debug_smp_processor_id+0x17/0x20
> [ 122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 4.14.0-rc3-idosch-next-custom #639
> [ 122.221880] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
> [ 122.221893] Call Trace:
> [ 122.221919] dump_stack+0xb1/0x10c
> [ 122.221946] ? _atomic_dec_and_lock+0x124/0x124
> [ 122.221974] ? ___ratelimit+0xfe/0x240
> [ 122.222020] check_preemption_disabled+0x173/0x1b0
> [ 122.222060] debug_smp_processor_id+0x17/0x20
> [ 122.222083] ip6_pol_route+0x1482/0x24a0
> ...
>
> I believe we can simplify this code path a bit, since we no longer
> hold a read_lock that must be released to avoid a deadlock.
>
> By disabling BH, we prevent code re-entry and make sure that
> rt6_get_pcpu_route() and rt6_make_pcpu_route() run on the same CPU.
Thanks for fixing it!