All of lore.kernel.org
 help / color / mirror / Atom feed
* ipv6_addrconf: WARNING about suspicious RCU usage
@ 2018-01-20 11:57 Heiner Kallweit
  2018-01-20 13:37 ` Ido Schimmel
  0 siblings, 1 reply; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-20 11:57 UTC (permalink / raw)
  To: netdev

Since some time (didn't bisect it yet) I get the following warning.
Is it a known issue?

[86220.125562] =============================
[86220.125586] WARNING: suspicious RCU usage
[86220.125612] 4.15.0-rc7-next-20180110+ #7 Not tainted
[86220.125641] -----------------------------
[86220.125666] kernel/sched/core.c:6026 Illegal context switch in RCU-bh read-side critical section!
[86220.125711]
               other info that might help us debug this:

[86220.125755]
               rcu_scheduler_active = 2, debug_locks = 1
[86220.125792] 4 locks held by kworker/0:2/1003:
[86220.125817]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125895]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125959]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.126017]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.126111]
               stack backtrace:
[86220.126142] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.126185] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.126250] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.126288] Call Trace:
[86220.126312]  dump_stack+0x70/0x9e
[86220.126337]  lockdep_rcu_suspicious+0xce/0xf0
[86220.126365]  ___might_sleep+0x1d3/0x240
[86220.126390]  __might_sleep+0x45/0x80
[86220.126416]  kmem_cache_alloc_trace+0x53/0x250
[86220.126458]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126498]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126538]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126580]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126623]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126664]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126708]  addrconf_verify_work+0xe/0x20 [ipv6]
[86220.126738]  process_one_work+0x258/0x680
[86220.126765]  worker_thread+0x35/0x3f0
[86220.126790]  kthread+0x124/0x140
[86220.126813]  ? process_one_work+0x680/0x680
[86220.126839]  ? kthread_create_worker_on_cpu+0x40/0x40
[86220.126869]  ? umh_complete+0x40/0x40
[86220.126893]  ? call_usermodehelper_exec_async+0x12a/0x160
[86220.126926]  ret_from_fork+0x4b/0x60
[86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
[86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
[86220.127082] 4 locks held by kworker/0:2/1003:
[86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.127568] Call Trace:
[86220.127591]  dump_stack+0x70/0x9e
[86220.127616]  ___might_sleep+0x14d/0x240
[86220.127644]  __might_sleep+0x45/0x80
[86220.127672]  kmem_cache_alloc_trace+0x53/0x250
[86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
[86220.128032]  process_one_work+0x258/0x680
[86220.128063]  worker_thread+0x35/0x3f0
[86220.128091]  kthread+0x124/0x140
[86220.128117]  ? process_one_work+0x680/0x680
[86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
[86220.128180]  ? umh_complete+0x40/0x40
[86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
[86220.128243]  ret_from_fork+0x4b/0x60

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 11:57 ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
@ 2018-01-20 13:37 ` Ido Schimmel
  2018-01-20 18:49   ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Ido Schimmel @ 2018-01-20 13:37 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 2103 bytes --]

On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> Since some time (didn't bisect it yet) I get the following warning.
> Is it a known issue?

[...]

> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> [86220.127082] 4 locks held by kworker/0:2/1003:
> [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> [86220.127568] Call Trace:
> [86220.127591]  dump_stack+0x70/0x9e
> [86220.127616]  ___might_sleep+0x14d/0x240
> [86220.127644]  __might_sleep+0x45/0x80
> [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
> [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
> [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
> [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
> [86220.128032]  process_one_work+0x258/0x680
> [86220.128063]  worker_thread+0x35/0x3f0
> [86220.128091]  kthread+0x124/0x140
> [86220.128117]  ? process_one_work+0x680/0x680
> [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
> [86220.128180]  ? umh_complete+0x40/0x40
> [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
> [86220.128243]  ret_from_fork+0x4b/0x60

Can you please try attached patch (untested)?

[-- Attachment #2: 0001-ipv6-Do-not-sleep-in-RCU-read-side-critical-section.patch --]
[-- Type: text/plain, Size: 853 bytes --]

>From 70e4bc2a8fe08fb30251f786990a91d3ed2232e6 Mon Sep 17 00:00:00 2001
From: Ido Schimmel <idosch@mellanox.com>
Date: Sat, 20 Jan 2018 15:29:33 +0200
Subject: [PATCH] ipv6: Do not sleep in RCU read-side critical section

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 net/ipv6/addrconf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ab99cb641b7c..9ad1365c91ca 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4356,7 +4356,8 @@ static void addrconf_verify_rtnl(void)
 						spin_lock(&ifpub->lock);
 						ifpub->regen_count = 0;
 						spin_unlock(&ifpub->lock);
-						ipv6_create_tempaddr(ifpub, ifp, true);
+						ipv6_create_tempaddr(ifpub, ifp,
+								     false);
 						in6_ifa_put(ifpub);
 						in6_ifa_put(ifp);
 						goto restart;
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 13:37 ` Ido Schimmel
@ 2018-01-20 18:49   ` Eric Dumazet
  2018-01-20 19:19     ` Ido Schimmel
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2018-01-20 18:49 UTC (permalink / raw)
  To: Ido Schimmel, Heiner Kallweit; +Cc: netdev

On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> > Since some time (didn't bisect it yet) I get the following warning.
> > Is it a known issue?
> 
> [...]
> 
> > [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> > [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> > [86220.127082] 4 locks held by kworker/0:2/1003:
> > [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> > [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> > [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> > [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> > [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> > [86220.127568] Call Trace:
> > [86220.127591]  dump_stack+0x70/0x9e
> > [86220.127616]  ___might_sleep+0x14d/0x240
> > [86220.127644]  __might_sleep+0x45/0x80
> > [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
> > [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
> > [86220.128032]  process_one_work+0x258/0x680
> > [86220.128063]  worker_thread+0x35/0x3f0
> > [86220.128091]  kthread+0x124/0x140
> > [86220.128117]  ? process_one_work+0x680/0x680
> > [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
> > [86220.128180]  ? umh_complete+0x40/0x40
> > [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
> > [86220.128243]  ret_from_fork+0x4b/0x60
> 
> Can you please try attached patch (untested)?



I would also/instead break rcu section.

Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
if we have thousands of IPv6 addresses.

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
 						spin_lock(&ifpub->lock);
 						ifpub->regen_count = 0;
 						spin_unlock(&ifpub->lock);
+						rcu_read_unlock_bh();
 						ipv6_create_tempaddr(ifpub, ifp, true);
 						in6_ifa_put(ifpub);
 						in6_ifa_put(ifp);
+						rcu_read_lock_bh();
 						goto restart;
 					}
 				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 18:49   ` Eric Dumazet
@ 2018-01-20 19:19     ` Ido Schimmel
  2018-01-20 19:48       ` Heiner Kallweit
  2018-01-21 21:22       ` Heiner Kallweit
  0 siblings, 2 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-20 19:19 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Heiner Kallweit, netdev

On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
> > On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> > > Since some time (didn't bisect it yet) I get the following warning.
> > > Is it a known issue?
> > 
> > [...]
> > 
> > > [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> > > [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> > > [86220.127082] 4 locks held by kworker/0:2/1003:
> > > [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > > [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > > [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> > > [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> > > [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> > > [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> > > [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> > > [86220.127568] Call Trace:
> > > [86220.127591]  dump_stack+0x70/0x9e
> > > [86220.127616]  ___might_sleep+0x14d/0x240
> > > [86220.127644]  __might_sleep+0x45/0x80
> > > [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
> > > [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > > [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > > [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > > [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > > [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > > [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > > [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
> > > [86220.128032]  process_one_work+0x258/0x680
> > > [86220.128063]  worker_thread+0x35/0x3f0
> > > [86220.128091]  kthread+0x124/0x140
> > > [86220.128117]  ? process_one_work+0x680/0x680
> > > [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
> > > [86220.128180]  ? umh_complete+0x40/0x40
> > > [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
> > > [86220.128243]  ret_from_fork+0x4b/0x60
> > 
> > Can you please try attached patch (untested)?
> 
> 
> 
> I would also/instead break rcu section.

Thanks Eric, this should work. We can continue to block in
ipv6_create_tempaddr().

Heiner, can you try Eric's patch instead?

> 
> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
> if we have thousands of IPv6 addresses.
> 
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
>  						spin_lock(&ifpub->lock);
>  						ifpub->regen_count = 0;
>  						spin_unlock(&ifpub->lock);
> +						rcu_read_unlock_bh();
>  						ipv6_create_tempaddr(ifpub, ifp, true);
>  						in6_ifa_put(ifpub);
>  						in6_ifa_put(ifp);
> +						rcu_read_lock_bh();
>  						goto restart;
>  					}
>  				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 19:19     ` Ido Schimmel
@ 2018-01-20 19:48       ` Heiner Kallweit
  2018-01-21 21:22       ` Heiner Kallweit
  1 sibling, 0 replies; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-20 19:48 UTC (permalink / raw)
  To: Ido Schimmel, Eric Dumazet; +Cc: netdev

Am 20.01.2018 um 20:19 schrieb Ido Schimmel:
> On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
>> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
>>> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
>>>> Since some time (didn't bisect it yet) I get the following warning.
>>>> Is it a known issue?
>>>
>>> [...]
>>>
>>>> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
>>>> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
>>>> [86220.127082] 4 locks held by kworker/0:2/1003:
>>>> [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
>>>> [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
>>>> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
>>>> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
>>>> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
>>>> [86220.127568] Call Trace:
>>>> [86220.127591]  dump_stack+0x70/0x9e
>>>> [86220.127616]  ___might_sleep+0x14d/0x240
>>>> [86220.127644]  __might_sleep+0x45/0x80
>>>> [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
>>>> [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
>>>> [86220.128032]  process_one_work+0x258/0x680
>>>> [86220.128063]  worker_thread+0x35/0x3f0
>>>> [86220.128091]  kthread+0x124/0x140
>>>> [86220.128117]  ? process_one_work+0x680/0x680
>>>> [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [86220.128180]  ? umh_complete+0x40/0x40
>>>> [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
>>>> [86220.128243]  ret_from_fork+0x4b/0x60
>>>
>>> Can you please try attached patch (untested)?
>>
>>
>>
>> I would also/instead break rcu section.
> 
> Thanks Eric, this should work. We can continue to block in
> ipv6_create_tempaddr().
> 
> Heiner, can you try Eric's patch instead?
> 
Sure, thanks for the quick response to both of you.

>>
>> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
>> if we have thousands of IPv6 addresses.
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
>>  						spin_lock(&ifpub->lock);
>>  						ifpub->regen_count = 0;
>>  						spin_unlock(&ifpub->lock);
>> +						rcu_read_unlock_bh();
>>  						ipv6_create_tempaddr(ifpub, ifp, true);
>>  						in6_ifa_put(ifpub);
>>  						in6_ifa_put(ifp);
>> +						rcu_read_lock_bh();
>>  						goto restart;
>>  					}
>>  				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
>>
>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 19:19     ` Ido Schimmel
  2018-01-20 19:48       ` Heiner Kallweit
@ 2018-01-21 21:22       ` Heiner Kallweit
  2018-01-23 18:01         ` Ido Schimmel
  1 sibling, 1 reply; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-21 21:22 UTC (permalink / raw)
  To: Ido Schimmel, Eric Dumazet; +Cc: netdev

Am 20.01.2018 um 20:19 schrieb Ido Schimmel:
> On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
>> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
>>> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
>>>> Since some time (didn't bisect it yet) I get the following warning.
>>>> Is it a known issue?
>>>
>>> [...]
>>>
>>>> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
>>>> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
>>>> [86220.127082] 4 locks held by kworker/0:2/1003:
>>>> [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
>>>> [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
>>>> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
>>>> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
>>>> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
>>>> [86220.127568] Call Trace:
>>>> [86220.127591]  dump_stack+0x70/0x9e
>>>> [86220.127616]  ___might_sleep+0x14d/0x240
>>>> [86220.127644]  __might_sleep+0x45/0x80
>>>> [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
>>>> [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
>>>> [86220.128032]  process_one_work+0x258/0x680
>>>> [86220.128063]  worker_thread+0x35/0x3f0
>>>> [86220.128091]  kthread+0x124/0x140
>>>> [86220.128117]  ? process_one_work+0x680/0x680
>>>> [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [86220.128180]  ? umh_complete+0x40/0x40
>>>> [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
>>>> [86220.128243]  ret_from_fork+0x4b/0x60
>>>
>>> Can you please try attached patch (untested)?
>>
>>
>>
>> I would also/instead break rcu section.
> 
> Thanks Eric, this should work. We can continue to block in
> ipv6_create_tempaddr().
> 
> Heiner, can you try Eric's patch instead?
> 
So far everything looks good with Eric's patch. The warning didn't show up again.

>>
>> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
>> if we have thousands of IPv6 addresses.
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
>>  						spin_lock(&ifpub->lock);
>>  						ifpub->regen_count = 0;
>>  						spin_unlock(&ifpub->lock);
>> +						rcu_read_unlock_bh();
>>  						ipv6_create_tempaddr(ifpub, ifp, true);
>>  						in6_ifa_put(ifpub);
>>  						in6_ifa_put(ifp);
>> +						rcu_read_lock_bh();
>>  						goto restart;
>>  					}
>>  				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
>>
>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-21 21:22       ` Heiner Kallweit
@ 2018-01-23 18:01         ` Ido Schimmel
  2018-01-23 18:43           ` Eric Dumazet
  2018-01-26 22:15           ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
  0 siblings, 2 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-23 18:01 UTC (permalink / raw)
  To: Heiner Kallweit, eric.dumazet; +Cc: netdev

On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
> So far everything looks good with Eric's patch. The warning didn't show up again.

Eric, can you please submit your patch?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-23 18:01         ` Ido Schimmel
@ 2018-01-23 18:43           ` Eric Dumazet
  2018-01-27  0:10             ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
  2018-01-26 22:15           ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2018-01-23 18:43 UTC (permalink / raw)
  To: Ido Schimmel, Heiner Kallweit; +Cc: netdev

On Tue, 2018-01-23 at 20:01 +0200, Ido Schimmel wrote:
> On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
> > So far everything looks good with Eric's patch. The warning didn't show up again.
> 
> Eric, can you please submit your patch?

Sure, will do today. Thanks for the reminder.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-23 18:01         ` Ido Schimmel
  2018-01-23 18:43           ` Eric Dumazet
@ 2018-01-26 22:15           ` Heiner Kallweit
  1 sibling, 0 replies; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-26 22:15 UTC (permalink / raw)
  To: eric.dumazet; +Cc: Ido Schimmel, netdev

Am 23.01.2018 um 19:01 schrieb Ido Schimmel:
> On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
>> So far everything looks good with Eric's patch. The warning didn't show up again.
> 
> Eric, can you please submit your patch?
> .
> 
Soon the merge window starts, can the patch be submitted before?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
  2018-01-23 18:43           ` Eric Dumazet
@ 2018-01-27  0:10             ` Eric Dumazet
  2018-01-27  8:52               ` Ido Schimmel
  2018-01-29 19:24               ` David Miller
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2018-01-27  0:10 UTC (permalink / raw)
  To: Ido Schimmel, Heiner Kallweit, David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

Heiner reported a lockdep splat [1]

This is caused by attempting GFP_KERNEL allocation while RCU lock is
held and BH blocked.

We believe that addrconf_verify_rtnl() could run for a long period,
so instead of using GFP_ATOMIC here as Ido suggested, we should break
the critical section and restart it after the allocation.


[1]
[86220.125562] =============================
[86220.125586] WARNING: suspicious RCU usage
[86220.125612] 4.15.0-rc7-next-20180110+ #7 Not tainted
[86220.125641] -----------------------------
[86220.125666] kernel/sched/core.c:6026 Illegal context switch in RCU-bh read-side critical section!
[86220.125711]
               other info that might help us debug this:

[86220.125755]
               rcu_scheduler_active = 2, debug_locks = 1
[86220.125792] 4 locks held by kworker/0:2/1003:
[86220.125817]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125895]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125959]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.126017]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.126111]
               stack backtrace:
[86220.126142] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.126185] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.126250] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.126288] Call Trace:
[86220.126312]  dump_stack+0x70/0x9e
[86220.126337]  lockdep_rcu_suspicious+0xce/0xf0
[86220.126365]  ___might_sleep+0x1d3/0x240
[86220.126390]  __might_sleep+0x45/0x80
[86220.126416]  kmem_cache_alloc_trace+0x53/0x250
[86220.126458]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126498]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126538]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126580]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126623]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126664]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126708]  addrconf_verify_work+0xe/0x20 [ipv6]
[86220.126738]  process_one_work+0x258/0x680
[86220.126765]  worker_thread+0x35/0x3f0
[86220.126790]  kthread+0x124/0x140
[86220.126813]  ? process_one_work+0x680/0x680
[86220.126839]  ? kthread_create_worker_on_cpu+0x40/0x40
[86220.126869]  ? umh_complete+0x40/0x40
[86220.126893]  ? call_usermodehelper_exec_async+0x12a/0x160
[86220.126926]  ret_from_fork+0x4b/0x60
[86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
[86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
[86220.127082] 4 locks held by kworker/0:2/1003:
[86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.127568] Call Trace:
[86220.127591]  dump_stack+0x70/0x9e
[86220.127616]  ___might_sleep+0x14d/0x240
[86220.127644]  __might_sleep+0x45/0x80
[86220.127672]  kmem_cache_alloc_trace+0x53/0x250
[86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
[86220.128032]  process_one_work+0x258/0x680
[86220.128063]  worker_thread+0x35/0x3f0
[86220.128091]  kthread+0x124/0x140
[86220.128117]  ? process_one_work+0x680/0x680
[86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
[86220.128180]  ? umh_complete+0x40/0x40
[86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
[86220.128243]  ret_from_fork+0x4b/0x60

Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 net/ipv6/addrconf.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f49bd7897e95f15a381e4700660991f2d3c3fed4..10facd174210974ac82b2304211061b90714aac8 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4352,9 +4352,11 @@ static void addrconf_verify_rtnl(void)
 						spin_lock(&ifpub->lock);
 						ifpub->regen_count = 0;
 						spin_unlock(&ifpub->lock);
+						rcu_read_unlock_bh();
 						ipv6_create_tempaddr(ifpub, ifp, true);
 						in6_ifa_put(ifpub);
 						in6_ifa_put(ifp);
+						rcu_read_lock_bh();
 						goto restart;
 					}
 				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
  2018-01-27  0:10             ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
@ 2018-01-27  8:52               ` Ido Schimmel
  2018-01-29 19:24               ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-27  8:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Heiner Kallweit, David Miller, netdev

On Fri, Jan 26, 2018 at 04:10:43PM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Heiner reported a lockdep splat [1]
> 
> This is caused by attempting GFP_KERNEL allocation while RCU lock is
> held and BH blocked.
> 
> We believe that addrconf_verify_rtnl() could run for a long period,
> so instead of using GFP_ATOMIC here as Ido suggested, we should break
> the critical section and restart it after the allocation.

[...]

> Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>

Reviewed-by: Ido Schimmel <idosch@mellanox.com>

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
  2018-01-27  0:10             ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
  2018-01-27  8:52               ` Ido Schimmel
@ 2018-01-29 19:24               ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2018-01-29 19:24 UTC (permalink / raw)
  To: eric.dumazet; +Cc: idosch, hkallweit1, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 26 Jan 2018 16:10:43 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> Heiner reported a lockdep splat [1]
> 
> This is caused by attempting GFP_KERNEL allocation while RCU lock is
> held and BH blocked.
> 
> We believe that addrconf_verify_rtnl() could run for a long period,
> so instead of using GFP_ATOMIC here as Ido suggested, we should break
> the critical section and restart it after the allocation.
> 
> 
> [1]
 ...
> Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied and queued up for v4.15 -stable, thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-01-29 19:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-20 11:57 ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
2018-01-20 13:37 ` Ido Schimmel
2018-01-20 18:49   ` Eric Dumazet
2018-01-20 19:19     ` Ido Schimmel
2018-01-20 19:48       ` Heiner Kallweit
2018-01-21 21:22       ` Heiner Kallweit
2018-01-23 18:01         ` Ido Schimmel
2018-01-23 18:43           ` Eric Dumazet
2018-01-27  0:10             ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
2018-01-27  8:52               ` Ido Schimmel
2018-01-29 19:24               ` David Miller
2018-01-26 22:15           ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.