* ipv6: tunnel: hang when destroying ipv6 tunnel
@ 2012-03-31 17:51 Sasha Levin
  2012-03-31 20:59 ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-03-31 17:51 UTC (permalink / raw)
  To: davem, kuznet, jmorris, yoshfuji, Patrick McHardy
  Cc: netdev, linux-kernel@vger.kernel.org List, Dave Jones

Hi all,

It appears that a hang may occur when destroying an ipv6 tunnel, which
I've reproduced several times in a KVM vm.

The pattern in the stack dump below is consistent with unregistering a
kobject while holding multiple locks. Unregistering a kobject usually
leads to a call out to userspace via call_usermodehelper_exec().
The userspace code may access sysfs files, which in turn requires
taking locks within the kernel, leading to a deadlock since those
locks are already held by the kernel.

[ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
[ 1561.566945] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1561.570062] kworker/u:2     D ffff88006ee63000  4504  3140      2 0x00000000
[ 1561.572968]  ffff88006ed9f7e0 0000000000000082 ffff88006ed9f790
ffffffff8107d346
[ 1561.575680]  ffff88006ed9ffd8 00000000001d4580 ffff88006ed9e010
00000000001d4580
[ 1561.578601]  00000000001d4580 00000000001d4580 ffff88006ed9ffd8
00000000001d4580
[ 1561.581697] Call Trace:
[ 1561.582650]  [<ffffffff8107d346>] ? kvm_clock_read+0x46/0x80
[ 1561.584543]  [<ffffffff827063d4>] schedule+0x24/0x70
[ 1561.586231]  [<ffffffff82704025>] schedule_timeout+0x245/0x2c0
[ 1561.588508]  [<ffffffff81117c9a>] ? mark_held_locks+0x7a/0x120
[ 1561.590858]  [<ffffffff81119bbd>] ? __lock_release+0x8d/0x1d0
[ 1561.593162]  [<ffffffff82707e6b>] ? _raw_spin_unlock_irq+0x2b/0x70
[ 1561.595394]  [<ffffffff810e36d1>] ? get_parent_ip+0x11/0x50
[ 1561.597403]  [<ffffffff82705919>] wait_for_common+0x119/0x190
[ 1561.599707]  [<ffffffff810ed1b0>] ? try_to_wake_up+0x2c0/0x2c0
[ 1561.601758]  [<ffffffff82705a38>] wait_for_completion+0x18/0x20
[ 1561.603843]  [<ffffffff810cdcd8>] call_usermodehelper_exec+0x228/0x240
[ 1561.606059]  [<ffffffff82705844>] ? wait_for_common+0x44/0x190
[ 1561.608352]  [<ffffffff81878445>] kobject_uevent_env+0x615/0x650
[ 1561.610908]  [<ffffffff810e36d1>] ? get_parent_ip+0x11/0x50
[ 1561.613146]  [<ffffffff8187848b>] kobject_uevent+0xb/0x10
[ 1561.615312]  [<ffffffff81876f5a>] kobject_cleanup+0xca/0x1b0
[ 1561.617509]  [<ffffffff8187704d>] kobject_release+0xd/0x10
[ 1561.619334]  [<ffffffff81876d9c>] kobject_put+0x2c/0x60
[ 1561.621117]  [<ffffffff8226ea80>] net_rx_queue_update_kobjects+0xa0/0xf0
[ 1561.623421]  [<ffffffff8226ec87>] netdev_unregister_kobject+0x37/0x70
[ 1561.625979]  [<ffffffff82253e26>] rollback_registered_many+0x186/0x260
[ 1561.628526]  [<ffffffff82253f14>] unregister_netdevice_many+0x14/0x60
[ 1561.631064]  [<ffffffff8243922e>] ip6_tnl_destroy_tunnels+0xee/0x160
[ 1561.633549]  [<ffffffff8243b8f3>] ip6_tnl_exit_net+0xd3/0x1c0
[ 1561.635843]  [<ffffffff8243b820>] ? ip6_tnl_ioctl+0x550/0x550
[ 1561.637972]  [<ffffffff81259c86>] ? proc_net_remove+0x16/0x20
[ 1561.639881]  [<ffffffff8224f119>] ops_exit_list+0x39/0x60
[ 1561.641666]  [<ffffffff8224f72b>] cleanup_net+0xfb/0x1a0
[ 1561.643528]  [<ffffffff810ce97d>] process_one_work+0x1cd/0x460
[ 1561.645828]  [<ffffffff810ce91c>] ? process_one_work+0x16c/0x460
[ 1561.648180]  [<ffffffff8224f630>] ? net_drop_ns+0x40/0x40
[ 1561.650285]  [<ffffffff810d1e76>] worker_thread+0x176/0x3b0
[ 1561.652460]  [<ffffffff810d1d00>] ? manage_workers+0x120/0x120
[ 1561.654734]  [<ffffffff810d727e>] kthread+0xbe/0xd0
[ 1561.656656]  [<ffffffff8270a134>] kernel_thread_helper+0x4/0x10
[ 1561.658881]  [<ffffffff810e3fe0>] ? finish_task_switch+0x80/0x110
[ 1561.660828]  [<ffffffff82708434>] ? retint_restore_args+0x13/0x13
[ 1561.662795]  [<ffffffff810d71c0>] ? __init_kthread_worker+0x70/0x70
[ 1561.664932]  [<ffffffff8270a130>] ? gs_change+0x13/0x13
[ 1561.667001] 4 locks held by kworker/u:2/3140:
[ 1561.667599]  #0:  (netns){.+.+.+}, at: [<ffffffff810ce91c>]
process_one_work+0x16c/0x460
[ 1561.668758]  #1:  (net_cleanup_work){+.+.+.}, at:
[<ffffffff810ce91c>] process_one_work+0x16c/0x460
[ 1561.670002]  #2:  (net_mutex){+.+.+.}, at: [<ffffffff8224f6b0>]
cleanup_net+0x80/0x1a0
[ 1561.671700]  #3:  (rtnl_mutex){+.+.+.}, at: [<ffffffff82267f02>]
rtnl_lock+0x12/0x20

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 17:51 ipv6: tunnel: hang when destroying ipv6 tunnel Sasha Levin
@ 2012-03-31 20:59 ` Eric Dumazet
  2012-03-31 21:34   ` Oleg Nesterov
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2012-03-31 20:59 UTC (permalink / raw)
  To: Sasha Levin
  Cc: davem, kuznet, jmorris, yoshfuji, Patrick McHardy, netdev,
	linux-kernel@vger.kernel.org List, Dave Jones, Oleg Nesterov

On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
> Hi all,
> 
> It appears that a hang may occur when destroying an ipv6 tunnel, which
> I've reproduced several times in a KVM vm.
> 
> The pattern in the stack dump below is consistent with unregistering a
> kobject when holding multiple locks. Unregistering a kobject usually
> leads to an exit back to userspace with call_usermodehelper_exec().

Yes, but this userspace call is done asynchronously and we don't have
to wait until it's done.

> The userspace code may access sysfs files which in turn will require
> locking within the kernel, leading to a deadlock since those locks are
> already held by kernel.


> 
> [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
> [ 1561.566945] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 1561.570062] kworker/u:2     D ffff88006ee63000  4504  3140      2 0x00000000
> [ 1561.572968]  ffff88006ed9f7e0 0000000000000082 ffff88006ed9f790
> ffffffff8107d346
> [ 1561.575680]  ffff88006ed9ffd8 00000000001d4580 ffff88006ed9e010
> 00000000001d4580
> [ 1561.578601]  00000000001d4580 00000000001d4580 ffff88006ed9ffd8
> 00000000001d4580
> [ 1561.581697] Call Trace:
> [ 1561.582650]  [<ffffffff8107d346>] ? kvm_clock_read+0x46/0x80
> [ 1561.584543]  [<ffffffff827063d4>] schedule+0x24/0x70
> [ 1561.586231]  [<ffffffff82704025>] schedule_timeout+0x245/0x2c0
> [ 1561.588508]  [<ffffffff81117c9a>] ? mark_held_locks+0x7a/0x120
> [ 1561.590858]  [<ffffffff81119bbd>] ? __lock_release+0x8d/0x1d0
> [ 1561.593162]  [<ffffffff82707e6b>] ? _raw_spin_unlock_irq+0x2b/0x70
> [ 1561.595394]  [<ffffffff810e36d1>] ? get_parent_ip+0x11/0x50
> [ 1561.597403]  [<ffffffff82705919>] wait_for_common+0x119/0x190
> [ 1561.599707]  [<ffffffff810ed1b0>] ? try_to_wake_up+0x2c0/0x2c0
> [ 1561.601758]  [<ffffffff82705a38>] wait_for_completion+0x18/0x20

Something is wrong here: call_usermodehelper_exec(... UMH_WAIT_EXEC)
should not block forever. It's not like UMH_WAIT_PROC.

Cc Oleg Nesterov <oleg@redhat.com>

> [ 1561.603843]  [<ffffffff810cdcd8>] call_usermodehelper_exec+0x228/0x240
> [ 1561.606059]  [<ffffffff82705844>] ? wait_for_common+0x44/0x190
> [ 1561.608352]  [<ffffffff81878445>] kobject_uevent_env+0x615/0x650
> [ 1561.610908]  [<ffffffff810e36d1>] ? get_parent_ip+0x11/0x50
> [ 1561.613146]  [<ffffffff8187848b>] kobject_uevent+0xb/0x10
> [ 1561.615312]  [<ffffffff81876f5a>] kobject_cleanup+0xca/0x1b0
> [ 1561.617509]  [<ffffffff8187704d>] kobject_release+0xd/0x10
> [ 1561.619334]  [<ffffffff81876d9c>] kobject_put+0x2c/0x60
> [ 1561.621117]  [<ffffffff8226ea80>] net_rx_queue_update_kobjects+0xa0/0xf0
> [ 1561.623421]  [<ffffffff8226ec87>] netdev_unregister_kobject+0x37/0x70
> [ 1561.625979]  [<ffffffff82253e26>] rollback_registered_many+0x186/0x260
> [ 1561.628526]  [<ffffffff82253f14>] unregister_netdevice_many+0x14/0x60
> [ 1561.631064]  [<ffffffff8243922e>] ip6_tnl_destroy_tunnels+0xee/0x160
> [ 1561.633549]  [<ffffffff8243b8f3>] ip6_tnl_exit_net+0xd3/0x1c0
> [ 1561.635843]  [<ffffffff8243b820>] ? ip6_tnl_ioctl+0x550/0x550
> [ 1561.637972]  [<ffffffff81259c86>] ? proc_net_remove+0x16/0x20
> [ 1561.639881]  [<ffffffff8224f119>] ops_exit_list+0x39/0x60
> [ 1561.641666]  [<ffffffff8224f72b>] cleanup_net+0xfb/0x1a0
> [ 1561.643528]  [<ffffffff810ce97d>] process_one_work+0x1cd/0x460
> [ 1561.645828]  [<ffffffff810ce91c>] ? process_one_work+0x16c/0x460
> [ 1561.648180]  [<ffffffff8224f630>] ? net_drop_ns+0x40/0x40
> [ 1561.650285]  [<ffffffff810d1e76>] worker_thread+0x176/0x3b0
> [ 1561.652460]  [<ffffffff810d1d00>] ? manage_workers+0x120/0x120
> [ 1561.654734]  [<ffffffff810d727e>] kthread+0xbe/0xd0
> [ 1561.656656]  [<ffffffff8270a134>] kernel_thread_helper+0x4/0x10
> [ 1561.658881]  [<ffffffff810e3fe0>] ? finish_task_switch+0x80/0x110
> [ 1561.660828]  [<ffffffff82708434>] ? retint_restore_args+0x13/0x13
> [ 1561.662795]  [<ffffffff810d71c0>] ? __init_kthread_worker+0x70/0x70
> [ 1561.664932]  [<ffffffff8270a130>] ? gs_change+0x13/0x13
> [ 1561.667001] 4 locks held by kworker/u:2/3140:
> [ 1561.667599]  #0:  (netns){.+.+.+}, at: [<ffffffff810ce91c>]
> process_one_work+0x16c/0x460
> [ 1561.668758]  #1:  (net_cleanup_work){+.+.+.}, at:
> [<ffffffff810ce91c>] process_one_work+0x16c/0x460
> [ 1561.670002]  #2:  (net_mutex){+.+.+.}, at: [<ffffffff8224f6b0>]
> cleanup_net+0x80/0x1a0
> [ 1561.671700]  #3:  (rtnl_mutex){+.+.+.}, at: [<ffffffff82267f02>]
> rtnl_lock+0x12/0x20
> --



* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 20:59 ` Eric Dumazet
@ 2012-03-31 21:34   ` Oleg Nesterov
  2012-03-31 21:43     ` Sasha Levin
  0 siblings, 1 reply; 15+ messages in thread
From: Oleg Nesterov @ 2012-03-31 21:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sasha Levin, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On 03/31, Eric Dumazet wrote:
>
> On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
> > Hi all,
> >
> > It appears that a hang may occur when destroying an ipv6 tunnel, which
> > I've reproduced several times in a KVM vm.

kernel version?

> > [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.

And nobody else?

It would be nice to know what sysrq-t says; in particular, the trace
of the khelper thread is interesting.

> Something is wrong here, call_usermodehelper_exec ( ... UMH_WAIT_EXEC)
> should not block forever.

Yes, unless it triggers another request_module()...

Tetsuo, could you please take a look? Unlikely, but maybe this
is fixed by your kmod-avoid-deadlock-by-recursive-kmod-call.patch
in -mm?

Oleg.



* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 21:34   ` Oleg Nesterov
@ 2012-03-31 21:43     ` Sasha Levin
  2012-03-31 23:26       ` Sasha Levin
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-03-31 21:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Eric Dumazet, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On Sat, Mar 31, 2012 at 11:34 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/31, Eric Dumazet wrote:
>>
>> On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
>> > Hi all,
>> >
>> > It appears that a hang may occur when destroying an ipv6 tunnel, which
>> > I've reproduced several times in a KVM vm.
>
> kernel version?

latest linux-next

>> > [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
>
> And nobody else?

Some more messages follow a bit later, from tasks stuck in VFS-related code.

> It would be nice to know what sysrq-t says, in particular the trace
> of khelper thread is interesting.

Sure, I'll get one when it happens again.

>> Something is wrong here, call_usermodehelper_exec ( ... UMH_WAIT_EXEC)
>> should not block forever.
>
> Yes, unless it triggers another request_module()...
>
> Tetsuo, could you please take a look? Unlikely, but may be this
> is fixed by your kmod-avoid-deadlock-by-recursive-kmod-call.patch
> in -mm ?
>
> Oleg.
>


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 21:43     ` Sasha Levin
@ 2012-03-31 23:26       ` Sasha Levin
  2012-04-01  3:21         ` Tetsuo Handa
                           ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Sasha Levin @ 2012-03-31 23:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Eric Dumazet, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On Sat, Mar 31, 2012 at 11:43 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> On Sat, Mar 31, 2012 at 11:34 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> On 03/31, Eric Dumazet wrote:
>>>
>>> On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
>>> > Hi all,
>>> >
>>> > It appears that a hang may occur when destroying an ipv6 tunnel, which
>>> > I've reproduced several times in a KVM vm.
>>
>> kernel version?
>
> latest linux-next
>
>>> > [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
>>
>> And nobody else?
>
> Some more messages follow a bit later which get stuck in vfs related code.
>
>> It would be nice to know what sysrq-t says, in particular the trace
>> of khelper thread is interesting.
>
> Sure, I'll get one when it happens again.

So here's the stack of the usermode thread:

[  336.614015] kworker/u:2     S ffff880062c13000  5176  4539   3031 0x00000000
[  336.614015]  ffff880062fb38d0 0000000000000082 ffff880062fb3860
0000000000000001
[  336.614015]  ffff880062fb3fd8 00000000001d4580 ffff880062fb2010
00000000001d4580
[  336.614015]  00000000001d4580 00000000001d4580 ffff880062fb3fd8
00000000001d4580
[  336.614015] Call Trace:
[  336.614015]  [<ffffffff826a8e54>] schedule+0x24/0x70
[  336.614015]  [<ffffffff825fd66d>] p9_client_rpc+0x13d/0x360
[  336.614015]  [<ffffffff810d7850>] ? wake_up_bit+0x40/0x40
[  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
[  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
[  336.614015]  [<ffffffff825ff5ff>] p9_client_walk+0x8f/0x220
[  336.614015]  [<ffffffff815a8e3b>] v9fs_vfs_lookup+0xab/0x1c0
[  336.614015]  [<ffffffff811ee0c0>] d_alloc_and_lookup+0x40/0x80
[  336.614015]  [<ffffffff811fdea0>] ? d_lookup+0x30/0x50
[  336.614015]  [<ffffffff811f0aea>] do_lookup+0x28a/0x3b0
[  336.614015]  [<ffffffff817c9117>] ? security_inode_permission+0x17/0x20
[  336.614015]  [<ffffffff811f1c07>] link_path_walk+0x167/0x420
[  336.614015]  [<ffffffff811ee630>] ? generic_readlink+0xb0/0xb0
[  336.614015]  [<ffffffff81896d88>] ? __raw_spin_lock_init+0x38/0x70
[  336.614015]  [<ffffffff811f24da>] path_openat+0xba/0x500
[  336.614015]  [<ffffffff81057253>] ? sched_clock+0x13/0x20
[  336.614015]  [<ffffffff810ed805>] ? sched_clock_local+0x25/0x90
[  336.614015]  [<ffffffff810ed940>] ? sched_clock_cpu+0xd0/0x120
[  336.614015]  [<ffffffff811f2a34>] do_filp_open+0x44/0xa0
[  336.614015]  [<ffffffff81119acd>] ? __lock_release+0x8d/0x1d0
[  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
[  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
[  336.614015]  [<ffffffff826aa7f0>] ? _raw_spin_unlock+0x30/0x60
[  336.614015]  [<ffffffff811ea74d>] open_exec+0x2d/0xf0
[  336.614015]  [<ffffffff811eb888>] do_execve_common+0x128/0x320
[  336.614015]  [<ffffffff811ebb05>] do_execve+0x35/0x40
[  336.614015]  [<ffffffff810589e5>] sys_execve+0x45/0x70
[  336.614015]  [<ffffffff826acc28>] kernel_execve+0x68/0xd0
[  336.614015]  [<ffffffff810cd6a6>] ? ____call_usermodehelper+0xf6/0x130
[  336.614015]  [<ffffffff810cd6f9>] call_helper+0x19/0x20
[  336.614015]  [<ffffffff826acbb4>] kernel_thread_helper+0x4/0x10
[  336.614015]  [<ffffffff810e3f80>] ? finish_task_switch+0x80/0x110
[  336.614015]  [<ffffffff826aaeb4>] ? retint_restore_args+0x13/0x13
[  336.614015]  [<ffffffff810cd6e0>] ? ____call_usermodehelper+0x130/0x130
[  336.614015]  [<ffffffff826acbb0>] ? gs_change+0x13/0x13

While it seems that 9p is the culprit, I have to point out that this
bug is easily reproducible, and it happens each time due to a
call_usermodehelper() call. Other than that, 9p behaves perfectly, and
if 9p itself were broken I'd expect to see other things break besides
the call_usermodehelper()-related ones.


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 23:26       ` Sasha Levin
@ 2012-04-01  3:21         ` Tetsuo Handa
  2012-04-01 17:33           ` Sasha Levin
  2012-04-01  5:07         ` Eric Dumazet
  2012-04-01 16:38         ` Oleg Nesterov
  2 siblings, 1 reply; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-01  3:21 UTC (permalink / raw)
  To: levinsasha928, oleg
  Cc: eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber, netdev,
	linux-kernel, davej

Sasha Levin wrote:
> While it seems that 9p is the culprit, I have to point out that this
> bug is easily reproducible, and it happens each time due to a
> call_usermode_helper() call. Other than that 9p behaves perfectly and
> I'd assume that I'd be seeing other things break besides
> call_usermode_helper() related ones.

I think one of the two patches below can catch the bug, if this is a
usermodehelper-related bug. Please try.

----- Patch 1 -----
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 01394b6..3e63319 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -571,7 +571,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 	 * flag, for khelper thread is already waiting for the thread at
 	 * wait_for_completion() in do_fork().
 	 */
-	if (wait != UMH_NO_WAIT && current == kmod_thread_locker) {
+	if (WARN_ON(wait != UMH_NO_WAIT && current == kmod_thread_locker)) {
 		retval = -EBUSY;
 		goto out;
 	}


----- Patch 2 -----
diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 9efeae6..1350670 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -48,10 +48,10 @@ static inline int request_module_nowait(const char *name, ...) { return -ENOSYS;
 struct cred;
 struct file;
 
-#define UMH_NO_WAIT	0	/* don't wait at all */
-#define UMH_WAIT_EXEC	1	/* wait for the exec, but not the process */
-#define UMH_WAIT_PROC	2	/* wait for the process to complete */
-#define UMH_KILLABLE	4	/* wait for EXEC/PROC killable */
+#define UMH_NO_WAIT	0x10	/* don't wait at all */
+#define UMH_WAIT_EXEC	0x11	/* wait for the exec, but not the process */
+#define UMH_WAIT_PROC	0x12	/* wait for the process to complete */
+#define UMH_KILLABLE	0x04	/* wait for EXEC/PROC killable */
 
 struct subprocess_info {
 	struct work_struct work;
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 957a7aa..ecfd3d5 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -483,6 +483,18 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 	DECLARE_COMPLETION_ONSTACK(done);
 	int retval = 0;
 
+	if (unlikely(wait == -1 || wait == 0 || wait == 1)) {
+		WARN(1, "Requesting for usermode helper with hardcoded wait "
+		     "flag. Change to use UMH_* symbols and recompile, or "
+		     "this request will fail on Linux 3.4.\n");
+		if (wait == -1)
+			wait = UMH_NO_WAIT;
+		else if (wait == 0)
+			wait = UMH_WAIT_EXEC;
+		else
+			wait = UMH_WAIT_PROC;
+	}
+
 	helper_lock();
 	if (sub_info->path[0] == '\0')
 		goto out;


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 23:26       ` Sasha Levin
  2012-04-01  3:21         ` Tetsuo Handa
@ 2012-04-01  5:07         ` Eric Dumazet
  2012-04-01 16:38         ` Oleg Nesterov
  2 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2012-04-01  5:07 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Oleg Nesterov, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On Sun, 2012-04-01 at 01:26 +0200, Sasha Levin wrote:
> On Sat, Mar 31, 2012 at 11:43 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > On Sat, Mar 31, 2012 at 11:34 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> >> On 03/31, Eric Dumazet wrote:
> >>>
> >>> On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
> >>> > Hi all,
> >>> >
> >>> > It appears that a hang may occur when destroying an ipv6 tunnel, which
> >>> > I've reproduced several times in a KVM vm.
> >>
> >> kernel version?
> >
> > latest linux-next
> >
> >>> > [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
> >>
> >> And nobody else?
> >
> > Some more messages follow a bit later which get stuck in vfs related code.
> >
> >> It would be nice to know what sysrq-t says, in particular the trace
> >> of khelper thread is interesting.
> >
> > Sure, I'll get one when it happens again.
> 
> So here's the stack of the usermode thread:
> 
> [  336.614015] kworker/u:2     S ffff880062c13000  5176  4539   3031 0x00000000
> [  336.614015]  ffff880062fb38d0 0000000000000082 ffff880062fb3860
> 0000000000000001
> [  336.614015]  ffff880062fb3fd8 00000000001d4580 ffff880062fb2010
> 00000000001d4580
> [  336.614015]  00000000001d4580 00000000001d4580 ffff880062fb3fd8
> 00000000001d4580
> [  336.614015] Call Trace:
> [  336.614015]  [<ffffffff826a8e54>] schedule+0x24/0x70
> [  336.614015]  [<ffffffff825fd66d>] p9_client_rpc+0x13d/0x360
> [  336.614015]  [<ffffffff810d7850>] ? wake_up_bit+0x40/0x40
> [  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
> [  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
> [  336.614015]  [<ffffffff825ff5ff>] p9_client_walk+0x8f/0x220
> [  336.614015]  [<ffffffff815a8e3b>] v9fs_vfs_lookup+0xab/0x1c0
> [  336.614015]  [<ffffffff811ee0c0>] d_alloc_and_lookup+0x40/0x80
> [  336.614015]  [<ffffffff811fdea0>] ? d_lookup+0x30/0x50
> [  336.614015]  [<ffffffff811f0aea>] do_lookup+0x28a/0x3b0
> [  336.614015]  [<ffffffff817c9117>] ? security_inode_permission+0x17/0x20
> [  336.614015]  [<ffffffff811f1c07>] link_path_walk+0x167/0x420
> [  336.614015]  [<ffffffff811ee630>] ? generic_readlink+0xb0/0xb0
> [  336.614015]  [<ffffffff81896d88>] ? __raw_spin_lock_init+0x38/0x70
> [  336.614015]  [<ffffffff811f24da>] path_openat+0xba/0x500
> [  336.614015]  [<ffffffff81057253>] ? sched_clock+0x13/0x20
> [  336.614015]  [<ffffffff810ed805>] ? sched_clock_local+0x25/0x90
> [  336.614015]  [<ffffffff810ed940>] ? sched_clock_cpu+0xd0/0x120
> [  336.614015]  [<ffffffff811f2a34>] do_filp_open+0x44/0xa0
> [  336.614015]  [<ffffffff81119acd>] ? __lock_release+0x8d/0x1d0
> [  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
> [  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
> [  336.614015]  [<ffffffff826aa7f0>] ? _raw_spin_unlock+0x30/0x60
> [  336.614015]  [<ffffffff811ea74d>] open_exec+0x2d/0xf0
> [  336.614015]  [<ffffffff811eb888>] do_execve_common+0x128/0x320
> [  336.614015]  [<ffffffff811ebb05>] do_execve+0x35/0x40
> [  336.614015]  [<ffffffff810589e5>] sys_execve+0x45/0x70
> [  336.614015]  [<ffffffff826acc28>] kernel_execve+0x68/0xd0
> [  336.614015]  [<ffffffff810cd6a6>] ? ____call_usermodehelper+0xf6/0x130
> [  336.614015]  [<ffffffff810cd6f9>] call_helper+0x19/0x20
> [  336.614015]  [<ffffffff826acbb4>] kernel_thread_helper+0x4/0x10
> [  336.614015]  [<ffffffff810e3f80>] ? finish_task_switch+0x80/0x110
> [  336.614015]  [<ffffffff826aaeb4>] ? retint_restore_args+0x13/0x13
> [  336.614015]  [<ffffffff810cd6e0>] ? ____call_usermodehelper+0x130/0x130
> [  336.614015]  [<ffffffff826acbb0>] ? gs_change+0x13/0x13
> 
> While it seems that 9p is the culprit, I have to point out that this
> bug is easily reproducible, and it happens each time due to a
> call_usermode_helper() call. Other than that 9p behaves perfectly and
> I'd assume that I'd be seeing other things break besides
> call_usermode_helper() related ones.

OK, then there is a third process (might be a 9p-related one trying to
serve this RPC request) blocking on one of the mutexes held by your
first process (the one invoking call_usermodehelper()).

Maybe kobject_uevent_env() should not use UMH_WAIT_EXEC, to get a
non-blocking guarantee.

* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 23:26       ` Sasha Levin
  2012-04-01  3:21         ` Tetsuo Handa
  2012-04-01  5:07         ` Eric Dumazet
@ 2012-04-01 16:38         ` Oleg Nesterov
  2 siblings, 0 replies; 15+ messages in thread
From: Oleg Nesterov @ 2012-04-01 16:38 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Eric Dumazet, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On 04/01, Sasha Levin wrote:
>
> >> It would be nice to know what sysrq-t says, in particular the trace
> >> of khelper thread is interesting.
> >
> > Sure, I'll get one when it happens again.
>
> So here's the stack of the usermode thread:

Great, thanks, this is even better than khelper's trace.

> [  336.614015]  [<ffffffff826a8e54>] schedule+0x24/0x70
> [  336.614015]  [<ffffffff825fd66d>] p9_client_rpc+0x13d/0x360
> [  336.614015]  [<ffffffff810d7850>] ? wake_up_bit+0x40/0x40
> [  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
> [  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
> [  336.614015]  [<ffffffff825ff5ff>] p9_client_walk+0x8f/0x220
> [  336.614015]  [<ffffffff815a8e3b>] v9fs_vfs_lookup+0xab/0x1c0
> [  336.614015]  [<ffffffff811ee0c0>] d_alloc_and_lookup+0x40/0x80
> [  336.614015]  [<ffffffff811fdea0>] ? d_lookup+0x30/0x50
> [  336.614015]  [<ffffffff811f0aea>] do_lookup+0x28a/0x3b0
> [  336.614015]  [<ffffffff817c9117>] ? security_inode_permission+0x17/0x20
> [  336.614015]  [<ffffffff811f1c07>] link_path_walk+0x167/0x420
> [  336.614015]  [<ffffffff811ee630>] ? generic_readlink+0xb0/0xb0
> [  336.614015]  [<ffffffff81896d88>] ? __raw_spin_lock_init+0x38/0x70
> [  336.614015]  [<ffffffff811f24da>] path_openat+0xba/0x500
> [  336.614015]  [<ffffffff81057253>] ? sched_clock+0x13/0x20
> [  336.614015]  [<ffffffff810ed805>] ? sched_clock_local+0x25/0x90
> [  336.614015]  [<ffffffff810ed940>] ? sched_clock_cpu+0xd0/0x120
> [  336.614015]  [<ffffffff811f2a34>] do_filp_open+0x44/0xa0
> [  336.614015]  [<ffffffff81119acd>] ? __lock_release+0x8d/0x1d0
> [  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
> [  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
> [  336.614015]  [<ffffffff826aa7f0>] ? _raw_spin_unlock+0x30/0x60
> [  336.614015]  [<ffffffff811ea74d>] open_exec+0x2d/0xf0
> [  336.614015]  [<ffffffff811eb888>] do_execve_common+0x128/0x320
> [  336.614015]  [<ffffffff811ebb05>] do_execve+0x35/0x40
> [  336.614015]  [<ffffffff810589e5>] sys_execve+0x45/0x70
> [  336.614015]  [<ffffffff826acc28>] kernel_execve+0x68/0xd0
> [  336.614015]  [<ffffffff810cd6a6>] ? ____call_usermodehelper+0xf6/0x130
> [  336.614015]  [<ffffffff810cd6f9>] call_helper+0x19/0x20
> [  336.614015]  [<ffffffff826acbb4>] kernel_thread_helper+0x4/0x10
> [  336.614015]  [<ffffffff810e3f80>] ? finish_task_switch+0x80/0x110
> [  336.614015]  [<ffffffff826aaeb4>] ? retint_restore_args+0x13/0x13
> [  336.614015]  [<ffffffff810cd6e0>] ? ____call_usermodehelper+0x130/0x130
> [  336.614015]  [<ffffffff826acbb0>] ? gs_change+0x13/0x13
>
> While it seems that 9p is the culprit, I have to point out that this
> bug is easily reproducible, and it happens each time due to a
> call_usermode_helper() call. Other than that 9p behaves perfectly and
> I'd assume that I'd be seeing other things break besides
> call_usermode_helper() related ones.

Of course I do not know what happens, but at least this obviously
explains why UMH_WAIT_EXEC hangs; I think call_usermodehelper_exec()
itself is innocent.

Oleg.



* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-01  3:21         ` Tetsuo Handa
@ 2012-04-01 17:33           ` Sasha Levin
  2012-04-05 14:29             ` Tetsuo Handa
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-04-01 17:33 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: oleg, eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber,
	netdev, linux-kernel, davej

On Sun, Apr 1, 2012 at 5:21 AM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> Sasha Levin wrote:
>> While it seems that 9p is the culprit, I have to point out that this
>> bug is easily reproducible, and it happens each time due to a
>> call_usermode_helper() call. Other than that 9p behaves perfectly and
>> I'd assume that I'd be seeing other things break besides
>> call_usermode_helper() related ones.
>
> I think one of below two patches can catch the bug if this is a usermodehelper
> related bug. Please try.
>
> ----- Patch 1 -----
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 01394b6..3e63319 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -571,7 +571,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
>         * flag, for khelper thread is already waiting for the thread at
>         * wait_for_completion() in do_fork().
>         */
> -       if (wait != UMH_NO_WAIT && current == kmod_thread_locker) {
> +       if (WARN_ON(wait != UMH_NO_WAIT && current == kmod_thread_locker)) {
>                retval = -EBUSY;
>                goto out;
>        }
>
>
> ----- Patch 2 -----
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index 9efeae6..1350670 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -48,10 +48,10 @@ static inline int request_module_nowait(const char *name, ...) { return -ENOSYS;
>  struct cred;
>  struct file;
>
> -#define UMH_NO_WAIT    0       /* don't wait at all */
> -#define UMH_WAIT_EXEC  1       /* wait for the exec, but not the process */
> -#define UMH_WAIT_PROC  2       /* wait for the process to complete */
> -#define UMH_KILLABLE   4       /* wait for EXEC/PROC killable */
> +#define UMH_NO_WAIT    0x10    /* don't wait at all */
> +#define UMH_WAIT_EXEC  0x11    /* wait for the exec, but not the process */
> +#define UMH_WAIT_PROC  0x12    /* wait for the process to complete */
> +#define UMH_KILLABLE   0x04    /* wait for EXEC/PROC killable */
>
>  struct subprocess_info {
>        struct work_struct work;
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 957a7aa..ecfd3d5 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -483,6 +483,18 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
>        DECLARE_COMPLETION_ONSTACK(done);
>        int retval = 0;
>
> +       if (unlikely(wait == -1 || wait == 0 || wait == 1)) {
> +               WARN(1, "Requesting for usermode helper with hardcoded wait "
> +                    "flag. Change to use UMH_* symbols and recompile, or "
> +                    "this request will fail on Linux 3.4.\n");
> +               if (wait == -1)
> +                       wait = UMH_NO_WAIT;
> +               else if (wait == 0)
> +                       wait = UMH_WAIT_EXEC;
> +               else
> +                       wait = UMH_WAIT_PROC;
> +       }
> +
>        helper_lock();
>        if (sub_info->path[0] == '\0')
>                goto out;

Neither of these patches did the trick (no warnings showed up, and I'm
still seeing the hangs). However, the following patch has fixed it:

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index 1a91efa..8ecc377 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -309,7 +309,7 @@ int kobject_uevent_env(struct kobject *kobj, enum
kobject_action action,
                        goto exit;

                retval = call_usermodehelper(argv[0], argv,
-                                            env->envp, UMH_WAIT_EXEC);
+                                            env->envp, UMH_NO_WAIT);
        }

 exit:

Not sure if that info helps any, but just in case.


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-01 17:33           ` Sasha Levin
@ 2012-04-05 14:29             ` Tetsuo Handa
  2012-04-05 14:34               ` Tetsuo Handa
  2012-04-06 11:44               ` Tetsuo Handa
  0 siblings, 2 replies; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-05 14:29 UTC (permalink / raw)
  To: levinsasha928, garlick, ericvh
  Cc: oleg, eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber,
	netdev, linux-kernel, davej

> Tetsuo Handa wrote:
> > Maybe you can get more useful information with below untested printk() patch.
> > 
> > diff --git a/net/9p/client.c b/net/9p/client.c
> > index b23a17c..2dd447a 100644
> > --- a/net/9p/client.c
> > +++ b/net/9p/client.c
> > @@ -734,7 +734,9 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
> >  	} else
> >  		sigpending = 0;
> >  
> > +	printk("%u:Calling %pS\n", current->pid, c->trans_mod->request);
> >  	err = c->trans_mod->request(c, req);
> > +	printk("%u:%pS = %d\n", current->pid, c->trans_mod->request, err);
> >  	if (err < 0) {
> >  		if (err != -ERESTARTSYS && err != -EFAULT)
> >  			c->status = Disconnected;
> > @@ -742,8 +744,10 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
> >  	}
> >  again:
> >  	/* Wait for the response */
> > +	printk("%u:req->status = %u\n", current->pid, req->status);
> >  	err = wait_event_interruptible(*req->wq,
> >  				       req->status >= REQ_STATUS_RCVD);
> > +	printk("%u:wait = %d\n", current->pid, err);
> >  
> >  	if ((err == -ERESTARTSYS) && (c->status == Connected)
> >  				  && (type == P9_TFLUSH)) {
> > 

Sasha Levin wrote:
> Heya,
> 
> The output from the printk confirmed that there are several threads
> waiting for RPC to complete, with the last two having an odd 'wait'
> result. This is just before the hang:
> 
> [  809.165663] 19964:Calling p9_virtio_request+0x0/0x200
> [  809.166951] 19964:p9_virtio_request+0x0/0x200 = 0
> [  809.167878] 19964:req->status = 3
> [  809.803535] 19957:Calling p9_virtio_request+0x0/0x200
> [  809.804506] 19957:p9_virtio_request+0x0/0x200 = 0
> [  809.805332] 19957:req->status = 3
> [  809.868591] 19955:Calling p9_virtio_request+0x0/0x200
> [  809.869493] 19955:p9_virtio_request+0x0/0x200 = 0
> [  809.870331] 19955:req->status = 3
> [  811.364554] 19985:Calling p9_virtio_request+0x0/0x200
> [  811.365498] 19985:p9_virtio_request+0x0/0x200 = 0
> [  811.366386] 19985:req->status = 3
> [  811.458600] 19999:wait = -512
> [  811.459171] 19999:Calling p9_virtio_request+0x0/0x200
> [  811.459992] 19999:p9_virtio_request+0x0/0x200 = 0
> [  811.460822] 19999:req->status = 3
> [  811.472175] 19994:wait = -512
> [  811.472943] 19994:Calling p9_virtio_request+0x0/0x200
> [  811.474195] 19994:p9_virtio_request+0x0/0x200 = 0
> [  811.474955] 19994:req->status = 3
> [... Hang 120 sec later here]
> 

Good. -512 is -ERESTARTSYS, and this hang occurs after -ERESTARTSYS is
returned. It indicates that c->trans_mod->request() is interrupted by a signal.
Since c->trans_mod->request is pointing at p9_virtio_request, the location
returning that error would be

254 static int
255 p9_virtio_request(struct p9_client *client, struct p9_req_t *req)
256 {
257         int err;
258         int in, out;
259         unsigned long flags;
260         struct virtio_chan *chan = client->trans;
261 
262         p9_debug(P9_DEBUG_TRANS, "9p debug: virtio request\n");
263 
264         req->status = REQ_STATUS_SENT;
265 req_retry:
266         spin_lock_irqsave(&chan->lock, flags);
267 
268         /* Handle out VirtIO ring buffers */
269         out = pack_sg_list(chan->sg, 0,
270                            VIRTQUEUE_NUM, req->tc->sdata, req->tc->size);
271 
272         in = pack_sg_list(chan->sg, out,
273                           VIRTQUEUE_NUM, req->rc->sdata, req->rc->capacity);
274 
275         err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc,
276                                 GFP_ATOMIC);
277         if (err < 0) {
278                 if (err == -ENOSPC) {
279                         chan->ring_bufs_avail = 0;
280                         spin_unlock_irqrestore(&chan->lock, flags);
281                         err = wait_event_interruptible(*chan->vc_wq,
282                                                         chan->ring_bufs_avail);

   here.

283                         if (err  == -ERESTARTSYS)
284                                 return err;
285 
286                         p9_debug(P9_DEBUG_TRANS, "Retry virtio request\n");
287                         goto req_retry;
288                 } else {
289                         spin_unlock_irqrestore(&chan->lock, flags);
290                         p9_debug(P9_DEBUG_TRANS,
291                                  "virtio rpc add_buf returned failure\n");
292                         return -EIO;
293                 }
294         }
295         virtqueue_kick(chan->vq);
296         spin_unlock_irqrestore(&chan->lock, flags);
297 
298         p9_debug(P9_DEBUG_TRANS, "virtio request kicked\n");
299         return 0;
300 }

Comparing 3.3.1 and linux-next in my environment, there are several changes.

# diff -ur linux-3.3.1/drivers/virtio/ linux-next/drivers/virtio/ | diffstat
 config.c         |    1
 virtio_balloon.c |   14 ----------
 virtio_pci.c     |   74 +++++--------------------------------------------------
 3 files changed, 8 insertions(+), 81 deletions(-)
# diff -urp linux-3.3.1/fs/9p/ linux-next/fs/9p/ | diffstat
 v9fs.c      |   16 ++++++++--------
 vfs_super.c |    5 ++---
 2 files changed, 10 insertions(+), 11 deletions(-)
# diff -ur linux-3.3.1/net/9p/ linux-next/net/9p/ | diffstat
 client.c |   26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

The most suspicious change is in net/9p/client.c because it changes the
handling of the ERESTARTSYS case.

--- linux-3.3.1/net/9p/client.c
+++ linux-next/net/9p/client.c
@@ -740,10 +740,18 @@
                        c->status = Disconnected;
                goto reterr;
        }
+again:
        /* Wait for the response */
        err = wait_event_interruptible(*req->wq,
                                       req->status >= REQ_STATUS_RCVD);

+       if ((err == -ERESTARTSYS) && (c->status == Connected)
+                                 && (type == P9_TFLUSH)) {
+               sigpending = 1;
+               clear_thread_flag(TIF_SIGPENDING);
+               goto again;
+       }
+
        if (req->status == REQ_STATUS_ERROR) {
                p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
                err = req->t_err;
@@ -1420,6 +1428,7 @@
        int err;
        struct p9_client *clnt;
        struct p9_req_t *req;
+       int retries = 0;

        if (!fid) {
                pr_warn("%s (%d): Trying to clunk with NULL fid\n",
@@ -1428,7 +1437,9 @@
                return 0;
        }

-       p9_debug(P9_DEBUG_9P, ">>> TCLUNK fid %d\n", fid->fid);
+again:
+       p9_debug(P9_DEBUG_9P, ">>> TCLUNK fid %d (try %d)\n", fid->fid,
+                                                               retries);
        err = 0;
        clnt = fid->clnt;

@@ -1444,8 +1455,14 @@
 error:
        /*
         * Fid is not valid even after a failed clunk
+        * If interrupted, retry once then give up and
+        * leak fid until umount.
         */
-       p9_fid_destroy(fid);
+       if (err == -ERESTARTSYS) {
+               if (retries++ == 0)
+                       goto again;
+       } else
+               p9_fid_destroy(fid);
        return err;
 }
 EXPORT_SYMBOL(p9_client_clunk);
@@ -1470,7 +1487,10 @@

        p9_free_req(clnt, req);
 error:
-       p9_fid_destroy(fid);
+       if (err == -ERESTARTSYS)
+               p9_client_clunk(fid);
+       else
+               p9_fid_destroy(fid);
        return err;
 }
 EXPORT_SYMBOL(p9_client_remove);

Maybe commit a314f274 "net/9p: don't allow Tflush to be interrupted" or nearby.



By the way, have you already tried 3.4-rc1?
In my environment, there is no difference between linux-next and 3.4-rc1.

# diff -ur linux-3.4.0-rc1/net/9p/ linux-next/net/9p/
# diff -ur linux-3.4.0-rc1/drivers/virtio/ linux-next/drivers/virtio/
# diff -ur linux-3.4.0-rc1/fs/9p/ linux-next/fs/9p/


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-05 14:29             ` Tetsuo Handa
@ 2012-04-05 14:34               ` Tetsuo Handa
  2012-04-06 11:44               ` Tetsuo Handa
  1 sibling, 0 replies; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-05 14:34 UTC (permalink / raw)
  To: levinsasha928, garlick, ericvh
  Cc: oleg, eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber,
	netdev, linux-kernel, davej, penguin-kernel

Tetsuo Handa wrote:
> Good. -512 is -ERESTARTSYS, and this hang occurs after -ERESTARTSYS is
> returned. It indicates that c->trans_mod->request() is interrupted by signal.
> Since c->trans_mod->request is pointing at p9_virtio_request, the location
> returning that error would be
(...snipped...)
> 281                         err = wait_event_interruptible(*chan->vc_wq,
> 282                                                         chan->ring_bufs_avail);
> 
>    here.

Oops. Not p9_virtio_request().

It is p9_client_rpc(). I misread the output lines.

> > > +	printk("%u:req->status = %u\n", current->pid, req->status);
> > >  	err = wait_event_interruptible(*req->wq,
> > >  				       req->status >= REQ_STATUS_RCVD);
> > > +	printk("%u:wait = %d\n", current->pid, err);

But anyway, I think this is an interrupt-related bug.


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-05 14:29             ` Tetsuo Handa
  2012-04-05 14:34               ` Tetsuo Handa
@ 2012-04-06 11:44               ` Tetsuo Handa
  2012-04-06 18:09                 ` Jim Garlick
  1 sibling, 1 reply; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-06 11:44 UTC (permalink / raw)
  To: levinsasha928, garlick, ericvh
  Cc: oleg, eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber,
	netdev, linux-kernel, davej

Tetsuo Handa wrote:
> Most suspicious change is net/9p/client.c because it is changing handling of
> ERESTARTSYS case.
> 
> --- linux-3.3.1/net/9p/client.c
> +++ linux-next/net/9p/client.c
> @@ -740,10 +740,18 @@
>                         c->status = Disconnected;
>                 goto reterr;
>         }
> +again:
>         /* Wait for the response */
>         err = wait_event_interruptible(*req->wq,
>                                        req->status >= REQ_STATUS_RCVD);
> 
> +       if ((err == -ERESTARTSYS) && (c->status == Connected)
> +                                 && (type == P9_TFLUSH)) {
> +               sigpending = 1;
> +               clear_thread_flag(TIF_SIGPENDING);
> +               goto again;
> +       }
> +

I think this loop is bad with regard to responsiveness to SIGKILL.
If wait_event_interruptible() was interrupted by SIGKILL, it will
spin until req->status >= REQ_STATUS_RCVD becomes true.
Rather,

	if ((c->status == Connected) && (type == P9_TFLUSH))
		err = wait_event_killable(*req->wq,
					  req->status >= REQ_STATUS_RCVD);
	else
		err = wait_event_interruptible(*req->wq,
					       req->status >= REQ_STATUS_RCVD);

would be safer.



>  error:
>         /*
>          * Fid is not valid even after a failed clunk
> +        * If interrupted, retry once then give up and
> +        * leak fid until umount.
>          */
> -       p9_fid_destroy(fid);
> +       if (err == -ERESTARTSYS) {
> +               if (retries++ == 0)
> +                       goto again;

I think it is possible that the process is interrupted again upon retrying.
I suspect the handling of the err == -ERESTARTSYS case when retries != 0.
It returns without calling p9_fid_destroy(), which will be
unexpected behaviour for its various callers.

> +       } else
> +               p9_fid_destroy(fid);
>         return err;
>  }
>  EXPORT_SYMBOL(p9_client_clunk);


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-06 11:44               ` Tetsuo Handa
@ 2012-04-06 18:09                 ` Jim Garlick
  2012-04-07  0:06                   ` Tetsuo Handa
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Garlick @ 2012-04-06 18:09 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: levinsasha928, ericvh, oleg, eric.dumazet, davem, kuznet,
	jmorris, yoshfuji, kaber, netdev, linux-kernel, davej

Hi Tetsuo,

I am sorry if my patch is causing you grief!

On Fri, Apr 06, 2012 at 04:44:37AM -0700, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Most suspicious change is net/9p/client.c because it is changing handling of
> > ERESTARTSYS case.
> > 
> > --- linux-3.3.1/net/9p/client.c
> > +++ linux-next/net/9p/client.c
> > @@ -740,10 +740,18 @@
> >                         c->status = Disconnected;
> >                 goto reterr;
> >         }
> > +again:
> >         /* Wait for the response */
> >         err = wait_event_interruptible(*req->wq,
> >                                        req->status >= REQ_STATUS_RCVD);
> > 
> > +       if ((err == -ERESTARTSYS) && (c->status == Connected)
> > +                                 && (type == P9_TFLUSH)) {
> > +               sigpending = 1;
> > +               clear_thread_flag(TIF_SIGPENDING);
> > +               goto again;
> > +       }
> > +
> 
> I think this loop is bad with regard to response to SIGKILL.
> If wait_event_interruptible() was interrupted by SIGKILL, it will
> spin until req->status >= REQ_STATUS_RCVD becomes true.
> Rather,
> 
> 	if ((c->status == Connected) && (type == P9_TFLUSH))
> 		err = wait_event_killable(*req->wq,
> 					  req->status >= REQ_STATUS_RCVD);
> 	else
> 		err = wait_event_interruptible(*req->wq,
> 					       req->status >= REQ_STATUS_RCVD);
> 
> would be safer.

Does that work?  What prevents p9_client_rpc() from recursing via
p9_client_flush() on receipt of SIGKILL? 


> >  error:
> >         /*
> >          * Fid is not valid even after a failed clunk
> > +        * If interrupted, retry once then give up and
> > +        * leak fid until umount.
> >          */
> > -       p9_fid_destroy(fid);
> > +       if (err == -ERESTARTSYS) {
> > +               if (retries++ == 0)
> > +                       goto again;
> 
> I think it is possible that the process is interrupted again upon retrying.
> I suspect the handling of err == -ERESTARTSYS case when retries != 0.
> It is returning without calling p9_fid_destroy(), which will be
> unexpected behaviour for the various callers.

Yes but in the unlikely event that this happens, the effect is a small
memory leak for the duration of the mount.  On the other hand if the
fid is destroyed without successfully informing the server, then
subsequent operations that involve new file references will fail
when that fid number is reused, and the mount becomes unusable.

> > +       } else
> > +               p9_fid_destroy(fid);
> >         return err;
> >  }
> >  EXPORT_SYMBOL(p9_client_clunk);

Regards,

Jim


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-06 18:09                 ` Jim Garlick
@ 2012-04-07  0:06                   ` Tetsuo Handa
  2012-04-11 12:20                     ` Sasha Levin
  0 siblings, 1 reply; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-07  0:06 UTC (permalink / raw)
  To: garlick
  Cc: levinsasha928, ericvh, oleg, eric.dumazet, davem, kuznet,
	jmorris, yoshfuji, kaber, netdev, linux-kernel, davej

Jim Garlick wrote:
> > I think this loop is bad with regard to response to SIGKILL.
> > If wait_event_interruptible() was interrupted by SIGKILL, it will
> > spin until req->status >= REQ_STATUS_RCVD becomes true.
> > Rather,
> > 
> > 	if ((c->status == Connected) && (type == P9_TFLUSH))
> > 		err = wait_event_killable(*req->wq,
> > 					  req->status >= REQ_STATUS_RCVD);
> > 	else
> > 		err = wait_event_interruptible(*req->wq,
> > 					       req->status >= REQ_STATUS_RCVD);
> > 
> > would be safer.
> 
> Does that work?  What prevents p9_client_rpc() from recursing via
> p9_client_flush() on receipt of SIGKILL? 

Sorry, I'm not a 9p user and I can't test whether that works or not.
But at least, continuing the loop even after SIGKILL is not good.
If you have to wait for req->status >= REQ_STATUS_RCVD to become true, can you
use a kernel thread that waits for req->status >= REQ_STATUS_RCVD to become
true, and delegate the job of notifying the server from the userspace task to
that kernel thread?

> Yes but in the unlikely event that this happens, the effect is a small
> memory leak for the duration of the mount.  On the other hand if the
> fid is destroyed without successfully informing the server, then
> subsequent operations that involve new file references will fail
> when that fid number is reused, and the mount becomes unusable.

I don't know whether Sasha's problem is caused by this patch or not.
But p9_client_clunk() is called from many functions in the fs/9p/ directory.
They assume that p9_client_clunk() will call p9_fid_destroy(), but this
patch breaks that assumption. I think this is the cause of the hang Sasha
is experiencing, because Sasha's trace shows that call_usermodehelper() is
blocked by functions in the fs/9p/ directory. It seems to be an
inconsistent-state problem.


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-07  0:06                   ` Tetsuo Handa
@ 2012-04-11 12:20                     ` Sasha Levin
  0 siblings, 0 replies; 15+ messages in thread
From: Sasha Levin @ 2012-04-11 12:20 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: garlick, ericvh, oleg, eric.dumazet, davem, kuznet, jmorris,
	yoshfuji, kaber, netdev, linux-kernel, davej

>> Yes but in the unlikely event that this happens, the effect is a small
>> memory leak for the duration of the mount.  On the other hand if the
>> fid is destroyed without successfully informing the server, then
>> subsequent operations that involve new file references will fail
>> when that fid number is reused, and the mount becomes unusable.
>
> I don't know whether Sasha's problem is caused by this patch or not.
> But p9_client_clunk() is called from many functions in fs/9p/ directory.
> They are assuming that p9_client_clunk() will call p9_fid_destroy() but
> this patch is breaking that assumption. I think this is the cause of hang which
> Sasha is experiencing because Sasha's trace shows that call_usermodehelper() is
> blocked by functions in fs/9p/ directory. Seems inconsistency state problem.

I'd be happy to try out any other patches or help debugging this issue.

Which behavior did this patch fix, exactly? Can I just revert it and
try running without it?


end of thread, other threads: ~2012-04-11 12:21 UTC (newest)

Thread overview: 15+ messages
2012-03-31 17:51 ipv6: tunnel: hang when destroying ipv6 tunnel Sasha Levin
2012-03-31 20:59 ` Eric Dumazet
2012-03-31 21:34   ` Oleg Nesterov
2012-03-31 21:43     ` Sasha Levin
2012-03-31 23:26       ` Sasha Levin
2012-04-01  3:21         ` Tetsuo Handa
2012-04-01 17:33           ` Sasha Levin
2012-04-05 14:29             ` Tetsuo Handa
2012-04-05 14:34               ` Tetsuo Handa
2012-04-06 11:44               ` Tetsuo Handa
2012-04-06 18:09                 ` Jim Garlick
2012-04-07  0:06                   ` Tetsuo Handa
2012-04-11 12:20                     ` Sasha Levin
2012-04-01  5:07         ` Eric Dumazet
2012-04-01 16:38         ` Oleg Nesterov
