* ipv6: tunnel: hang when destroying ipv6 tunnel
@ 2012-03-31 17:51 Sasha Levin
  2012-03-31 20:59 ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-03-31 17:51 UTC (permalink / raw)
  To: davem, kuznet, jmorris, yoshfuji, Patrick McHardy
  Cc: netdev, linux-kernel@vger.kernel.org List, Dave Jones

Hi all,

It appears that a hang may occur when destroying an ipv6 tunnel, which
I've reproduced several times in a KVM vm.

The pattern in the stack dump below is consistent with unregistering a
kobject while holding multiple locks. Unregistering a kobject usually
leads to a call out to userspace via call_usermodehelper_exec().
The userspace code may access sysfs files, which in turn requires
taking locks within the kernel, leading to a deadlock since those
locks are already held by the kernel.

[ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
[ 1561.566945] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1561.570062] kworker/u:2     D ffff88006ee63000  4504  3140      2 0x00000000
[ 1561.572968]  ffff88006ed9f7e0 0000000000000082 ffff88006ed9f790
ffffffff8107d346
[ 1561.575680]  ffff88006ed9ffd8 00000000001d4580 ffff88006ed9e010
00000000001d4580
[ 1561.578601]  00000000001d4580 00000000001d4580 ffff88006ed9ffd8
00000000001d4580
[ 1561.581697] Call Trace:
[ 1561.582650]  [<ffffffff8107d346>] ? kvm_clock_read+0x46/0x80
[ 1561.584543]  [<ffffffff827063d4>] schedule+0x24/0x70
[ 1561.586231]  [<ffffffff82704025>] schedule_timeout+0x245/0x2c0
[ 1561.588508]  [<ffffffff81117c9a>] ? mark_held_locks+0x7a/0x120
[ 1561.590858]  [<ffffffff81119bbd>] ? __lock_release+0x8d/0x1d0
[ 1561.593162]  [<ffffffff82707e6b>] ? _raw_spin_unlock_irq+0x2b/0x70
[ 1561.595394]  [<ffffffff810e36d1>] ? get_parent_ip+0x11/0x50
[ 1561.597403]  [<ffffffff82705919>] wait_for_common+0x119/0x190
[ 1561.599707]  [<ffffffff810ed1b0>] ? try_to_wake_up+0x2c0/0x2c0
[ 1561.601758]  [<ffffffff82705a38>] wait_for_completion+0x18/0x20
[ 1561.603843]  [<ffffffff810cdcd8>] call_usermodehelper_exec+0x228/0x240
[ 1561.606059]  [<ffffffff82705844>] ? wait_for_common+0x44/0x190
[ 1561.608352]  [<ffffffff81878445>] kobject_uevent_env+0x615/0x650
[ 1561.610908]  [<ffffffff810e36d1>] ? get_parent_ip+0x11/0x50
[ 1561.613146]  [<ffffffff8187848b>] kobject_uevent+0xb/0x10
[ 1561.615312]  [<ffffffff81876f5a>] kobject_cleanup+0xca/0x1b0
[ 1561.617509]  [<ffffffff8187704d>] kobject_release+0xd/0x10
[ 1561.619334]  [<ffffffff81876d9c>] kobject_put+0x2c/0x60
[ 1561.621117]  [<ffffffff8226ea80>] net_rx_queue_update_kobjects+0xa0/0xf0
[ 1561.623421]  [<ffffffff8226ec87>] netdev_unregister_kobject+0x37/0x70
[ 1561.625979]  [<ffffffff82253e26>] rollback_registered_many+0x186/0x260
[ 1561.628526]  [<ffffffff82253f14>] unregister_netdevice_many+0x14/0x60
[ 1561.631064]  [<ffffffff8243922e>] ip6_tnl_destroy_tunnels+0xee/0x160
[ 1561.633549]  [<ffffffff8243b8f3>] ip6_tnl_exit_net+0xd3/0x1c0
[ 1561.635843]  [<ffffffff8243b820>] ? ip6_tnl_ioctl+0x550/0x550
[ 1561.637972]  [<ffffffff81259c86>] ? proc_net_remove+0x16/0x20
[ 1561.639881]  [<ffffffff8224f119>] ops_exit_list+0x39/0x60
[ 1561.641666]  [<ffffffff8224f72b>] cleanup_net+0xfb/0x1a0
[ 1561.643528]  [<ffffffff810ce97d>] process_one_work+0x1cd/0x460
[ 1561.645828]  [<ffffffff810ce91c>] ? process_one_work+0x16c/0x460
[ 1561.648180]  [<ffffffff8224f630>] ? net_drop_ns+0x40/0x40
[ 1561.650285]  [<ffffffff810d1e76>] worker_thread+0x176/0x3b0
[ 1561.652460]  [<ffffffff810d1d00>] ? manage_workers+0x120/0x120
[ 1561.654734]  [<ffffffff810d727e>] kthread+0xbe/0xd0
[ 1561.656656]  [<ffffffff8270a134>] kernel_thread_helper+0x4/0x10
[ 1561.658881]  [<ffffffff810e3fe0>] ? finish_task_switch+0x80/0x110
[ 1561.660828]  [<ffffffff82708434>] ? retint_restore_args+0x13/0x13
[ 1561.662795]  [<ffffffff810d71c0>] ? __init_kthread_worker+0x70/0x70
[ 1561.664932]  [<ffffffff8270a130>] ? gs_change+0x13/0x13
[ 1561.667001] 4 locks held by kworker/u:2/3140:
[ 1561.667599]  #0:  (netns){.+.+.+}, at: [<ffffffff810ce91c>]
process_one_work+0x16c/0x460
[ 1561.668758]  #1:  (net_cleanup_work){+.+.+.}, at:
[<ffffffff810ce91c>] process_one_work+0x16c/0x460
[ 1561.670002]  #2:  (net_mutex){+.+.+.}, at: [<ffffffff8224f6b0>]
cleanup_net+0x80/0x1a0
[ 1561.671700]  #3:  (rtnl_mutex){+.+.+.}, at: [<ffffffff82267f02>]
rtnl_lock+0x12/0x20

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 17:51 ipv6: tunnel: hang when destroying ipv6 tunnel Sasha Levin
@ 2012-03-31 20:59 ` Eric Dumazet
  2012-03-31 21:34   ` Oleg Nesterov
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2012-03-31 20:59 UTC (permalink / raw)
  To: Sasha Levin
  Cc: davem, kuznet, jmorris, yoshfuji, Patrick McHardy, netdev,
	linux-kernel@vger.kernel.org List, Dave Jones, Oleg Nesterov

On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
> Hi all,
> 
> It appears that a hang may occur when destroying an ipv6 tunnel, which
> I've reproduced several times in a KVM vm.
> 
> The pattern in the stack dump below is consistent with unregistering a
> kobject when holding multiple locks. Unregistering a kobject usually
> leads to an exit back to userspace with call_usermodehelper_exec().

Yes, but this userspace call is done asynchronously and we don't have
to wait until it's done.

> The userspace code may access sysfs files which in turn will require
> locking within the kernel, leading to a deadlock since those locks are
> already held by kernel.


> 
> [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
> [ 1561.566945] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 1561.570062] kworker/u:2     D ffff88006ee63000  4504  3140      2 0x00000000
> [ 1561.572968]  ffff88006ed9f7e0 0000000000000082 ffff88006ed9f790
> ffffffff8107d346
> [ 1561.575680]  ffff88006ed9ffd8 00000000001d4580 ffff88006ed9e010
> 00000000001d4580
> [ 1561.578601]  00000000001d4580 00000000001d4580 ffff88006ed9ffd8
> 00000000001d4580
> [ 1561.581697] Call Trace:
> [ 1561.582650]  [<ffffffff8107d346>] ? kvm_clock_read+0x46/0x80
> [ 1561.584543]  [<ffffffff827063d4>] schedule+0x24/0x70
> [ 1561.586231]  [<ffffffff82704025>] schedule_timeout+0x245/0x2c0
> [ 1561.588508]  [<ffffffff81117c9a>] ? mark_held_locks+0x7a/0x120
> [ 1561.590858]  [<ffffffff81119bbd>] ? __lock_release+0x8d/0x1d0
> [ 1561.593162]  [<ffffffff82707e6b>] ? _raw_spin_unlock_irq+0x2b/0x70
> [ 1561.595394]  [<ffffffff810e36d1>] ? get_parent_ip+0x11/0x50
> [ 1561.597403]  [<ffffffff82705919>] wait_for_common+0x119/0x190
> [ 1561.599707]  [<ffffffff810ed1b0>] ? try_to_wake_up+0x2c0/0x2c0
> [ 1561.601758]  [<ffffffff82705a38>] wait_for_completion+0x18/0x20

Something is wrong here: call_usermodehelper_exec(... UMH_WAIT_EXEC)
should not block forever. It's not like UMH_WAIT_PROC.

Cc Oleg Nesterov <oleg@redhat.com>

> [ 1561.603843]  [<ffffffff810cdcd8>] call_usermodehelper_exec+0x228/0x240
> [ 1561.606059]  [<ffffffff82705844>] ? wait_for_common+0x44/0x190
> [ 1561.608352]  [<ffffffff81878445>] kobject_uevent_env+0x615/0x650
> [ 1561.610908]  [<ffffffff810e36d1>] ? get_parent_ip+0x11/0x50
> [ 1561.613146]  [<ffffffff8187848b>] kobject_uevent+0xb/0x10
> [ 1561.615312]  [<ffffffff81876f5a>] kobject_cleanup+0xca/0x1b0
> [ 1561.617509]  [<ffffffff8187704d>] kobject_release+0xd/0x10
> [ 1561.619334]  [<ffffffff81876d9c>] kobject_put+0x2c/0x60
> [ 1561.621117]  [<ffffffff8226ea80>] net_rx_queue_update_kobjects+0xa0/0xf0
> [ 1561.623421]  [<ffffffff8226ec87>] netdev_unregister_kobject+0x37/0x70
> [ 1561.625979]  [<ffffffff82253e26>] rollback_registered_many+0x186/0x260
> [ 1561.628526]  [<ffffffff82253f14>] unregister_netdevice_many+0x14/0x60
> [ 1561.631064]  [<ffffffff8243922e>] ip6_tnl_destroy_tunnels+0xee/0x160
> [ 1561.633549]  [<ffffffff8243b8f3>] ip6_tnl_exit_net+0xd3/0x1c0
> [ 1561.635843]  [<ffffffff8243b820>] ? ip6_tnl_ioctl+0x550/0x550
> [ 1561.637972]  [<ffffffff81259c86>] ? proc_net_remove+0x16/0x20
> [ 1561.639881]  [<ffffffff8224f119>] ops_exit_list+0x39/0x60
> [ 1561.641666]  [<ffffffff8224f72b>] cleanup_net+0xfb/0x1a0
> [ 1561.643528]  [<ffffffff810ce97d>] process_one_work+0x1cd/0x460
> [ 1561.645828]  [<ffffffff810ce91c>] ? process_one_work+0x16c/0x460
> [ 1561.648180]  [<ffffffff8224f630>] ? net_drop_ns+0x40/0x40
> [ 1561.650285]  [<ffffffff810d1e76>] worker_thread+0x176/0x3b0
> [ 1561.652460]  [<ffffffff810d1d00>] ? manage_workers+0x120/0x120
> [ 1561.654734]  [<ffffffff810d727e>] kthread+0xbe/0xd0
> [ 1561.656656]  [<ffffffff8270a134>] kernel_thread_helper+0x4/0x10
> [ 1561.658881]  [<ffffffff810e3fe0>] ? finish_task_switch+0x80/0x110
> [ 1561.660828]  [<ffffffff82708434>] ? retint_restore_args+0x13/0x13
> [ 1561.662795]  [<ffffffff810d71c0>] ? __init_kthread_worker+0x70/0x70
> [ 1561.664932]  [<ffffffff8270a130>] ? gs_change+0x13/0x13
> [ 1561.667001] 4 locks held by kworker/u:2/3140:
> [ 1561.667599]  #0:  (netns){.+.+.+}, at: [<ffffffff810ce91c>]
> process_one_work+0x16c/0x460
> [ 1561.668758]  #1:  (net_cleanup_work){+.+.+.}, at:
> [<ffffffff810ce91c>] process_one_work+0x16c/0x460
> [ 1561.670002]  #2:  (net_mutex){+.+.+.}, at: [<ffffffff8224f6b0>]
> cleanup_net+0x80/0x1a0
> [ 1561.671700]  #3:  (rtnl_mutex){+.+.+.}, at: [<ffffffff82267f02>]
> rtnl_lock+0x12/0x20
> --



* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 20:59 ` Eric Dumazet
@ 2012-03-31 21:34   ` Oleg Nesterov
  2012-03-31 21:43     ` Sasha Levin
  0 siblings, 1 reply; 15+ messages in thread
From: Oleg Nesterov @ 2012-03-31 21:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sasha Levin, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On 03/31, Eric Dumazet wrote:
>
> On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
> > Hi all,
> >
> > It appears that a hang may occur when destroying an ipv6 tunnel, which
> > I've reproduced several times in a KVM vm.

kernel version?

> > [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.

And nobody else?

It would be nice to know what sysrq-t says; in particular, the trace
of the khelper thread is interesting.

> Something is wrong here, call_usermodehelper_exec ( ... UMH_WAIT_EXEC)
> should not block forever.

Yes, unless it triggers another request_module()...

Tetsuo, could you please take a look? Unlikely, but maybe this
is fixed by your kmod-avoid-deadlock-by-recursive-kmod-call.patch
in -mm?

Oleg.



* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 21:34   ` Oleg Nesterov
@ 2012-03-31 21:43     ` Sasha Levin
  2012-03-31 23:26       ` Sasha Levin
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-03-31 21:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Eric Dumazet, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On Sat, Mar 31, 2012 at 11:34 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/31, Eric Dumazet wrote:
>>
>> On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
>> > Hi all,
>> >
>> > It appears that a hang may occur when destroying an ipv6 tunnel, which
>> > I've reproduced several times in a KVM vm.
>
> kernel version?

latest linux-next

>> > [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
>
> And nobody else?

Some more messages follow a bit later, from tasks stuck in VFS-related code.

> It would be nice to know what sysrq-t says, in particular the trace
> of khelper thread is interesting.

Sure, I'll get one when it happens again.

>> Something is wrong here, call_usermodehelper_exec ( ... UMH_WAIT_EXEC)
>> should not block forever.
>
> Yes, unless it triggers another request_module()...
>
> Tetsuo, could you please take a look? Unlikely, but may be this
> is fixed by your kmod-avoid-deadlock-by-recursive-kmod-call.patch
> in -mm ?
>
> Oleg.
>


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 21:43     ` Sasha Levin
@ 2012-03-31 23:26       ` Sasha Levin
  2012-04-01  3:21         ` Tetsuo Handa
                           ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Sasha Levin @ 2012-03-31 23:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Eric Dumazet, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On Sat, Mar 31, 2012 at 11:43 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> On Sat, Mar 31, 2012 at 11:34 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> On 03/31, Eric Dumazet wrote:
>>>
>>> On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
>>> > Hi all,
>>> >
>>> > It appears that a hang may occur when destroying an ipv6 tunnel, which
>>> > I've reproduced several times in a KVM vm.
>>
>> kernel version?
>
> latest linux-next
>
>>> > [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
>>
>> And nobody else?
>
> Some more messages follow a bit later which get stuck in vfs related code.
>
>> It would be nice to know what sysrq-t says, in particular the trace
>> of khelper thread is interesting.
>
> Sure, I'll get one when it happens again.

So here's the stack of the usermode thread:

[  336.614015] kworker/u:2     S ffff880062c13000  5176  4539   3031 0x00000000
[  336.614015]  ffff880062fb38d0 0000000000000082 ffff880062fb3860
0000000000000001
[  336.614015]  ffff880062fb3fd8 00000000001d4580 ffff880062fb2010
00000000001d4580
[  336.614015]  00000000001d4580 00000000001d4580 ffff880062fb3fd8
00000000001d4580
[  336.614015] Call Trace:
[  336.614015]  [<ffffffff826a8e54>] schedule+0x24/0x70
[  336.614015]  [<ffffffff825fd66d>] p9_client_rpc+0x13d/0x360
[  336.614015]  [<ffffffff810d7850>] ? wake_up_bit+0x40/0x40
[  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
[  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
[  336.614015]  [<ffffffff825ff5ff>] p9_client_walk+0x8f/0x220
[  336.614015]  [<ffffffff815a8e3b>] v9fs_vfs_lookup+0xab/0x1c0
[  336.614015]  [<ffffffff811ee0c0>] d_alloc_and_lookup+0x40/0x80
[  336.614015]  [<ffffffff811fdea0>] ? d_lookup+0x30/0x50
[  336.614015]  [<ffffffff811f0aea>] do_lookup+0x28a/0x3b0
[  336.614015]  [<ffffffff817c9117>] ? security_inode_permission+0x17/0x20
[  336.614015]  [<ffffffff811f1c07>] link_path_walk+0x167/0x420
[  336.614015]  [<ffffffff811ee630>] ? generic_readlink+0xb0/0xb0
[  336.614015]  [<ffffffff81896d88>] ? __raw_spin_lock_init+0x38/0x70
[  336.614015]  [<ffffffff811f24da>] path_openat+0xba/0x500
[  336.614015]  [<ffffffff81057253>] ? sched_clock+0x13/0x20
[  336.614015]  [<ffffffff810ed805>] ? sched_clock_local+0x25/0x90
[  336.614015]  [<ffffffff810ed940>] ? sched_clock_cpu+0xd0/0x120
[  336.614015]  [<ffffffff811f2a34>] do_filp_open+0x44/0xa0
[  336.614015]  [<ffffffff81119acd>] ? __lock_release+0x8d/0x1d0
[  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
[  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
[  336.614015]  [<ffffffff826aa7f0>] ? _raw_spin_unlock+0x30/0x60
[  336.614015]  [<ffffffff811ea74d>] open_exec+0x2d/0xf0
[  336.614015]  [<ffffffff811eb888>] do_execve_common+0x128/0x320
[  336.614015]  [<ffffffff811ebb05>] do_execve+0x35/0x40
[  336.614015]  [<ffffffff810589e5>] sys_execve+0x45/0x70
[  336.614015]  [<ffffffff826acc28>] kernel_execve+0x68/0xd0
[  336.614015]  [<ffffffff810cd6a6>] ? ____call_usermodehelper+0xf6/0x130
[  336.614015]  [<ffffffff810cd6f9>] call_helper+0x19/0x20
[  336.614015]  [<ffffffff826acbb4>] kernel_thread_helper+0x4/0x10
[  336.614015]  [<ffffffff810e3f80>] ? finish_task_switch+0x80/0x110
[  336.614015]  [<ffffffff826aaeb4>] ? retint_restore_args+0x13/0x13
[  336.614015]  [<ffffffff810cd6e0>] ? ____call_usermodehelper+0x130/0x130
[  336.614015]  [<ffffffff826acbb0>] ? gs_change+0x13/0x13

While it seems that 9p is the culprit, I have to point out that this
bug is easily reproducible, and it happens each time due to a
call_usermodehelper() call. Other than that, 9p behaves perfectly, and
if 9p itself were broken I'd expect to see other things break besides
the call_usermodehelper()-related ones.


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 23:26       ` Sasha Levin
@ 2012-04-01  3:21         ` Tetsuo Handa
  2012-04-01 17:33           ` Sasha Levin
  2012-04-01  5:07         ` Eric Dumazet
  2012-04-01 16:38         ` Oleg Nesterov
  2 siblings, 1 reply; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-01  3:21 UTC (permalink / raw)
  To: levinsasha928, oleg
  Cc: eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber, netdev,
	linux-kernel, davej

Sasha Levin wrote:
> While it seems that 9p is the culprit, I have to point out that this
> bug is easily reproducible, and it happens each time due to a
> call_usermode_helper() call. Other than that 9p behaves perfectly and
> I'd assume that I'd be seeing other things break besides
> call_usermode_helper() related ones.

I think one of the two patches below can catch the bug, if this is a
usermodehelper-related bug. Please try.

----- Patch 1 -----
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 01394b6..3e63319 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -571,7 +571,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 	 * flag, for khelper thread is already waiting for the thread at
 	 * wait_for_completion() in do_fork().
 	 */
-	if (wait != UMH_NO_WAIT && current == kmod_thread_locker) {
+	if (WARN_ON(wait != UMH_NO_WAIT && current == kmod_thread_locker)) {
 		retval = -EBUSY;
 		goto out;
 	}


----- Patch 2 -----
diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 9efeae6..1350670 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -48,10 +48,10 @@ static inline int request_module_nowait(const char *name, ...) { return -ENOSYS;
 struct cred;
 struct file;
 
-#define UMH_NO_WAIT	0	/* don't wait at all */
-#define UMH_WAIT_EXEC	1	/* wait for the exec, but not the process */
-#define UMH_WAIT_PROC	2	/* wait for the process to complete */
-#define UMH_KILLABLE	4	/* wait for EXEC/PROC killable */
+#define UMH_NO_WAIT	0x10	/* don't wait at all */
+#define UMH_WAIT_EXEC	0x11	/* wait for the exec, but not the process */
+#define UMH_WAIT_PROC	0x12	/* wait for the process to complete */
+#define UMH_KILLABLE	0x04	/* wait for EXEC/PROC killable */
 
 struct subprocess_info {
 	struct work_struct work;
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 957a7aa..ecfd3d5 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -483,6 +483,18 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 	DECLARE_COMPLETION_ONSTACK(done);
 	int retval = 0;
 
+	if (unlikely(wait == -1 || wait == 0 || wait == 1)) {
+		WARN(1, "Requesting for usermode helper with hardcoded wait "
+		     "flag. Change to use UMH_* symbols and recompile, or "
+		     "this request will fail on Linux 3.4.\n");
+		if (wait == -1)
+			wait = UMH_NO_WAIT;
+		else if (wait == 0)
+			wait = UMH_WAIT_EXEC;
+		else
+			wait = UMH_WAIT_PROC;
+	}
+
 	helper_lock();
 	if (sub_info->path[0] == '\0')
 		goto out;


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 23:26       ` Sasha Levin
  2012-04-01  3:21         ` Tetsuo Handa
@ 2012-04-01  5:07         ` Eric Dumazet
  2012-04-01 16:38         ` Oleg Nesterov
  2 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2012-04-01  5:07 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Oleg Nesterov, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On Sun, 2012-04-01 at 01:26 +0200, Sasha Levin wrote:
> On Sat, Mar 31, 2012 at 11:43 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > On Sat, Mar 31, 2012 at 11:34 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> >> On 03/31, Eric Dumazet wrote:
> >>>
> >>> On Sat, 2012-03-31 at 19:51 +0200, Sasha Levin wrote:
> >>> > Hi all,
> >>> >
> >>> > It appears that a hang may occur when destroying an ipv6 tunnel, which
> >>> > I've reproduced several times in a KVM vm.
> >>
> >> kernel version?
> >
> > latest linux-next
> >
> >>> > [ 1561.564172] INFO: task kworker/u:2:3140 blocked for more than 120 seconds.
> >>
> >> And nobody else?
> >
> > Some more messages follow a bit later which get stuck in vfs related code.
> >
> >> It would be nice to know what sysrq-t says, in particular the trace
> >> of khelper thread is interesting.
> >
> > Sure, I'll get one when it happens again.
> 
> So here's the stack of the usermode thread:
> 
> [  336.614015] kworker/u:2     S ffff880062c13000  5176  4539   3031 0x00000000
> [  336.614015]  ffff880062fb38d0 0000000000000082 ffff880062fb3860
> 0000000000000001
> [  336.614015]  ffff880062fb3fd8 00000000001d4580 ffff880062fb2010
> 00000000001d4580
> [  336.614015]  00000000001d4580 00000000001d4580 ffff880062fb3fd8
> 00000000001d4580
> [  336.614015] Call Trace:
> [  336.614015]  [<ffffffff826a8e54>] schedule+0x24/0x70
> [  336.614015]  [<ffffffff825fd66d>] p9_client_rpc+0x13d/0x360
> [  336.614015]  [<ffffffff810d7850>] ? wake_up_bit+0x40/0x40
> [  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
> [  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
> [  336.614015]  [<ffffffff825ff5ff>] p9_client_walk+0x8f/0x220
> [  336.614015]  [<ffffffff815a8e3b>] v9fs_vfs_lookup+0xab/0x1c0
> [  336.614015]  [<ffffffff811ee0c0>] d_alloc_and_lookup+0x40/0x80
> [  336.614015]  [<ffffffff811fdea0>] ? d_lookup+0x30/0x50
> [  336.614015]  [<ffffffff811f0aea>] do_lookup+0x28a/0x3b0
> [  336.614015]  [<ffffffff817c9117>] ? security_inode_permission+0x17/0x20
> [  336.614015]  [<ffffffff811f1c07>] link_path_walk+0x167/0x420
> [  336.614015]  [<ffffffff811ee630>] ? generic_readlink+0xb0/0xb0
> [  336.614015]  [<ffffffff81896d88>] ? __raw_spin_lock_init+0x38/0x70
> [  336.614015]  [<ffffffff811f24da>] path_openat+0xba/0x500
> [  336.614015]  [<ffffffff81057253>] ? sched_clock+0x13/0x20
> [  336.614015]  [<ffffffff810ed805>] ? sched_clock_local+0x25/0x90
> [  336.614015]  [<ffffffff810ed940>] ? sched_clock_cpu+0xd0/0x120
> [  336.614015]  [<ffffffff811f2a34>] do_filp_open+0x44/0xa0
> [  336.614015]  [<ffffffff81119acd>] ? __lock_release+0x8d/0x1d0
> [  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
> [  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
> [  336.614015]  [<ffffffff826aa7f0>] ? _raw_spin_unlock+0x30/0x60
> [  336.614015]  [<ffffffff811ea74d>] open_exec+0x2d/0xf0
> [  336.614015]  [<ffffffff811eb888>] do_execve_common+0x128/0x320
> [  336.614015]  [<ffffffff811ebb05>] do_execve+0x35/0x40
> [  336.614015]  [<ffffffff810589e5>] sys_execve+0x45/0x70
> [  336.614015]  [<ffffffff826acc28>] kernel_execve+0x68/0xd0
> [  336.614015]  [<ffffffff810cd6a6>] ? ____call_usermodehelper+0xf6/0x130
> [  336.614015]  [<ffffffff810cd6f9>] call_helper+0x19/0x20
> [  336.614015]  [<ffffffff826acbb4>] kernel_thread_helper+0x4/0x10
> [  336.614015]  [<ffffffff810e3f80>] ? finish_task_switch+0x80/0x110
> [  336.614015]  [<ffffffff826aaeb4>] ? retint_restore_args+0x13/0x13
> [  336.614015]  [<ffffffff810cd6e0>] ? ____call_usermodehelper+0x130/0x130
> [  336.614015]  [<ffffffff826acbb0>] ? gs_change+0x13/0x13
> 
> While it seems that 9p is the culprit, I have to point out that this
> bug is easily reproducible, and it happens each time due to a
> call_usermode_helper() call. Other than that 9p behaves perfectly and
> I'd assume that I'd be seeing other things break besides
> call_usermode_helper() related ones.

OK, then there is a third process (might be a 9p-related one trying to
serve this RPC request) blocking on one of the mutexes held by your
first process (the one invoking call_usermodehelper()).

Maybe kobject_uevent_env() should not use UMH_WAIT_EXEC, to get a
non-blocking guarantee.

* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-03-31 23:26       ` Sasha Levin
  2012-04-01  3:21         ` Tetsuo Handa
  2012-04-01  5:07         ` Eric Dumazet
@ 2012-04-01 16:38         ` Oleg Nesterov
  2 siblings, 0 replies; 15+ messages in thread
From: Oleg Nesterov @ 2012-04-01 16:38 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Eric Dumazet, davem, kuznet, jmorris, yoshfuji, Patrick McHardy,
	netdev, linux-kernel@vger.kernel.org List, Dave Jones,
	Tetsuo Handa

On 04/01, Sasha Levin wrote:
>
> >> It would be nice to know what sysrq-t says, in particular the trace
> >> of khelper thread is interesting.
> >
> > Sure, I'll get one when it happens again.
>
> So here's the stack of the usermode thread:

Great, thanks, this is even better than khelper's trace.

> [  336.614015]  [<ffffffff826a8e54>] schedule+0x24/0x70
> [  336.614015]  [<ffffffff825fd66d>] p9_client_rpc+0x13d/0x360
> [  336.614015]  [<ffffffff810d7850>] ? wake_up_bit+0x40/0x40
> [  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
> [  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
> [  336.614015]  [<ffffffff825ff5ff>] p9_client_walk+0x8f/0x220
> [  336.614015]  [<ffffffff815a8e3b>] v9fs_vfs_lookup+0xab/0x1c0
> [  336.614015]  [<ffffffff811ee0c0>] d_alloc_and_lookup+0x40/0x80
> [  336.614015]  [<ffffffff811fdea0>] ? d_lookup+0x30/0x50
> [  336.614015]  [<ffffffff811f0aea>] do_lookup+0x28a/0x3b0
> [  336.614015]  [<ffffffff817c9117>] ? security_inode_permission+0x17/0x20
> [  336.614015]  [<ffffffff811f1c07>] link_path_walk+0x167/0x420
> [  336.614015]  [<ffffffff811ee630>] ? generic_readlink+0xb0/0xb0
> [  336.614015]  [<ffffffff81896d88>] ? __raw_spin_lock_init+0x38/0x70
> [  336.614015]  [<ffffffff811f24da>] path_openat+0xba/0x500
> [  336.614015]  [<ffffffff81057253>] ? sched_clock+0x13/0x20
> [  336.614015]  [<ffffffff810ed805>] ? sched_clock_local+0x25/0x90
> [  336.614015]  [<ffffffff810ed940>] ? sched_clock_cpu+0xd0/0x120
> [  336.614015]  [<ffffffff811f2a34>] do_filp_open+0x44/0xa0
> [  336.614015]  [<ffffffff81119acd>] ? __lock_release+0x8d/0x1d0
> [  336.614015]  [<ffffffff810e3671>] ? get_parent_ip+0x11/0x50
> [  336.614015]  [<ffffffff810e399d>] ? sub_preempt_count+0x9d/0xd0
> [  336.614015]  [<ffffffff826aa7f0>] ? _raw_spin_unlock+0x30/0x60
> [  336.614015]  [<ffffffff811ea74d>] open_exec+0x2d/0xf0
> [  336.614015]  [<ffffffff811eb888>] do_execve_common+0x128/0x320
> [  336.614015]  [<ffffffff811ebb05>] do_execve+0x35/0x40
> [  336.614015]  [<ffffffff810589e5>] sys_execve+0x45/0x70
> [  336.614015]  [<ffffffff826acc28>] kernel_execve+0x68/0xd0
> [  336.614015]  [<ffffffff810cd6a6>] ? ____call_usermodehelper+0xf6/0x130
> [  336.614015]  [<ffffffff810cd6f9>] call_helper+0x19/0x20
> [  336.614015]  [<ffffffff826acbb4>] kernel_thread_helper+0x4/0x10
> [  336.614015]  [<ffffffff810e3f80>] ? finish_task_switch+0x80/0x110
> [  336.614015]  [<ffffffff826aaeb4>] ? retint_restore_args+0x13/0x13
> [  336.614015]  [<ffffffff810cd6e0>] ? ____call_usermodehelper+0x130/0x130
> [  336.614015]  [<ffffffff826acbb0>] ? gs_change+0x13/0x13
>
> While it seems that 9p is the culprit, I have to point out that this
> bug is easily reproducible, and it happens each time due to a
> call_usermode_helper() call. Other than that 9p behaves perfectly and
> I'd assume that I'd be seeing other things break besides
> call_usermode_helper() related ones.

Of course I do not know what happens, but at least this obviously
explains why UMH_WAIT_EXEC hangs; I think call_usermodehelper_exec()
itself is innocent.

Oleg.



* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-01  3:21         ` Tetsuo Handa
@ 2012-04-01 17:33           ` Sasha Levin
  2012-04-05 14:29             ` Tetsuo Handa
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-04-01 17:33 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: oleg, eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber,
	netdev, linux-kernel, davej

On Sun, Apr 1, 2012 at 5:21 AM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> Sasha Levin wrote:
>> While it seems that 9p is the culprit, I have to point out that this
>> bug is easily reproducible, and it happens each time due to a
>> call_usermode_helper() call. Other than that 9p behaves perfectly and
>> I'd assume that I'd be seeing other things break besides
>> call_usermode_helper() related ones.
>
> I think one of below two patches can catch the bug if this is a usermodehelper
> related bug. Please try.
>
> ----- Patch 1 -----
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 01394b6..3e63319 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -571,7 +571,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
>         * flag, for khelper thread is already waiting for the thread at
>         * wait_for_completion() in do_fork().
>         */
> -       if (wait != UMH_NO_WAIT && current == kmod_thread_locker) {
> +       if (WARN_ON(wait != UMH_NO_WAIT && current == kmod_thread_locker)) {
>                retval = -EBUSY;
>                goto out;
>        }
>
>
> ----- Patch 2 -----
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index 9efeae6..1350670 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -48,10 +48,10 @@ static inline int request_module_nowait(const char *name, ...) { return -ENOSYS;
>  struct cred;
>  struct file;
>
> -#define UMH_NO_WAIT    0       /* don't wait at all */
> -#define UMH_WAIT_EXEC  1       /* wait for the exec, but not the process */
> -#define UMH_WAIT_PROC  2       /* wait for the process to complete */
> -#define UMH_KILLABLE   4       /* wait for EXEC/PROC killable */
> +#define UMH_NO_WAIT    0x10    /* don't wait at all */
> +#define UMH_WAIT_EXEC  0x11    /* wait for the exec, but not the process */
> +#define UMH_WAIT_PROC  0x12    /* wait for the process to complete */
> +#define UMH_KILLABLE   0x04    /* wait for EXEC/PROC killable */
>
>  struct subprocess_info {
>        struct work_struct work;
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 957a7aa..ecfd3d5 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -483,6 +483,18 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
>        DECLARE_COMPLETION_ONSTACK(done);
>        int retval = 0;
>
> +       if (unlikely(wait == -1 || wait == 0 || wait == 1)) {
> +               WARN(1, "Requesting for usermode helper with hardcoded wait "
> +                    "flag. Change to use UMH_* symbols and recompile, or "
> +                    "this request will fail on Linux 3.4.\n");
> +               if (wait == -1)
> +                       wait = UMH_NO_WAIT;
> +               else if (wait == 0)
> +                       wait = UMH_WAIT_EXEC;
> +               else
> +                       wait = UMH_WAIT_PROC;
> +       }
> +
>        helper_lock();
>        if (sub_info->path[0] == '\0')
>                goto out;

Neither of these patches did the trick (no warnings showed up, and I'm
still seeing the hangs). However, the following patch has fixed it:

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index 1a91efa..8ecc377 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -309,7 +309,7 @@ int kobject_uevent_env(struct kobject *kobj, enum
kobject_action action,
                        goto exit;

                retval = call_usermodehelper(argv[0], argv,
-                                            env->envp, UMH_WAIT_EXEC);
+                                            env->envp, UMH_NO_WAIT);
        }

 exit:

Not sure if that info helps any, but just in case.


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-01 17:33           ` Sasha Levin
@ 2012-04-05 14:29             ` Tetsuo Handa
  2012-04-05 14:34               ` Tetsuo Handa
  2012-04-06 11:44               ` Tetsuo Handa
  0 siblings, 2 replies; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-05 14:29 UTC (permalink / raw)
  To: levinsasha928, garlick, ericvh
  Cc: oleg, eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber,
	netdev, linux-kernel, davej

> Tetsuo Handa wrote:
> > Maybe you can get more useful information with below untested printk() patch.
> > 
> > diff --git a/net/9p/client.c b/net/9p/client.c
> > index b23a17c..2dd447a 100644
> > --- a/net/9p/client.c
> > +++ b/net/9p/client.c
> > @@ -734,7 +734,9 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
> >  	} else
> >  		sigpending = 0;
> >  
> > +	printk("%u:Calling %pS\n", current->pid, c->trans_mod->request);
> >  	err = c->trans_mod->request(c, req);
> > +	printk("%u:%pS = %d\n", current->pid, c->trans_mod->request, err);
> >  	if (err < 0) {
> >  		if (err != -ERESTARTSYS && err != -EFAULT)
> >  			c->status = Disconnected;
> > @@ -742,8 +744,10 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
> >  	}
> >  again:
> >  	/* Wait for the response */
> > +	printk("%u:req->status = %u\n", current->pid, req->status);
> >  	err = wait_event_interruptible(*req->wq,
> >  				       req->status >= REQ_STATUS_RCVD);
> > +	printk("%u:wait = %d\n", current->pid, err);
> >  
> >  	if ((err == -ERESTARTSYS) && (c->status == Connected)
> >  				  && (type == P9_TFLUSH)) {
> > 

Sasha Levin wrote:
> Heya,
> 
> The output from the printk confirmed that there are several threads
> waiting for RPC to complete, with the last two having an odd 'wait'
> result. This is just before the hang:
> 
> [  809.165663] 19964:Calling p9_virtio_request+0x0/0x200
> [  809.166951] 19964:p9_virtio_request+0x0/0x200 = 0
> [  809.167878] 19964:req->status = 3
> [  809.803535] 19957:Calling p9_virtio_request+0x0/0x200
> [  809.804506] 19957:p9_virtio_request+0x0/0x200 = 0
> [  809.805332] 19957:req->status = 3
> [  809.868591] 19955:Calling p9_virtio_request+0x0/0x200
> [  809.869493] 19955:p9_virtio_request+0x0/0x200 = 0
> [  809.870331] 19955:req->status = 3
> [  811.364554] 19985:Calling p9_virtio_request+0x0/0x200
> [  811.365498] 19985:p9_virtio_request+0x0/0x200 = 0
> [  811.366386] 19985:req->status = 3
> [  811.458600] 19999:wait = -512
> [  811.459171] 19999:Calling p9_virtio_request+0x0/0x200
> [  811.459992] 19999:p9_virtio_request+0x0/0x200 = 0
> [  811.460822] 19999:req->status = 3
> [  811.472175] 19994:wait = -512
> [  811.472943] 19994:Calling p9_virtio_request+0x0/0x200
> [  811.474195] 19994:p9_virtio_request+0x0/0x200 = 0
> [  811.474955] 19994:req->status = 3
> [... Hang 120 sec later here]
> 

Good. -512 is -ERESTARTSYS, and this hang occurs after -ERESTARTSYS is
returned. It indicates that c->trans_mod->request() is interrupted by a signal.
Since c->trans_mod->request is pointing at p9_virtio_request, the location
returning that error would be

254 static int
255 p9_virtio_request(struct p9_client *client, struct p9_req_t *req)
256 {
257         int err;
258         int in, out;
259         unsigned long flags;
260         struct virtio_chan *chan = client->trans;
261 
262         p9_debug(P9_DEBUG_TRANS, "9p debug: virtio request\n");
263 
264         req->status = REQ_STATUS_SENT;
265 req_retry:
266         spin_lock_irqsave(&chan->lock, flags);
267 
268         /* Handle out VirtIO ring buffers */
269         out = pack_sg_list(chan->sg, 0,
270                            VIRTQUEUE_NUM, req->tc->sdata, req->tc->size);
271 
272         in = pack_sg_list(chan->sg, out,
273                           VIRTQUEUE_NUM, req->rc->sdata, req->rc->capacity);
274 
275         err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc,
276                                 GFP_ATOMIC);
277         if (err < 0) {
278                 if (err == -ENOSPC) {
279                         chan->ring_bufs_avail = 0;
280                         spin_unlock_irqrestore(&chan->lock, flags);
281                         err = wait_event_interruptible(*chan->vc_wq,
282                                                         chan->ring_bufs_avail);

   here.

283                         if (err  == -ERESTARTSYS)
284                                 return err;
285 
286                         p9_debug(P9_DEBUG_TRANS, "Retry virtio request\n");
287                         goto req_retry;
288                 } else {
289                         spin_unlock_irqrestore(&chan->lock, flags);
290                         p9_debug(P9_DEBUG_TRANS,
291                                  "virtio rpc add_buf returned failure\n");
292                         return -EIO;
293                 }
294         }
295         virtqueue_kick(chan->vq);
296         spin_unlock_irqrestore(&chan->lock, flags);
297 
298         p9_debug(P9_DEBUG_TRANS, "virtio request kicked\n");
299         return 0;
300 }

Comparing 3.3.1 and linux-next in my environment, there are several changes.

# diff -ur linux-3.3.1/drivers/virtio/ linux-next/drivers/virtio/ | diffstat
 config.c         |    1
 virtio_balloon.c |   14 ----------
 virtio_pci.c     |   74 +++++--------------------------------------------------
 3 files changed, 8 insertions(+), 81 deletions(-)
# diff -urp linux-3.3.1/fs/9p/ linux-next/fs/9p/ | diffstat
 v9fs.c      |   16 ++++++++--------
 vfs_super.c |    5 ++---
 2 files changed, 10 insertions(+), 11 deletions(-)
# diff -ur linux-3.3.1/net/9p/ linux-next/net/9p/ | diffstat
 client.c |   26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

The most suspicious change is in net/9p/client.c because it changes the
handling of the ERESTARTSYS case.

--- linux-3.3.1/net/9p/client.c
+++ linux-next/net/9p/client.c
@@ -740,10 +740,18 @@
                        c->status = Disconnected;
                goto reterr;
        }
+again:
        /* Wait for the response */
        err = wait_event_interruptible(*req->wq,
                                       req->status >= REQ_STATUS_RCVD);

+       if ((err == -ERESTARTSYS) && (c->status == Connected)
+                                 && (type == P9_TFLUSH)) {
+               sigpending = 1;
+               clear_thread_flag(TIF_SIGPENDING);
+               goto again;
+       }
+
        if (req->status == REQ_STATUS_ERROR) {
                p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
                err = req->t_err;
@@ -1420,6 +1428,7 @@
        int err;
        struct p9_client *clnt;
        struct p9_req_t *req;
+       int retries = 0;

        if (!fid) {
                pr_warn("%s (%d): Trying to clunk with NULL fid\n",
@@ -1428,7 +1437,9 @@
                return 0;
        }

-       p9_debug(P9_DEBUG_9P, ">>> TCLUNK fid %d\n", fid->fid);
+again:
+       p9_debug(P9_DEBUG_9P, ">>> TCLUNK fid %d (try %d)\n", fid->fid,
+                                                               retries);
        err = 0;
        clnt = fid->clnt;

@@ -1444,8 +1455,14 @@
 error:
        /*
         * Fid is not valid even after a failed clunk
+        * If interrupted, retry once then give up and
+        * leak fid until umount.
         */
-       p9_fid_destroy(fid);
+       if (err == -ERESTARTSYS) {
+               if (retries++ == 0)
+                       goto again;
+       } else
+               p9_fid_destroy(fid);
        return err;
 }
 EXPORT_SYMBOL(p9_client_clunk);
@@ -1470,7 +1487,10 @@

        p9_free_req(clnt, req);
 error:
-       p9_fid_destroy(fid);
+       if (err == -ERESTARTSYS)
+               p9_client_clunk(fid);
+       else
+               p9_fid_destroy(fid);
        return err;
 }
 EXPORT_SYMBOL(p9_client_remove);

Maybe commit a314f274 "net/9p: don't allow Tflush to be interrupted" or nearby.



By the way, have you already tried 3.4-rc1?
In my environment, there is no difference between linux-next and 3.4-rc1.

# diff -ur linux-3.4.0-rc1/net/9p/ linux-next/net/9p/
# diff -ur linux-3.4.0-rc1/drivers/virtio/ linux-next/drivers/virtio/
# diff -ur linux-3.4.0-rc1/fs/9p/ linux-next/fs/9p/


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-05 14:29             ` Tetsuo Handa
@ 2012-04-05 14:34               ` Tetsuo Handa
  2012-04-06 11:44               ` Tetsuo Handa
  1 sibling, 0 replies; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-05 14:34 UTC (permalink / raw)
  To: levinsasha928, garlick, ericvh
  Cc: oleg, eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber,
	netdev, linux-kernel, davej, penguin-kernel

Tetsuo Handa wrote:
> Good. -512 is -ERESTARTSYS, and this hang occurs after -ERESTARTSYS is
> returned. It indicates that c->trans_mod->request() is interrupted by signal.
> Since c->trans_mod->request is pointing at p9_virtio_request, the location
> returning that error would be
(...snipped...)
> 281                         err = wait_event_interruptible(*chan->vc_wq,
> 282                                                         chan->ring_bufs_avail);
> 
>    here.

Oops. Not p9_virtio_request().

It is p9_client_rpc(). I misread the output lines.

> > > +	printk("%u:req->status = %u\n", current->pid, req->status);
> > >  	err = wait_event_interruptible(*req->wq,
> > >  				       req->status >= REQ_STATUS_RCVD);
> > > +	printk("%u:wait = %d\n", current->pid, err);

But anyway, I think this is an interrupt-related bug.


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-05 14:29             ` Tetsuo Handa
  2012-04-05 14:34               ` Tetsuo Handa
@ 2012-04-06 11:44               ` Tetsuo Handa
  2012-04-06 18:09                 ` Jim Garlick
  1 sibling, 1 reply; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-06 11:44 UTC (permalink / raw)
  To: levinsasha928, garlick, ericvh
  Cc: oleg, eric.dumazet, davem, kuznet, jmorris, yoshfuji, kaber,
	netdev, linux-kernel, davej

Tetsuo Handa wrote:
> Most suspicious change is net/9p/client.c because it is changing handling of
> ERESTARTSYS case.
> 
> --- linux-3.3.1/net/9p/client.c
> +++ linux-next/net/9p/client.c
> @@ -740,10 +740,18 @@
>                         c->status = Disconnected;
>                 goto reterr;
>         }
> +again:
>         /* Wait for the response */
>         err = wait_event_interruptible(*req->wq,
>                                        req->status >= REQ_STATUS_RCVD);
> 
> +       if ((err == -ERESTARTSYS) && (c->status == Connected)
> +                                 && (type == P9_TFLUSH)) {
> +               sigpending = 1;
> +               clear_thread_flag(TIF_SIGPENDING);
> +               goto again;
> +       }
> +

I think this loop is bad with regard to responsiveness to SIGKILL.
If wait_event_interruptible() was interrupted by SIGKILL, it will
spin until req->status >= REQ_STATUS_RCVD becomes true.
Rather,

	if ((c->status == Connected) && (type == P9_TFLUSH))
		err = wait_event_killable(*req->wq,
					  req->status >= REQ_STATUS_RCVD);
	else
		err = wait_event_interruptible(*req->wq,
					       req->status >= REQ_STATUS_RCVD);

would be safer.



>  error:
>         /*
>          * Fid is not valid even after a failed clunk
> +        * If interrupted, retry once then give up and
> +        * leak fid until umount.
>          */
> -       p9_fid_destroy(fid);
> +       if (err == -ERESTARTSYS) {
> +               if (retries++ == 0)
> +                       goto again;

I think it is possible that the process is interrupted again upon retrying.
I suspect the handling of the err == -ERESTARTSYS case when retries != 0.
It returns without calling p9_fid_destroy(), which will be
unexpected behaviour for its various callers.

> +       } else
> +               p9_fid_destroy(fid);
>         return err;
>  }
>  EXPORT_SYMBOL(p9_client_clunk);


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-06 11:44               ` Tetsuo Handa
@ 2012-04-06 18:09                 ` Jim Garlick
  2012-04-07  0:06                   ` Tetsuo Handa
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Garlick @ 2012-04-06 18:09 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: levinsasha928, ericvh, oleg, eric.dumazet, davem, kuznet,
	jmorris, yoshfuji, kaber, netdev, linux-kernel, davej

Hi Tetsuo,

I am sorry if my patch is causing you grief!

On Fri, Apr 06, 2012 at 04:44:37AM -0700, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Most suspicious change is net/9p/client.c because it is changing handling of
> > ERESTARTSYS case.
> > 
> > --- linux-3.3.1/net/9p/client.c
> > +++ linux-next/net/9p/client.c
> > @@ -740,10 +740,18 @@
> >                         c->status = Disconnected;
> >                 goto reterr;
> >         }
> > +again:
> >         /* Wait for the response */
> >         err = wait_event_interruptible(*req->wq,
> >                                        req->status >= REQ_STATUS_RCVD);
> > 
> > +       if ((err == -ERESTARTSYS) && (c->status == Connected)
> > +                                 && (type == P9_TFLUSH)) {
> > +               sigpending = 1;
> > +               clear_thread_flag(TIF_SIGPENDING);
> > +               goto again;
> > +       }
> > +
> 
> I think this loop is bad with regard to response to SIGKILL.
> If wait_event_interruptible() was interrupted by SIGKILL, it will
> spin until req->status >= REQ_STATUS_RCVD becomes true.
> Rather,
> 
> 	if ((c->status == Connected) && (type == P9_TFLUSH))
> 		err = wait_event_killable(*req->wq,
> 					  req->status >= REQ_STATUS_RCVD);
> 	else
> 		err = wait_event_interruptible(*req->wq,
> 					       req->status >= REQ_STATUS_RCVD);
> 
> would be safer.

Does that work?  What prevents p9_client_rpc() from recursing via
p9_client_flush() on receipt of SIGKILL? 


> >  error:
> >         /*
> >          * Fid is not valid even after a failed clunk
> > +        * If interrupted, retry once then give up and
> > +        * leak fid until umount.
> >          */
> > -       p9_fid_destroy(fid);
> > +       if (err == -ERESTARTSYS) {
> > +               if (retries++ == 0)
> > +                       goto again;
> 
> I think it is possible that the process is interrupted again upon retrying.
> I suspect the handling of err == -ERESTARTSYS case when retries != 0.
> It is returning without calling p9_fid_destroy(), which will be
> unexpected behaviour for the various callers.

Yes but in the unlikely event that this happens, the effect is a small
memory leak for the duration of the mount.  On the other hand if the
fid is destroyed without successfully informing the server, then
subsequent operations that involve new file references will fail
when that fid number is reused, and the mount becomes unusable.

> > +       } else
> > +               p9_fid_destroy(fid);
> >         return err;
> >  }
> >  EXPORT_SYMBOL(p9_client_clunk);

Regards,

Jim


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-06 18:09                 ` Jim Garlick
@ 2012-04-07  0:06                   ` Tetsuo Handa
  2012-04-11 12:20                     ` Sasha Levin
  0 siblings, 1 reply; 15+ messages in thread
From: Tetsuo Handa @ 2012-04-07  0:06 UTC (permalink / raw)
  To: garlick
  Cc: levinsasha928, ericvh, oleg, eric.dumazet, davem, kuznet,
	jmorris, yoshfuji, kaber, netdev, linux-kernel, davej

Jim Garlick wrote:
> > I think this loop is bad with regard to response to SIGKILL.
> > If wait_event_interruptible() was interrupted by SIGKILL, it will
> > spin until req->status >= REQ_STATUS_RCVD becomes true.
> > Rather,
> > 
> > 	if ((c->status == Connected) && (type == P9_TFLUSH))
> > 		err = wait_event_killable(*req->wq,
> > 					  req->status >= REQ_STATUS_RCVD);
> > 	else
> > 		err = wait_event_interruptible(*req->wq,
> > 					       req->status >= REQ_STATUS_RCVD);
> > 
> > would be safer.
> 
> Does that work?  What prevents p9_client_rpc() from recursing via
> p9_client_flush() on receipt of SIGKILL? 

Sorry, I'm not a 9p user and I can't test whether that works or not.
But at least, continuing the loop even after SIGKILL is not good.
If you have to wait for req->status >= REQ_STATUS_RCVD to become true, can you
use a kernel thread that waits for req->status >= REQ_STATUS_RCVD to become
true, and delegate the job of notifying the server from the userspace task to
that kernel thread?

> Yes but in the unlikely event that this happens, the effect is a small
> memory leak for the duration of the mount.  On the other hand if the
> fid is destroyed without successfully informing the server, then
> subsequent operations that involve new file references will fail
> when that fid number is reused, and the mount becomes unusable.

I don't know whether Sasha's problem is caused by this patch or not.
But p9_client_clunk() is called from many functions in the fs/9p/ directory.
They assume that p9_client_clunk() will call p9_fid_destroy(), but this
patch breaks that assumption. I think this is the cause of the hang Sasha
is experiencing, because Sasha's trace shows that call_usermodehelper() is
blocked by functions in the fs/9p/ directory. It seems to be an
inconsistent-state problem.


* Re: ipv6: tunnel: hang when destroying ipv6 tunnel
  2012-04-07  0:06                   ` Tetsuo Handa
@ 2012-04-11 12:20                     ` Sasha Levin
  0 siblings, 0 replies; 15+ messages in thread
From: Sasha Levin @ 2012-04-11 12:20 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: garlick, ericvh, oleg, eric.dumazet, davem, kuznet, jmorris,
	yoshfuji, kaber, netdev, linux-kernel, davej

>> Yes but in the unlikely event that this happens, the effect is a small
>> memory leak for the duration of the mount.  On the other hand if the
>> fid is destroyed without successfully informing the server, then
>> subsequent operations that involve new file references will fail
>> when that fid number is reused, and the mount becomes unusable.
>
> I don't know whether Sasha's problem is caused by this patch or not.
> But p9_client_clunk() is called from many functions in fs/9p/ directory.
> They are assuming that p9_client_clunk() will call p9_fid_destroy() but
> this patch is breaking that assumption. I think this is the cause of hang which
> Sasha is experiencing because Sasha's trace shows that call_usermodehelper() is
> blocked by functions in fs/9p/ directory. Seems inconsistency state problem.

I'd be happy to try out any other patches or help debugging this issue.

Which behavior did this patch fix, exactly? Can I just revert it and
try running without it?


end of thread, other threads: ~2012-04-11 12:21 UTC (newest)

Thread overview: 15+ messages
2012-03-31 17:51 ipv6: tunnel: hang when destroying ipv6 tunnel Sasha Levin
2012-03-31 20:59 ` Eric Dumazet
2012-03-31 21:34   ` Oleg Nesterov
2012-03-31 21:43     ` Sasha Levin
2012-03-31 23:26       ` Sasha Levin
2012-04-01  3:21         ` Tetsuo Handa
2012-04-01 17:33           ` Sasha Levin
2012-04-05 14:29             ` Tetsuo Handa
2012-04-05 14:34               ` Tetsuo Handa
2012-04-06 11:44               ` Tetsuo Handa
2012-04-06 18:09                 ` Jim Garlick
2012-04-07  0:06                   ` Tetsuo Handa
2012-04-11 12:20                     ` Sasha Levin
2012-04-01  5:07         ` Eric Dumazet
2012-04-01 16:38         ` Oleg Nesterov
