* namespace: deadlock in dec_pid_namespaces
@ 2017-01-20 13:07 Dmitry Vyukov
  2017-01-20 13:26 ` Nikolay Borisov
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Vyukov @ 2017-01-20 13:07 UTC (permalink / raw)
  To: Eric W. Biederman, avagin, serge, Kees Cook, Al Viro, LKML; +Cc: syzkaller

Hello,

I've got the following deadlock report while running syzkaller fuzzer
on eec0d3d065bfcdf9cd5f56dd2a36b94d12d32297 of linux-next (on odroid
device if it matters):

=================================
[ INFO: inconsistent lock state ]
4.10.0-rc3-next-20170112-xc2-dirty #6 Not tainted
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/2/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (ucounts_lock){+.?...}, at: [<     inline     >] spin_lock
./include/linux/spinlock.h:302
 (ucounts_lock){+.?...}, at: [<ffff2000081678c8>]
put_ucounts+0x60/0x138 kernel/ucount.c:162
{SOFTIRQ-ON-W} state was registered at:
[<ffff2000081c82d8>] mark_lock+0x220/0xb60 kernel/locking/lockdep.c:3054
[<     inline     >] mark_irqflags kernel/locking/lockdep.c:2941
[<ffff2000081c97a8>] __lock_acquire+0x388/0x3260 kernel/locking/lockdep.c:3295
[<ffff2000081cce24>] lock_acquire+0xa4/0x138 kernel/locking/lockdep.c:3753
[<     inline     >] __raw_spin_lock ./include/linux/spinlock_api_smp.h:144
[<ffff200009798128>] _raw_spin_lock+0x90/0xd0 kernel/locking/spinlock.c:151
[<     inline     >] spin_lock ./include/linux/spinlock.h:302
[<     inline     >] get_ucounts kernel/ucount.c:131
[<ffff200008167c28>] inc_ucount+0x80/0x6c8 kernel/ucount.c:189
[<     inline     >] inc_mnt_namespaces fs/namespace.c:2818
[<ffff200008481850>] alloc_mnt_ns+0x78/0x3a8 fs/namespace.c:2849
[<ffff200008487298>] create_mnt_ns+0x28/0x200 fs/namespace.c:2959
[<     inline     >] init_mount_tree fs/namespace.c:3199
[<ffff200009bd6674>] mnt_init+0x258/0x384 fs/namespace.c:3251
[<ffff200009bd60bc>] vfs_caches_init+0x6c/0x80 fs/dcache.c:3626
[<ffff200009bb1114>] start_kernel+0x414/0x460 init/main.c:648
[<ffff200009bb01e8>] __primary_switched+0x6c/0x70 arch/arm64/kernel/head.S:456
irq event stamp: 2316924
hardirqs last  enabled at (2316924): [<     inline     >] rcu_do_batch
kernel/rcu/tree.c:2911
hardirqs last  enabled at (2316924): [<     inline     >]
invoke_rcu_callbacks kernel/rcu/tree.c:3182
hardirqs last  enabled at (2316924): [<     inline     >]
__rcu_process_callbacks kernel/rcu/tree.c:3149
hardirqs last  enabled at (2316924): [<ffff200008210414>]
rcu_process_callbacks+0x7a4/0xc28 kernel/rcu/tree.c:3166
hardirqs last disabled at (2316923): [<     inline     >] rcu_do_batch
kernel/rcu/tree.c:2900
hardirqs last disabled at (2316923): [<     inline     >]
invoke_rcu_callbacks kernel/rcu/tree.c:3182
hardirqs last disabled at (2316923): [<     inline     >]
__rcu_process_callbacks kernel/rcu/tree.c:3149
hardirqs last disabled at (2316923): [<ffff20000820fe80>]
rcu_process_callbacks+0x210/0xc28 kernel/rcu/tree.c:3166
softirqs last  enabled at (2316912): [<ffff20000811b4c4>]
_local_bh_enable+0x4c/0x80 kernel/softirq.c:155
softirqs last disabled at (2316913): [<     inline     >]
do_softirq_own_stack ./include/linux/interrupt.h:488
softirqs last disabled at (2316913): [<     inline     >]
invoke_softirq kernel/softirq.c:371
softirqs last disabled at (2316913): [<ffff20000811c994>]
irq_exit+0x264/0x308 kernel/softirq.c:405

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(ucounts_lock);
  <Interrupt>
    lock(ucounts_lock);

 *** DEADLOCK ***

1 lock held by swapper/2/0:
 #0:  (rcu_callback){......}, at: [<     inline     >] __rcu_reclaim
kernel/rcu/rcu.h:108
 #0:  (rcu_callback){......}, at: [<     inline     >] rcu_do_batch
kernel/rcu/tree.c:2919
 #0:  (rcu_callback){......}, at: [<     inline     >]
invoke_rcu_callbacks kernel/rcu/tree.c:3182
 #0:  (rcu_callback){......}, at: [<     inline     >]
__rcu_process_callbacks kernel/rcu/tree.c:3149
 #0:  (rcu_callback){......}, at: [<ffff200008210390>]
rcu_process_callbacks+0x720/0xc28 kernel/rcu/tree.c:3166

stack backtrace:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.10.0-rc3-next-20170112-xc2-dirty #6
Hardware name: Hardkernel ODROID-C2 (DT)
Call trace:
[<ffff20000808fa60>] dump_backtrace+0x0/0x440 arch/arm64/kernel/traps.c:500
[<ffff20000808fec0>] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:225
[<ffff2000088a99e0>] dump_stack+0x110/0x168
[<ffff2000082fa2b4>] print_usage_bug.part.27+0x49c/0x4bc
kernel/locking/lockdep.c:2387
[<     inline     >] print_usage_bug kernel/locking/lockdep.c:2357
[<     inline     >] valid_state kernel/locking/lockdep.c:2400
[<     inline     >] mark_lock_irq kernel/locking/lockdep.c:2617
[<ffff2000081c89ec>] mark_lock+0x934/0xb60 kernel/locking/lockdep.c:3065
[<     inline     >] mark_irqflags kernel/locking/lockdep.c:2923
[<ffff2000081c9a60>] __lock_acquire+0x640/0x3260 kernel/locking/lockdep.c:3295
[<ffff2000081cce24>] lock_acquire+0xa4/0x138 kernel/locking/lockdep.c:3753
[<     inline     >] __raw_spin_lock ./include/linux/spinlock_api_smp.h:144
[<ffff200009798128>] _raw_spin_lock+0x90/0xd0 kernel/locking/spinlock.c:151
[<     inline     >] spin_lock ./include/linux/spinlock.h:302
[<ffff2000081678c8>] put_ucounts+0x60/0x138 kernel/ucount.c:162
[<ffff200008168364>] dec_ucount+0xf4/0x158 kernel/ucount.c:214
[<     inline     >] dec_pid_namespaces kernel/pid_namespace.c:89
[<ffff200008293dc8>] delayed_free_pidns+0x40/0xe0 kernel/pid_namespace.c:156
[<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
[<     inline     >] rcu_do_batch kernel/rcu/tree.c:2919
[<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3182
[<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3149
[<ffff2000082103d8>] rcu_process_callbacks+0x768/0xc28 kernel/rcu/tree.c:3166
[<ffff2000080821dc>] __do_softirq+0x324/0x6e0 kernel/softirq.c:284
[<     inline     >] do_softirq_own_stack ./include/linux/interrupt.h:488
[<     inline     >] invoke_softirq kernel/softirq.c:371
[<ffff20000811c994>] irq_exit+0x264/0x308 kernel/softirq.c:405
[<ffff2000081ecc28>] __handle_domain_irq+0xc0/0x150 kernel/irq/irqdesc.c:636
[<ffff200008081c80>] gic_handle_irq+0x68/0xd8
Exception stack(0xffff8000648e7dd0 to 0xffff8000648e7f00)
7dc0:                                   ffff8000648d4b3c 0000000000000007
7de0: 0000000000000000 1ffff0000c91a967 1ffff0000c91a967 1ffff0000c91a967
7e00: ffff20000a4b6b68 0000000000000001 0000000000000007 0000000000000001
7e20: 1fffe4000149ae90 ffff200009d35000 0000000000000000 0000000000000002
7e40: 0000000000000000 0000000000000000 0000000002624a1a 0000000000000000
7e60: 0000000000000000 ffff200009cbcd88 000060006d2ed000 0000000000000140
7e80: ffff200009cff000 ffff200009cb6000 ffff200009cc2020 ffff200009d2159d
7ea0: 0000000000000000 ffff8000648d4380 0000000000000000 ffff8000648e7f00
7ec0: ffff20000820a478 ffff8000648e7f00 ffff20000820a47c 0000000010000145
7ee0: 0000000000000140 dfff200000000000 ffffffffffffffff ffff20000820a478
[<ffff2000080837f8>] el1_irq+0xb8/0x130 arch/arm64/kernel/entry.S:486
[<     inline     >] arch_local_irq_restore
./arch/arm64/include/asm/irqflags.h:81
[<ffff20000820a47c>] rcu_idle_exit+0x64/0xa8 kernel/rcu/tree.c:1030
[<     inline     >] cpuidle_idle_call kernel/sched/idle.c:200
[<ffff2000081bcbfc>] do_idle+0x1dc/0x2d0 kernel/sched/idle.c:243
[<ffff2000081bd1cc>] cpu_startup_entry+0x24/0x28 kernel/sched/idle.c:345
[<ffff200008099f8c>] secondary_start_kernel+0x2cc/0x358
arch/arm64/kernel/smp.c:276
[<000000000279f1a4>] 0x279f1a4


* Re: namespace: deadlock in dec_pid_namespaces
  2017-01-20 13:07 namespace: deadlock in dec_pid_namespaces Dmitry Vyukov
@ 2017-01-20 13:26 ` Nikolay Borisov
  2017-01-20 18:05   ` Eric W. Biederman
  0 siblings, 1 reply; 6+ messages in thread
From: Nikolay Borisov @ 2017-01-20 13:26 UTC (permalink / raw)
  To: Dmitry Vyukov, Eric W. Biederman, avagin, serge, Kees Cook,
	Al Viro, LKML
  Cc: syzkaller

[-- Attachment #1: Type: text/plain, Size: 7734 bytes --]



On 20.01.2017 15:07, Dmitry Vyukov wrote:
> Hello,
> 
> I've got the following deadlock report while running syzkaller fuzzer
> on eec0d3d065bfcdf9cd5f56dd2a36b94d12d32297 of linux-next (on odroid
> device if it matters):
> 
> [snip: full lockdep report quoted verbatim above]


So it seems that ucounts_lock is being used in softirq context (in an
RCU callback) when a pidns is being freed, but the lock is not
softirq-safe, i.e. it doesn't disable bottom halves. Can you try the
attached patch?
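
For reference, the problematic pattern boils down to something like this
(an illustrative sketch, not the actual kernel code):

	/* Illustrative only -- the names match the report, the flow is
	 * simplified down to one CPU. */
	static DEFINE_SPINLOCK(ucounts_lock);

	/* 1. Process context (e.g. get_ucounts()) takes the lock
	 *    without disabling softirqs: */
	spin_lock(&ucounts_lock);

	/* 2. An interrupt arrives on the same CPU, and on irq_exit()
	 *    the RCU softirq runs delayed_free_pidns() ->
	 *    dec_pid_namespaces() -> put_ucounts(), which tries to
	 *    take the very same lock: */
	spin_lock(&ucounts_lock);	/* spins forever -> deadlock */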




[-- Attachment #2: 0001-userns-Make-ucounts-lock-softirq-safe.patch --]
[-- Type: text/x-patch, Size: 2057 bytes --]

From 86565326b382b41cb988a83791eff1c4d600040b Mon Sep 17 00:00:00 2001
From: Nikolay Borisov <n.borisov.lkml@gmail.com>
Date: Fri, 20 Jan 2017 15:21:35 +0200
Subject: [PATCH] userns: Make ucounts lock softirq-safe

The ucounts_lock is used to protect various parts of the ucounts
lifecycle management. However, those can also be invoked when a pidns
is being freed in an RCU callback (i.e. softirq context), which can
lead to deadlocks. Fix this by making the spinlock disable softirqs.

Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
---
 kernel/ucount.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/ucount.c b/kernel/ucount.c
index b4aaee935b3e..23a44ea894cd 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -132,10 +132,10 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 	struct hlist_head *hashent = ucounts_hashentry(ns, uid);
 	struct ucounts *ucounts, *new;
 
-	spin_lock(&ucounts_lock);
+	spin_lock_bh(&ucounts_lock);
 	ucounts = find_ucounts(ns, uid, hashent);
 	if (!ucounts) {
-		spin_unlock(&ucounts_lock);
+		spin_unlock_bh(&ucounts_lock);
 
 		new = kzalloc(sizeof(*new), GFP_KERNEL);
 		if (!new)
@@ -145,7 +145,7 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 		new->uid = uid;
 		atomic_set(&new->count, 0);
 
-		spin_lock(&ucounts_lock);
+		spin_lock_bh(&ucounts_lock);
 		ucounts = find_ucounts(ns, uid, hashent);
 		if (ucounts) {
 			kfree(new);
@@ -156,16 +156,16 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 	}
 	if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
 		ucounts = NULL;
-	spin_unlock(&ucounts_lock);
+	spin_unlock_bh(&ucounts_lock);
 	return ucounts;
 }
 
 static void put_ucounts(struct ucounts *ucounts)
 {
 	if (atomic_dec_and_test(&ucounts->count)) {
-		spin_lock(&ucounts_lock);
+		spin_lock_bh(&ucounts_lock);
 		hlist_del_init(&ucounts->node);
-		spin_unlock(&ucounts_lock);
+		spin_unlock_bh(&ucounts_lock);
 
 		kfree(ucounts);
 	}
-- 
2.7.4



* Re: namespace: deadlock in dec_pid_namespaces
  2017-01-20 13:26 ` Nikolay Borisov
@ 2017-01-20 18:05   ` Eric W. Biederman
  2017-01-20 22:44     ` Nikolay Borisov
  0 siblings, 1 reply; 6+ messages in thread
From: Eric W. Biederman @ 2017-01-20 18:05 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Dmitry Vyukov, avagin, serge, Kees Cook, Al Viro, LKML, syzkaller

Nikolay Borisov <n.borisov.lkml@gmail.com> writes:

> On 20.01.2017 15:07, Dmitry Vyukov wrote:
>> Hello,
>> 
>> I've got the following deadlock report while running syzkaller fuzzer
>> on eec0d3d065bfcdf9cd5f56dd2a36b94d12d32297 of linux-next (on odroid
>> device if it matters):

I am puzzled; I thought we had fixed this with:
  add7c65ca426 ("pid: fix lockdep deadlock warning due to ucount_lock")
But apparently not.  We just moved it from hardirq to softirq context.  Bah.

Thank you very much for the report.

Nikolay, can you make your change use spin_lock_irq?  And have put_ucounts
do spin_lock_irqsave?  That way we just don't care where we call this.
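
Roughly something like this (just a sketch, untested):

	/* get_ucounts() runs in process context where irqs must be
	 * enabled, so the unconditional variant is fine: */
	spin_lock_irq(&ucounts_lock);
	/* ... lookup / insert ... */
	spin_unlock_irq(&ucounts_lock);

	/* put_ucounts() can be called from any context, including the
	 * RCU softirq, so save and restore the irq state: */
	unsigned long flags;

	spin_lock_irqsave(&ucounts_lock, flags);
	hlist_del_init(&ucounts->node);
	spin_unlock_irqrestore(&ucounts_lock, flags);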

I am tired of being clever.

Eric


> [snip: patch quoted verbatim above]

* Re: namespace: deadlock in dec_pid_namespaces
  2017-01-20 18:05   ` Eric W. Biederman
@ 2017-01-20 22:44     ` Nikolay Borisov
  2017-01-21  0:28       ` Eric W. Biederman
  0 siblings, 1 reply; 6+ messages in thread
From: Nikolay Borisov @ 2017-01-20 22:44 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Dmitry Vyukov, avagin, serge, Kees Cook, Al Viro, LKML, syzkaller

[-- Attachment #1: Type: text/plain, Size: 986 bytes --]



On 20.01.2017 20:05, Eric W. Biederman wrote:
> Nikolay Borisov <n.borisov.lkml@gmail.com> writes:
> 
>> On 20.01.2017 15:07, Dmitry Vyukov wrote:
>>> Hello,
>>>
>>> I've got the following deadlock report while running syzkaller fuzzer
>>> on eec0d3d065bfcdf9cd5f56dd2a36b94d12d32297 of linux-next (on odroid
>>> device if it matters):
> 
> I am puzzled; I thought we had fixed this with:
>   add7c65ca426 ("pid: fix lockdep deadlock warning due to ucount_lock")
> But apparently not.  We just moved it from hardirq to softirq context.  Bah.
> 
> Thank you very much for the report.
> 
> Nikolay, can you make your change use spin_lock_irq?  And have put_ucounts
> do spin_lock_irqsave?  That way we just don't care where we call this.

Like the one attached? I haven't really taken a careful look at whether
the functions where the _irq versions are used fiddle with the irq state,
since this might cause a problem if we unconditionally enable irqs.

> 
> I am tired of being clever.
> 
> Eric
> 
> 

[-- Attachment #2: 0001-userns-Make-ucounts-lock-softirq-safe.patch --]
[-- Type: text/x-patch, Size: 2488 bytes --]

From 0aa66c85afdac0cd07fabdf899c173c6dca2b6e7 Mon Sep 17 00:00:00 2001
From: Nikolay Borisov <n.borisov.lkml@gmail.com>
Date: Fri, 20 Jan 2017 15:21:35 +0200
Subject: [PATCH] userns: Make ucounts lock softirq-safe

The ucounts_lock is used to protect various parts of the ucounts
lifecycle management. However, those can also be invoked when a pidns
is being freed in an RCU callback (i.e. softirq context), which can
lead to deadlocks. There was already an effort to prevent a similar
deadlock in add7c65ca426 ("pid: fix lockdep deadlock warning due to
ucount_lock"); however, it just moved the context from hardirq to
softirq. Fix this issue once and for all by explicitly making the lock
disable irqs altogether.

Fixes: add7c65ca426 ("pid: fix lockdep deadlock warning due to ucount_lock")
Link: https://www.spinics.net/lists/kernel/msg2426637.html
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
---
 kernel/ucount.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/ucount.c b/kernel/ucount.c
index b4aaee935b3e..68716403b261 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -132,10 +132,10 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 	struct hlist_head *hashent = ucounts_hashentry(ns, uid);
 	struct ucounts *ucounts, *new;
 
-	spin_lock(&ucounts_lock);
+	spin_lock_irq(&ucounts_lock);
 	ucounts = find_ucounts(ns, uid, hashent);
 	if (!ucounts) {
-		spin_unlock(&ucounts_lock);
+		spin_unlock_irq(&ucounts_lock);
 
 		new = kzalloc(sizeof(*new), GFP_KERNEL);
 		if (!new)
@@ -145,7 +145,7 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 		new->uid = uid;
 		atomic_set(&new->count, 0);
 
-		spin_lock(&ucounts_lock);
+		spin_lock_irq(&ucounts_lock);
 		ucounts = find_ucounts(ns, uid, hashent);
 		if (ucounts) {
 			kfree(new);
@@ -156,16 +156,18 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 	}
 	if (!atomic_add_unless(&ucounts->count, 1, INT_MAX))
 		ucounts = NULL;
-	spin_unlock(&ucounts_lock);
+	spin_unlock_irq(&ucounts_lock);
 	return ucounts;
 }
 
 static void put_ucounts(struct ucounts *ucounts)
 {
+	unsigned long flags;
+
 	if (atomic_dec_and_test(&ucounts->count)) {
-		spin_lock(&ucounts_lock);
+		spin_lock_irqsave(&ucounts_lock, flags);
 		hlist_del_init(&ucounts->node);
-		spin_unlock(&ucounts_lock);
+		spin_unlock_irqrestore(&ucounts_lock, flags);
 
 		kfree(ucounts);
 	}
-- 
2.7.4



* Re: namespace: deadlock in dec_pid_namespaces
  2017-01-20 22:44     ` Nikolay Borisov
@ 2017-01-21  0:28       ` Eric W. Biederman
  2017-01-23  9:35         ` Dmitry Vyukov
  0 siblings, 1 reply; 6+ messages in thread
From: Eric W. Biederman @ 2017-01-21  0:28 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Dmitry Vyukov, avagin, serge, Kees Cook, Al Viro, LKML, syzkaller

Nikolay Borisov <n.borisov.lkml@gmail.com> writes:

> On 20.01.2017 20:05, Eric W. Biederman wrote:
>> Nikolay Borisov <n.borisov.lkml@gmail.com> writes:
>> 
>>> On 20.01.2017 15:07, Dmitry Vyukov wrote:
>>>> Hello,
>>>>
>>>> I've got the following deadlock report while running syzkaller fuzzer
>>>> on eec0d3d065bfcdf9cd5f56dd2a36b94d12d32297 of linux-next (on odroid
>>>> device if it matters):
>> 
>> I am puzzled; I thought we had fixed this with:
>>   add7c65ca426 ("pid: fix lockdep deadlock warning due to ucount_lock")
>> But apparently not.  We just moved it from hardirq to softirq context.  Bah.
>> 
>> Thank you very much for the report.
>> 
>> Nikolay, can you make your change use spin_lock_irq?  And have put_ucounts
>> do spin_lock_irqsave?  That way we just don't care where we call this.
>
> Like the one attached?

Exactly, thank you.  Dmitry, if you have time to test that patch and
verify it fixes your issue, I would appreciate it.

> I haven't really taken a careful look at whether
> the functions where the _irq versions are used fiddle with the irq state,
> since this might cause a problem if we unconditionally enable irqs.

In code paths where we can sleep, irqs must come in enabled or it's a
bug.

spin_lock_irq, which unconditionally disables irqs, is thus safe on the
allocation path.

Similarly, spin_unlock_irq, which unconditionally enables irqs, is also
safe on the allocation path.
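
Concretely, on the allocation path (sketching the relevant part of
get_ucounts() from the patch, with the constraint in the comments):

	spin_lock_irq(&ucounts_lock);		/* irqs were on, now off */
	ucounts = find_ucounts(ns, uid, hashent);
	if (!ucounts) {
		spin_unlock_irq(&ucounts_lock);	/* irqs back on... */

		/* ...which is required here anyway: kzalloc(GFP_KERNEL)
		 * may sleep, and sleeping with irqs disabled is a bug. */
		new = kzalloc(sizeof(*new), GFP_KERNEL);
		...
	}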

Eric


* Re: namespace: deadlock in dec_pid_namespaces
  2017-01-21  0:28       ` Eric W. Biederman
@ 2017-01-23  9:35         ` Dmitry Vyukov
  0 siblings, 0 replies; 6+ messages in thread
From: Dmitry Vyukov @ 2017-01-23  9:35 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Nikolay Borisov, avagin, serge, Kees Cook, Al Viro, LKML, syzkaller

On Sat, Jan 21, 2017 at 1:28 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Nikolay Borisov <n.borisov.lkml@gmail.com> writes:
>
>> On 20.01.2017 20:05, Eric W. Biederman wrote:
>>> Nikolay Borisov <n.borisov.lkml@gmail.com> writes:
>>>
>>>> On 20.01.2017 15:07, Dmitry Vyukov wrote:
>>>>> Hello,
>>>>>
>>>>> I've got the following deadlock report while running syzkaller fuzzer
>>>>> on eec0d3d065bfcdf9cd5f56dd2a36b94d12d32297 of linux-next (on odroid
>>>>> device if it matters):
>>>
>>> I am puzzled; I thought we had fixed this with:
>>>   add7c65ca426 ("pid: fix lockdep deadlock warning due to ucount_lock")
>>> But apparently not.  We just moved it from hardirq to softirq context.  Bah.
>>>
>>> Thank you very much for the report.
>>>
>>> Nikolay, can you make your change use spin_lock_irq?  And have put_ucounts
>>> do spin_lock_irqsave?  That way we just don't care where we call this.
>>
>> Like the one attached?
>
> Exactly, thank you.  Dmitry, if you have time to test that patch and
> verify it fixes your issue, I would appreciate it.
>
>> I haven't really taken a careful look at whether
>> the functions where the _irq versions are used fiddle with the irq state,
>> since this might cause a problem if we unconditionally enable irqs.
>
> In code paths where we can sleep, irqs must come in enabled or it's a
> bug.
>
> spin_lock_irq, which unconditionally disables irqs, is thus safe on the
> allocation path.
>
> Similarly, spin_unlock_irq, which unconditionally enables irqs, is also
> safe on the allocation path.

Yes, it fixes the issue for me:

Tested-by: Dmitry Vyukov <dvyukov@google.com>

