linux-kernel.vger.kernel.org archive mirror
* mm: possible deadlock in mm_take_all_locks
@ 2016-01-08 16:58 Dmitry Vyukov
  2016-01-08 23:23 ` Kirill A. Shutemov
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Vyukov @ 2016-01-08 16:58 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
	Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm
  Cc: syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
	Sasha Levin

Hello,

I've hit the following deadlock warning while running syzkaller fuzzer
on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
understand this is a false positive, because both call stacks are
protected by mm_all_locks_mutex. What would be a good way to annotate
such a locking discipline?


======================================================
[ INFO: possible circular locking dependency detected ]
4.4.0-rc8+ #211 Not tainted
-------------------------------------------------------
syz-executor/11520 is trying to acquire lock:
 (&mapping->i_mmap_rwsem){++++..}, at: [<     inline     >]
vm_lock_mapping mm/mmap.c:3159
 (&mapping->i_mmap_rwsem){++++..}, at: [<ffffffff816e2e6d>]
mm_take_all_locks+0x1bd/0x5f0 mm/mmap.c:3207

but task is already holding lock:
 (&hugetlbfs_i_mmap_rwsem_key){+.+...}, at: [<     inline     >]
vm_lock_mapping mm/mmap.c:3159
 (&hugetlbfs_i_mmap_rwsem_key){+.+...}, at: [<ffffffff816e2e6d>]
mm_take_all_locks+0x1bd/0x5f0 mm/mmap.c:3207

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&hugetlbfs_i_mmap_rwsem_key){+.+...}:
       [<ffffffff814472ec>] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3585
       [<ffffffff81434989>] _down_write_nest_lock+0x49/0xa0
kernel/locking/rwsem.c:129
       [<     inline     >] vm_lock_mapping mm/mmap.c:3159
       [<ffffffff816e2e6d>] mm_take_all_locks+0x1bd/0x5f0 mm/mmap.c:3207
       [<ffffffff817295a8>] do_mmu_notifier_register+0x328/0x420
mm/mmu_notifier.c:267
       [<ffffffff817296c2>] mmu_notifier_register+0x22/0x30
mm/mmu_notifier.c:317
       [<     inline     >] kvm_init_mmu_notifier
arch/x86/kvm/../../../virt/kvm/kvm_main.c:474
       [<     inline     >] kvm_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:592
       [<     inline     >] kvm_dev_ioctl_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2966
       [<ffffffff8101acea>] kvm_dev_ioctl+0x72a/0x920
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2995
       [<     inline     >] vfs_ioctl fs/ioctl.c:43
       [<ffffffff817b66f1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
       [<     inline     >] SYSC_ioctl fs/ioctl.c:622
       [<ffffffff817b6f3f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
       [<ffffffff85e77af6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

-> #0 (&mapping->i_mmap_rwsem){++++..}:
       [<     inline     >] check_prev_add kernel/locking/lockdep.c:1853
       [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1958
       [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
       [<ffffffff8144398d>] __lock_acquire+0x320d/0x4720
kernel/locking/lockdep.c:3206
       [<     inline     >] __lock_release kernel/locking/lockdep.c:3432
       [<ffffffff81447e17>] lock_release+0x697/0xce0
kernel/locking/lockdep.c:3604
       [<ffffffff81434ada>] up_write+0x1a/0x60 kernel/locking/rwsem.c:91
       [<     inline     >] i_mmap_unlock_write include/linux/fs.h:504
       [<     inline     >] vm_unlock_mapping mm/mmap.c:3254
       [<ffffffff816e2bf6>] mm_drop_all_locks+0x266/0x320 mm/mmap.c:3278
       [<ffffffff81729506>] do_mmu_notifier_register+0x286/0x420
mm/mmu_notifier.c:292
       [<ffffffff817296c2>] mmu_notifier_register+0x22/0x30
mm/mmu_notifier.c:317
       [<     inline     >] kvm_init_mmu_notifier
arch/x86/kvm/../../../virt/kvm/kvm_main.c:474
       [<     inline     >] kvm_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:592
       [<     inline     >] kvm_dev_ioctl_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2966
       [<ffffffff8101acea>] kvm_dev_ioctl+0x72a/0x920
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2995
       [<     inline     >] vfs_ioctl fs/ioctl.c:43
       [<ffffffff817b66f1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
       [<     inline     >] SYSC_ioctl fs/ioctl.c:622
       [<ffffffff817b6f3f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
       [<ffffffff85e77af6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&hugetlbfs_i_mmap_rwsem_key);
                               lock(&mapping->i_mmap_rwsem);
                               lock(&hugetlbfs_i_mmap_rwsem_key);
  lock(&mapping->i_mmap_rwsem);

 *** DEADLOCK ***

3 locks held by syz-executor/11520:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff817295a0>]
do_mmu_notifier_register+0x320/0x420 mm/mmu_notifier.c:266
 #1:  (mm_all_locks_mutex){+.+...}, at: [<ffffffff816e2cf7>]
mm_take_all_locks+0x47/0x5f0 mm/mmap.c:3201
 #2:  (&hugetlbfs_i_mmap_rwsem_key){+.+...}, at: [<     inline     >]
vm_lock_mapping mm/mmap.c:3159
 #2:  (&hugetlbfs_i_mmap_rwsem_key){+.+...}, at: [<ffffffff816e2e6d>]
mm_take_all_locks+0x1bd/0x5f0 mm/mmap.c:3207

stack backtrace:
CPU: 2 PID: 11520 Comm: syz-executor Not tainted 4.4.0-rc8+ #211
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 00000000ffffffff ffff88003613fa10 ffffffff82907ccd ffffffff88911190
 ffffffff88911190 ffffffff889321c0 ffff88003613fa60 ffffffff8143cb68
 ffff880034bbaf00 ffff880034bbb73a 0000000000000000 ffff880034bbb718
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff82907ccd>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
 [<ffffffff8143cb68>] print_circular_bug+0x288/0x340
kernel/locking/lockdep.c:1226
 [<     inline     >] check_prev_add kernel/locking/lockdep.c:1853
 [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1958
 [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
 [<ffffffff8144398d>] __lock_acquire+0x320d/0x4720 kernel/locking/lockdep.c:3206
 [<     inline     >] __lock_release kernel/locking/lockdep.c:3432
 [<ffffffff81447e17>] lock_release+0x697/0xce0 kernel/locking/lockdep.c:3604
 [<ffffffff81434ada>] up_write+0x1a/0x60 kernel/locking/rwsem.c:91
 [<     inline     >] i_mmap_unlock_write include/linux/fs.h:504
 [<     inline     >] vm_unlock_mapping mm/mmap.c:3254
 [<ffffffff816e2bf6>] mm_drop_all_locks+0x266/0x320 mm/mmap.c:3278
 [<ffffffff81729506>] do_mmu_notifier_register+0x286/0x420 mm/mmu_notifier.c:292
 [<ffffffff817296c2>] mmu_notifier_register+0x22/0x30 mm/mmu_notifier.c:317
 [<     inline     >] kvm_init_mmu_notifier
arch/x86/kvm/../../../virt/kvm/kvm_main.c:474
 [<     inline     >] kvm_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:592
 [<     inline     >] kvm_dev_ioctl_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2966
 [<ffffffff8101acea>] kvm_dev_ioctl+0x72a/0x920
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2995
 [<     inline     >] vfs_ioctl fs/ioctl.c:43
 [<ffffffff817b66f1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
 [<     inline     >] SYSC_ioctl fs/ioctl.c:622
 [<ffffffff817b6f3f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
 [<ffffffff85e77af6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185


* Re: mm: possible deadlock in mm_take_all_locks
  2016-01-08 16:58 mm: possible deadlock in mm_take_all_locks Dmitry Vyukov
@ 2016-01-08 23:23 ` Kirill A. Shutemov
  2016-01-10  8:05   ` Dmitry Vyukov
  0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2016-01-08 23:23 UTC (permalink / raw)
  To: Dmitry Vyukov, Michal Hocko
  Cc: Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
	Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm,
	syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
	Sasha Levin

On Fri, Jan 08, 2016 at 05:58:33PM +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I've hit the following deadlock warning while running syzkaller fuzzer
> on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
> understand this is a false positive, because both call stacks are
> protected by mm_all_locks_mutex.

+Michal

I don't think it's false positive.

The reason we don't care about the order of taking i_mmap_rwsem is that we
never take one i_mmap_rwsem under another, but that's not true for
i_mmap_rwsem vs. hugetlbfs_i_mmap_rwsem_key. That's why we have the
annotation in the first place.

See commit b610ded71918 ("hugetlb: fix lockdep splat caused by pmd
sharing").

Consider totally untested patch below.

diff --git a/mm/mmap.c b/mm/mmap.c
index 2ce04a649f6b..63aefcf409e1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3203,7 +3203,16 @@ int mm_take_all_locks(struct mm_struct *mm)
 	for (vma = mm->mmap; vma; vma = vma->vm_next) {
 		if (signal_pending(current))
 			goto out_unlock;
-		if (vma->vm_file && vma->vm_file->f_mapping)
+		if (vma->vm_file && vma->vm_file->f_mapping &&
+				!is_vm_hugetlb_page(vma))
+			vm_lock_mapping(mm, vma->vm_file->f_mapping);
+	}
+
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		if (signal_pending(current))
+			goto out_unlock;
+		if (vma->vm_file && vma->vm_file->f_mapping &&
+				is_vm_hugetlb_page(vma))
 			vm_lock_mapping(mm, vma->vm_file->f_mapping);
 	}
 
-- 
 Kirill A. Shutemov


* Re: mm: possible deadlock in mm_take_all_locks
  2016-01-08 23:23 ` Kirill A. Shutemov
@ 2016-01-10  8:05   ` Dmitry Vyukov
  2016-01-10 20:39     ` Kirill A. Shutemov
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Vyukov @ 2016-01-10  8:05 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Michal Hocko, Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
	Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm,
	syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
	Sasha Levin

On Sat, Jan 9, 2016 at 12:23 AM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> On Fri, Jan 08, 2016 at 05:58:33PM +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> I've hit the following deadlock warning while running syzkaller fuzzer
>> on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
>> understand this is a false positive, because both call stacks are
>> protected by mm_all_locks_mutex.
>
> +Michal
>
> I don't think it's false positive.
>
> The reason we don't care about the order of taking i_mmap_rwsem is that we
> never take one i_mmap_rwsem under another, but that's not true for
> i_mmap_rwsem vs. hugetlbfs_i_mmap_rwsem_key. That's why we have the
> annotation in the first place.
>
> See commit b610ded71918 ("hugetlb: fix lockdep splat caused by pmd
> sharing").

The description of b610ded71918 suggests that that code takes the hugetlb
mutex first and then the normal page mutex. In this patch you take them in
the opposite order: normal mutex, then hugetlb mutex. Won't this patch
only increase the probability of deadlocks? Shouldn't you take them in the
opposite order?


> Consider totally untested patch below.
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2ce04a649f6b..63aefcf409e1 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3203,7 +3203,16 @@ int mm_take_all_locks(struct mm_struct *mm)
>         for (vma = mm->mmap; vma; vma = vma->vm_next) {
>                 if (signal_pending(current))
>                         goto out_unlock;
> -               if (vma->vm_file && vma->vm_file->f_mapping)
> +               if (vma->vm_file && vma->vm_file->f_mapping &&
> +                               !is_vm_hugetlb_page(vma))
> +                       vm_lock_mapping(mm, vma->vm_file->f_mapping);
> +       }
> +
> +       for (vma = mm->mmap; vma; vma = vma->vm_next) {
> +               if (signal_pending(current))
> +                       goto out_unlock;
> +               if (vma->vm_file && vma->vm_file->f_mapping &&
> +                               is_vm_hugetlb_page(vma))
>                         vm_lock_mapping(mm, vma->vm_file->f_mapping);
>         }
>
> --
>  Kirill A. Shutemov


* Re: mm: possible deadlock in mm_take_all_locks
  2016-01-10  8:05   ` Dmitry Vyukov
@ 2016-01-10 20:39     ` Kirill A. Shutemov
  2016-01-11  9:04       ` Dmitry Vyukov
  0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2016-01-10 20:39 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Michal Hocko, Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
	Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm,
	syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
	Sasha Levin

On Sun, Jan 10, 2016 at 09:05:32AM +0100, Dmitry Vyukov wrote:
> On Sat, Jan 9, 2016 at 12:23 AM, Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> > On Fri, Jan 08, 2016 at 05:58:33PM +0100, Dmitry Vyukov wrote:
> >> Hello,
> >>
> >> I've hit the following deadlock warning while running syzkaller fuzzer
> >> on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
> >> understand this is a false positive, because both call stacks are
> >> protected by mm_all_locks_mutex.
> >
> > +Michal
> >
> > I don't think it's false positive.
> >
> > The reason we don't care about the order of taking i_mmap_rwsem is that we
> > never take one i_mmap_rwsem under another, but that's not true for
> > i_mmap_rwsem vs. hugetlbfs_i_mmap_rwsem_key. That's why we have the
> > annotation in the first place.
> >
> > See commit b610ded71918 ("hugetlb: fix lockdep splat caused by pmd
> > sharing").
> 
> The description of b610ded71918 suggests that that code takes the hugetlb
> mutex first and then the normal page mutex. In this patch you take them in
> the opposite order: normal mutex, then hugetlb mutex. Won't this patch
> only increase the probability of deadlocks? Shouldn't you take them in the
> opposite order?

You are right. I got it wrong. Conditions should be reversed.

The comment around hugetlbfs_i_mmap_rwsem_key definition is somewhat
confusing:

"This needs an annotation because huge_pmd_share() does an allocation
under i_mmap_rwsem."

I read this as saying that we do the hugetlb allocation while i_mmap_rwsem
is already taken, and chose the locking order accordingly. I guess
i_mmap_rwsem should be replaced with hugetlbfs_i_mmap_rwsem_key in the
comment.
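
Concretely, reversing the conditions in the earlier patch would make its
two loops read as follows (same caveat as before: untested kernel code,
spelled out here only to make the reversal explicit):

```c
	/* hugetlb mappings first: matches the hugetlb -> regular class
	 * order established by the pmd-sharing path (b610ded71918) */
	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		if (signal_pending(current))
			goto out_unlock;
		if (vma->vm_file && vma->vm_file->f_mapping &&
				is_vm_hugetlb_page(vma))
			vm_lock_mapping(mm, vma->vm_file->f_mapping);
	}

	/* ...then all remaining file mappings */
	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		if (signal_pending(current))
			goto out_unlock;
		if (vma->vm_file && vma->vm_file->f_mapping &&
				!is_vm_hugetlb_page(vma))
			vm_lock_mapping(mm, vma->vm_file->f_mapping);
	}
```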

-- 
 Kirill A. Shutemov


* Re: mm: possible deadlock in mm_take_all_locks
  2016-01-10 20:39     ` Kirill A. Shutemov
@ 2016-01-11  9:04       ` Dmitry Vyukov
  0 siblings, 0 replies; 5+ messages in thread
From: Dmitry Vyukov @ 2016-01-11  9:04 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Michal Hocko, Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
	Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm,
	syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
	Sasha Levin

On Sun, Jan 10, 2016 at 9:39 PM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> On Sun, Jan 10, 2016 at 09:05:32AM +0100, Dmitry Vyukov wrote:
>> On Sat, Jan 9, 2016 at 12:23 AM, Kirill A. Shutemov
>> <kirill@shutemov.name> wrote:
>> > On Fri, Jan 08, 2016 at 05:58:33PM +0100, Dmitry Vyukov wrote:
>> >> Hello,
>> >>
>> >> I've hit the following deadlock warning while running syzkaller fuzzer
>> >> on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
>> >> understand this is a false positive, because both call stacks are
>> >> protected by mm_all_locks_mutex.
>> >
>> > +Michal
>> >
>> > I don't think it's false positive.
>> >
>> > The reason we don't care about the order of taking i_mmap_rwsem is that we
>> > never take one i_mmap_rwsem under another, but that's not true for
>> > i_mmap_rwsem vs. hugetlbfs_i_mmap_rwsem_key. That's why we have the
>> > annotation in the first place.
>> >
>> > See commit b610ded71918 ("hugetlb: fix lockdep splat caused by pmd
>> > sharing").
>>
>> The description of b610ded71918 suggests that that code takes the hugetlb
>> mutex first and then the normal page mutex. In this patch you take them in
>> the opposite order: normal mutex, then hugetlb mutex. Won't this patch
>> only increase the probability of deadlocks? Shouldn't you take them in the
>> opposite order?
>
> You are right. I got it wrong. Conditions should be reversed.
>
> The comment around hugetlbfs_i_mmap_rwsem_key definition is somewhat
> confusing:
>
> "This needs an annotation because huge_pmd_share() does an allocation
> under i_mmap_rwsem."
>
> I read this as saying that we do the hugetlb allocation while i_mmap_rwsem
> is already taken, and chose the locking order accordingly. I guess
> i_mmap_rwsem should be replaced with hugetlbfs_i_mmap_rwsem_key in the
> comment.


Comment on mm_take_all_locks probably also needs updating.

