* mm: possible deadlock in mm_take_all_locks
From: Dmitry Vyukov @ 2016-01-08 16:58 UTC
To: Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm
Cc: syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
Sasha Levin
Hello,
I've hit the following deadlock warning while running the syzkaller fuzzer
on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
understand, this is a false positive, because both call stacks are
protected by mm_all_locks_mutex. What would be a way to annotate such
a locking discipline?
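For reference, both paths funnel through vm_lock_mapping(), which looks
roughly like this (paraphrasing mm/mmap.c on that commit; a sketch, not
the exact source):

	static void vm_lock_mapping(struct mm_struct *mm,
				    struct address_space *mapping)
	{
		if (!test_bit(AS_MM_ALL_LOCKS, &mapping->flags)) {
			/*
			 * AS_MM_ALL_LOCKS can't change from under us
			 * because we hold mm_all_locks_mutex.
			 */
			if (test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags))
				BUG();
			down_write_nest_lock(&mapping->i_mmap_rwsem,
					     &mm->mmap_sem);
		}
	}

Every i_mmap_rwsem is taken with a nest-lock annotation against mmap_sem,
and mm_all_locks_mutex serializes concurrent callers, which is why I would
expect lockdep to tolerate the arbitrary per-mapping order here.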
======================================================
[ INFO: possible circular locking dependency detected ]
4.4.0-rc8+ #211 Not tainted
-------------------------------------------------------
syz-executor/11520 is trying to acquire lock:
(&mapping->i_mmap_rwsem){++++..}, at: [< inline >]
vm_lock_mapping mm/mmap.c:3159
(&mapping->i_mmap_rwsem){++++..}, at: [<ffffffff816e2e6d>]
mm_take_all_locks+0x1bd/0x5f0 mm/mmap.c:3207
but task is already holding lock:
(&hugetlbfs_i_mmap_rwsem_key){+.+...}, at: [< inline >]
vm_lock_mapping mm/mmap.c:3159
(&hugetlbfs_i_mmap_rwsem_key){+.+...}, at: [<ffffffff816e2e6d>]
mm_take_all_locks+0x1bd/0x5f0 mm/mmap.c:3207
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&hugetlbfs_i_mmap_rwsem_key){+.+...}:
[<ffffffff814472ec>] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3585
[<ffffffff81434989>] _down_write_nest_lock+0x49/0xa0
kernel/locking/rwsem.c:129
[< inline >] vm_lock_mapping mm/mmap.c:3159
[<ffffffff816e2e6d>] mm_take_all_locks+0x1bd/0x5f0 mm/mmap.c:3207
[<ffffffff817295a8>] do_mmu_notifier_register+0x328/0x420
mm/mmu_notifier.c:267
[<ffffffff817296c2>] mmu_notifier_register+0x22/0x30
mm/mmu_notifier.c:317
[< inline >] kvm_init_mmu_notifier
arch/x86/kvm/../../../virt/kvm/kvm_main.c:474
[< inline >] kvm_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:592
[< inline >] kvm_dev_ioctl_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2966
[<ffffffff8101acea>] kvm_dev_ioctl+0x72a/0x920
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2995
[< inline >] vfs_ioctl fs/ioctl.c:43
[<ffffffff817b66f1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
[< inline >] SYSC_ioctl fs/ioctl.c:622
[<ffffffff817b6f3f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
[<ffffffff85e77af6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
-> #0 (&mapping->i_mmap_rwsem){++++..}:
[< inline >] check_prev_add kernel/locking/lockdep.c:1853
[< inline >] check_prevs_add kernel/locking/lockdep.c:1958
[< inline >] validate_chain kernel/locking/lockdep.c:2144
[<ffffffff8144398d>] __lock_acquire+0x320d/0x4720
kernel/locking/lockdep.c:3206
[< inline >] __lock_release kernel/locking/lockdep.c:3432
[<ffffffff81447e17>] lock_release+0x697/0xce0
kernel/locking/lockdep.c:3604
[<ffffffff81434ada>] up_write+0x1a/0x60 kernel/locking/rwsem.c:91
[< inline >] i_mmap_unlock_write include/linux/fs.h:504
[< inline >] vm_unlock_mapping mm/mmap.c:3254
[<ffffffff816e2bf6>] mm_drop_all_locks+0x266/0x320 mm/mmap.c:3278
[<ffffffff81729506>] do_mmu_notifier_register+0x286/0x420
mm/mmu_notifier.c:292
[<ffffffff817296c2>] mmu_notifier_register+0x22/0x30
mm/mmu_notifier.c:317
[< inline >] kvm_init_mmu_notifier
arch/x86/kvm/../../../virt/kvm/kvm_main.c:474
[< inline >] kvm_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:592
[< inline >] kvm_dev_ioctl_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2966
[<ffffffff8101acea>] kvm_dev_ioctl+0x72a/0x920
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2995
[< inline >] vfs_ioctl fs/ioctl.c:43
[<ffffffff817b66f1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
[< inline >] SYSC_ioctl fs/ioctl.c:622
[<ffffffff817b6f3f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
[<ffffffff85e77af6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&hugetlbfs_i_mmap_rwsem_key);
                               lock(&mapping->i_mmap_rwsem);
                               lock(&hugetlbfs_i_mmap_rwsem_key);
  lock(&mapping->i_mmap_rwsem);
*** DEADLOCK ***
3 locks held by syz-executor/11520:
#0: (&mm->mmap_sem){++++++}, at: [<ffffffff817295a0>]
do_mmu_notifier_register+0x320/0x420 mm/mmu_notifier.c:266
#1: (mm_all_locks_mutex){+.+...}, at: [<ffffffff816e2cf7>]
mm_take_all_locks+0x47/0x5f0 mm/mmap.c:3201
#2: (&hugetlbfs_i_mmap_rwsem_key){+.+...}, at: [< inline >]
vm_lock_mapping mm/mmap.c:3159
#2: (&hugetlbfs_i_mmap_rwsem_key){+.+...}, at: [<ffffffff816e2e6d>]
mm_take_all_locks+0x1bd/0x5f0 mm/mmap.c:3207
stack backtrace:
CPU: 2 PID: 11520 Comm: syz-executor Not tainted 4.4.0-rc8+ #211
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
00000000ffffffff ffff88003613fa10 ffffffff82907ccd ffffffff88911190
ffffffff88911190 ffffffff889321c0 ffff88003613fa60 ffffffff8143cb68
ffff880034bbaf00 ffff880034bbb73a 0000000000000000 ffff880034bbb718
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82907ccd>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
[<ffffffff8143cb68>] print_circular_bug+0x288/0x340
kernel/locking/lockdep.c:1226
[< inline >] check_prev_add kernel/locking/lockdep.c:1853
[< inline >] check_prevs_add kernel/locking/lockdep.c:1958
[< inline >] validate_chain kernel/locking/lockdep.c:2144
[<ffffffff8144398d>] __lock_acquire+0x320d/0x4720 kernel/locking/lockdep.c:3206
[< inline >] __lock_release kernel/locking/lockdep.c:3432
[<ffffffff81447e17>] lock_release+0x697/0xce0 kernel/locking/lockdep.c:3604
[<ffffffff81434ada>] up_write+0x1a/0x60 kernel/locking/rwsem.c:91
[< inline >] i_mmap_unlock_write include/linux/fs.h:504
[< inline >] vm_unlock_mapping mm/mmap.c:3254
[<ffffffff816e2bf6>] mm_drop_all_locks+0x266/0x320 mm/mmap.c:3278
[<ffffffff81729506>] do_mmu_notifier_register+0x286/0x420 mm/mmu_notifier.c:292
[<ffffffff817296c2>] mmu_notifier_register+0x22/0x30 mm/mmu_notifier.c:317
[< inline >] kvm_init_mmu_notifier
arch/x86/kvm/../../../virt/kvm/kvm_main.c:474
[< inline >] kvm_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:592
[< inline >] kvm_dev_ioctl_create_vm
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2966
[<ffffffff8101acea>] kvm_dev_ioctl+0x72a/0x920
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2995
[< inline >] vfs_ioctl fs/ioctl.c:43
[<ffffffff817b66f1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
[< inline >] SYSC_ioctl fs/ioctl.c:622
[<ffffffff817b6f3f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
[<ffffffff85e77af6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
* Re: mm: possible deadlock in mm_take_all_locks
From: Kirill A. Shutemov @ 2016-01-08 23:23 UTC
To: Dmitry Vyukov, Michal Hocko
Cc: Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm,
syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
Sasha Levin
On Fri, Jan 08, 2016 at 05:58:33PM +0100, Dmitry Vyukov wrote:
> Hello,
>
> I've hit the following deadlock warning while running the syzkaller fuzzer
> on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
> understand, this is a false positive, because both call stacks are
> protected by mm_all_locks_mutex.
+Michal
I don't think it's a false positive.
The reason we don't care about the order of taking i_mmap_rwsem is that we
never take one i_mmap_rwsem under another, but that's not true for
i_mmap_rwsem vs. hugetlbfs_i_mmap_rwsem_key. That's why we have the
annotation in the first place.
See commit b610ded71918 ("hugetlb: fix lockdep splat caused by pmd
sharing").
Consider the totally untested patch below.
diff --git a/mm/mmap.c b/mm/mmap.c
index 2ce04a649f6b..63aefcf409e1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3203,7 +3203,16 @@ int mm_take_all_locks(struct mm_struct *mm)
for (vma = mm->mmap; vma; vma = vma->vm_next) {
if (signal_pending(current))
goto out_unlock;
- if (vma->vm_file && vma->vm_file->f_mapping)
+ if (vma->vm_file && vma->vm_file->f_mapping &&
+ !is_vm_hugetlb_page(vma))
+ vm_lock_mapping(mm, vma->vm_file->f_mapping);
+ }
+
+ for (vma = mm->mmap; vma; vma = vma->vm_next) {
+ if (signal_pending(current))
+ goto out_unlock;
+ if (vma->vm_file && vma->vm_file->f_mapping &&
+ is_vm_hugetlb_page(vma))
vm_lock_mapping(mm, vma->vm_file->f_mapping);
}
--
Kirill A. Shutemov
* Re: mm: possible deadlock in mm_take_all_locks
From: Dmitry Vyukov @ 2016-01-10 8:05 UTC
To: Kirill A. Shutemov
Cc: Michal Hocko, Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm,
syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
Sasha Levin
On Sat, Jan 9, 2016 at 12:23 AM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> On Fri, Jan 08, 2016 at 05:58:33PM +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> I've hit the following deadlock warning while running the syzkaller fuzzer
>> on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
>> understand, this is a false positive, because both call stacks are
>> protected by mm_all_locks_mutex.
>
> +Michal
>
> I don't think it's a false positive.
>
> The reason we don't care about the order of taking i_mmap_rwsem is that we
> never take one i_mmap_rwsem under another, but that's not true for
> i_mmap_rwsem vs. hugetlbfs_i_mmap_rwsem_key. That's why we have the
> annotation in the first place.
>
> See commit b610ded71918 ("hugetlb: fix lockdep splat caused by pmd
> sharing").
The description of b610ded71918 suggests that the code in question takes
the hugetlb lock first and then the normal page lock. In this patch you
take them in the opposite order: normal lock, then hugetlb lock. Won't
this patch only increase the probability of deadlocks? Shouldn't you
take them in the opposite order?
> Consider the totally untested patch below.
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2ce04a649f6b..63aefcf409e1 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3203,7 +3203,16 @@ int mm_take_all_locks(struct mm_struct *mm)
> for (vma = mm->mmap; vma; vma = vma->vm_next) {
> if (signal_pending(current))
> goto out_unlock;
> - if (vma->vm_file && vma->vm_file->f_mapping)
> + if (vma->vm_file && vma->vm_file->f_mapping &&
> + !is_vm_hugetlb_page(vma))
> + vm_lock_mapping(mm, vma->vm_file->f_mapping);
> + }
> +
> + for (vma = mm->mmap; vma; vma = vma->vm_next) {
> + if (signal_pending(current))
> + goto out_unlock;
> + if (vma->vm_file && vma->vm_file->f_mapping &&
> + is_vm_hugetlb_page(vma))
> vm_lock_mapping(mm, vma->vm_file->f_mapping);
> }
>
> --
> Kirill A. Shutemov
* Re: mm: possible deadlock in mm_take_all_locks
From: Kirill A. Shutemov @ 2016-01-10 20:39 UTC
To: Dmitry Vyukov
Cc: Michal Hocko, Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm,
syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
Sasha Levin
On Sun, Jan 10, 2016 at 09:05:32AM +0100, Dmitry Vyukov wrote:
> On Sat, Jan 9, 2016 at 12:23 AM, Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> > On Fri, Jan 08, 2016 at 05:58:33PM +0100, Dmitry Vyukov wrote:
> >> Hello,
> >>
> >> I've hit the following deadlock warning while running the syzkaller fuzzer
> >> on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
> >> understand, this is a false positive, because both call stacks are
> >> protected by mm_all_locks_mutex.
> >
> > +Michal
> >
> > I don't think it's a false positive.
> >
> > The reason we don't care about the order of taking i_mmap_rwsem is that we
> > never take one i_mmap_rwsem under another, but that's not true for
> > i_mmap_rwsem vs. hugetlbfs_i_mmap_rwsem_key. That's why we have the
> > annotation in the first place.
> >
> > See commit b610ded71918 ("hugetlb: fix lockdep splat caused by pmd
> > sharing").
>
> The description of b610ded71918 suggests that the code in question takes
> the hugetlb lock first and then the normal page lock. In this patch you
> take them in the opposite order: normal lock, then hugetlb lock. Won't
> this patch only increase the probability of deadlocks? Shouldn't you
> take them in the opposite order?
You are right, I got it wrong: the conditions should be reversed.
The comment around the hugetlbfs_i_mmap_rwsem_key definition is somewhat
confusing:
"This needs an annotation because huge_pmd_share() does an allocation
under i_mmap_rwsem."
I read this as saying that we do the hugetlb allocation with i_mmap_rwsem
already taken, and chose the locking order accordingly. I guess
i_mmap_rwsem should be replaced with hugetlbfs_i_mmap_rwsem_key in the
comment.
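With the conditions swapped, the loops would look roughly like this
(equally untested):

	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		if (signal_pending(current))
			goto out_unlock;
		/* hugetlb mappings first, matching huge_pmd_share() */
		if (vma->vm_file && vma->vm_file->f_mapping &&
				is_vm_hugetlb_page(vma))
			vm_lock_mapping(mm, vma->vm_file->f_mapping);
	}

	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		if (signal_pending(current))
			goto out_unlock;
		if (vma->vm_file && vma->vm_file->f_mapping &&
				!is_vm_hugetlb_page(vma))
			vm_lock_mapping(mm, vma->vm_file->f_mapping);
	}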
--
Kirill A. Shutemov
* Re: mm: possible deadlock in mm_take_all_locks
From: Dmitry Vyukov @ 2016-01-11 9:04 UTC
To: Kirill A. Shutemov
Cc: Michal Hocko, Peter Zijlstra, Ingo Molnar, LKML, Andrew Morton,
Kirill A. Shutemov, Oleg Nesterov, Chen Gang, linux-mm,
syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet,
Sasha Levin
On Sun, Jan 10, 2016 at 9:39 PM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> On Sun, Jan 10, 2016 at 09:05:32AM +0100, Dmitry Vyukov wrote:
>> On Sat, Jan 9, 2016 at 12:23 AM, Kirill A. Shutemov
>> <kirill@shutemov.name> wrote:
>> > On Fri, Jan 08, 2016 at 05:58:33PM +0100, Dmitry Vyukov wrote:
>> >> Hello,
>> >>
>> >> I've hit the following deadlock warning while running the syzkaller fuzzer
>> >> on commit b06f3a168cdcd80026276898fd1fee443ef25743. As far as I
>> >> understand, this is a false positive, because both call stacks are
>> >> protected by mm_all_locks_mutex.
>> >
>> > +Michal
>> >
>> > I don't think it's a false positive.
>> >
>> > The reason we don't care about the order of taking i_mmap_rwsem is that we
>> > never take one i_mmap_rwsem under another, but that's not true for
>> > i_mmap_rwsem vs. hugetlbfs_i_mmap_rwsem_key. That's why we have the
>> > annotation in the first place.
>> >
>> > See commit b610ded71918 ("hugetlb: fix lockdep splat caused by pmd
>> > sharing").
>>
>> The description of b610ded71918 suggests that the code in question takes
>> the hugetlb lock first and then the normal page lock. In this patch you
>> take them in the opposite order: normal lock, then hugetlb lock. Won't
>> this patch only increase the probability of deadlocks? Shouldn't you
>> take them in the opposite order?
>
> You are right, I got it wrong: the conditions should be reversed.
>
> The comment around the hugetlbfs_i_mmap_rwsem_key definition is somewhat
> confusing:
>
> "This needs an annotation because huge_pmd_share() does an allocation
> under i_mmap_rwsem."
>
> I read this as saying that we do the hugetlb allocation with i_mmap_rwsem
> already taken, and chose the locking order accordingly. I guess
> i_mmap_rwsem should be replaced with hugetlbfs_i_mmap_rwsem_key in the
> comment.
The comment on mm_take_all_locks probably also needs updating.
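Perhaps something along these lines (a suggested wording only):

	/*
	 * Locks are taken in the following order, mirroring the order
	 * huge_pmd_share() establishes:
	 *   - all hugetlbfs_i_mmap_rwsem_key locks;
	 *   - all other i_mmap_rwsem locks;
	 *   - all anon_vma->rwsem locks.
	 */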