All of lore.kernel.org
 help / color / mirror / Atom feed
* WARNING: locking bug in lock_downgrade
@ 2018-11-26 20:38 syzbot
  2018-12-13  1:14 ` Yang Shi
  0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2018-11-26 20:38 UTC (permalink / raw)
  To: akpm, kirill.shutemov, linux-kernel, linux-mm, linux, mhocko,
	rientjes, syzkaller-bugs, vbabka, yang.shi

Hello,

syzbot found the following crash on:

HEAD commit:    e195ca6cb6f2 Merge branch 'for-linus' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12336ed5400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=73e2bc0cb6463446
dashboard link: https://syzkaller.appspot.com/bug?extid=53383ae265fb161ef488
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com


------------[ cut here ]------------
downgrading a read lock
WARNING: CPU: 0 PID: 26202 at kernel/locking/lockdep.c:3556  
__lock_downgrade kernel/locking/lockdep.c:3556 [inline]
WARNING: CPU: 0 PID: 26202 at kernel/locking/lockdep.c:3556  
lock_downgrade+0x4d7/0x900 kernel/locking/lockdep.c:3819
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 26202 Comm: blkid Not tainted 4.20.0-rc3+ #127
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  panic+0x2ad/0x55c kernel/panic.c:188
  __warn.cold.8+0x20/0x45 kernel/panic.c:540
  report_bug+0x254/0x2d0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:178 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
  do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:__lock_downgrade kernel/locking/lockdep.c:3556 [inline]
RIP: 0010:lock_downgrade+0x4d7/0x900 kernel/locking/lockdep.c:3819
Code: 00 00 fc ff df 41 c6 44 05 00 f8 e9 1b ff ff ff 48 c7 c7 60 68 2b 88  
4c 89 9d 58 ff ff ff 48 89 85 60 ff ff ff e8 d9 1f e7 ff <0f> 0b 48 8b 85  
60 ff ff ff 4c 8d 4d d8 4c 89 e9 48 ba 00 00 00 00
RSP: 0018:ffff8881ba547b70 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 1ffff110374a8f74 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff8165eaf5 RDI: 0000000000000006
RBP: ffff8881ba547c28 R08: ffff88817c0b2400 R09: fffffbfff12b2254
R10: fffffbfff12b2254 R11: ffffffff895912a3 R12: ffffffff8b0f67a0
R13: ffff8881ba547bc0 R14: 0000000000000001 R15: ffff88817c0b2400
  downgrade_write+0x76/0x270 kernel/locking/rwsem.c:147
  __do_munmap+0xcd8/0xf80 mm/mmap.c:2812
  __vm_munmap+0x138/0x1f0 mm/mmap.c:2837
  __do_sys_munmap mm/mmap.c:2862 [inline]
  __se_sys_munmap mm/mmap.c:2859 [inline]
  __x64_sys_munmap+0x65/0x80 mm/mmap.c:2859
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f237e5ce417
Code: f0 ff ff 73 01 c3 48 8d 0d 8a ad 20 00 31 d2 48 29 c2 89 11 48 83 c8  
ff eb eb 90 90 90 90 90 90 90 90 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff  
ff 73 01 c3 48 8d 0d 5d ad 20 00 31 d2 48 29 c2 89
RSP: 002b:00007ffe61bed9d8 EFLAGS: 00000203 ORIG_RAX: 000000000000000b
RAX: ffffffffffffffda RBX: 00007f237e7d91c8 RCX: 00007f237e5ce417
RDX: 000000000224ff00 RSI: 00000000000033ef RDI: 00007f237e7d1000
RBP: 00007ffe61bedb40 R08: 0000000000000001 R09: 0000000000000007
R10: 00007f237e5c8a0b R11: 0000000000000203 R12: 00000000b1f97d00
R13: 000004b2b1f97d00 R14: 000004b2afd5fdeb R15: 00007f237e7ce740
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING: locking bug in lock_downgrade
  2018-11-26 20:38 WARNING: locking bug in lock_downgrade syzbot
@ 2018-12-13  1:14 ` Yang Shi
  2018-12-13 19:46   ` Waiman Long
  0 siblings, 1 reply; 6+ messages in thread
From: Yang Shi @ 2018-12-13  1:14 UTC (permalink / raw)
  To: syzbot, akpm, kirill.shutemov, linux-kernel, linux-mm, linux,
	mhocko, rientjes, syzkaller-bugs, vbabka, Waiman Long, peterz,
	mingo, boqun.feng

Cc'ed Peter, Ingo and Waiman.


It took me a few days to look into this warning, but I got lost in 
lockdep code.


The problem is the commit dd2283f2605e ("mm: mmap: zap pages with read 
mmap_sem in munmap") does an optimization for munmap by downgrading 
write mmap_sem to read before zapping pages. But, lockdep reports 
downgrading a read lock.


I'm pretty sure mmap_sem is held as write before downgrade_write() is 
called in the patch. And, there are 4 places which may downgrade a mmap_sem:

     - munmap

     - mremap

     - brk

     - clear_refs_write (fs/proc/task_mmu.c)


The first three come from my patches, and they just do: 
down_write_killable() -> .. -> downgrade_write().

But the last one is a little bit more complicated, it does down_read() 
->.. -> up_read() ->.. -> down_write_killable() ->.. -> downgrade_write().

And, the last one may be called from any process to touch the other 
processes' mmap_sem.


By looking into lockdep code, I'm not sure if lockdep may get confused 
by such sequence or not?


Any hint is appreciated.


Regards,

Yang



On 11/26/18 12:38 PM, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    e195ca6cb6f2 Merge branch 'for-linus' of 
> git://git.kernel...
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=12336ed5400000
> kernel config: https://syzkaller.appspot.com/x/.config?x=73e2bc0cb6463446
> dashboard link: 
> https://syzkaller.appspot.com/bug?extid=53383ae265fb161ef488
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the 
> commit:
> Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
>
>
> ------------[ cut here ]------------
> downgrading a read lock
> WARNING: CPU: 0 PID: 26202 at kernel/locking/lockdep.c:3556 
> __lock_downgrade kernel/locking/lockdep.c:3556 [inline]
> WARNING: CPU: 0 PID: 26202 at kernel/locking/lockdep.c:3556 
> lock_downgrade+0x4d7/0x900 kernel/locking/lockdep.c:3819
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 0 PID: 26202 Comm: blkid Not tainted 4.20.0-rc3+ #127
> Hardware name: Google Google Compute Engine/Google Compute Engine, 
> BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>  panic+0x2ad/0x55c kernel/panic.c:188
>  __warn.cold.8+0x20/0x45 kernel/panic.c:540
>  report_bug+0x254/0x2d0 lib/bug.c:186
>  fixup_bug arch/x86/kernel/traps.c:178 [inline]
>  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
>  do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
>  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
> RIP: 0010:__lock_downgrade kernel/locking/lockdep.c:3556 [inline]
> RIP: 0010:lock_downgrade+0x4d7/0x900 kernel/locking/lockdep.c:3819
> Code: 00 00 fc ff df 41 c6 44 05 00 f8 e9 1b ff ff ff 48 c7 c7 60 68 
> 2b 88 4c 89 9d 58 ff ff ff 48 89 85 60 ff ff ff e8 d9 1f e7 ff <0f> 0b 
> 48 8b 85 60 ff ff ff 4c 8d 4d d8 4c 89 e9 48 ba 00 00 00 00
> RSP: 0018:ffff8881ba547b70 EFLAGS: 00010086
> RAX: 0000000000000000 RBX: 1ffff110374a8f74 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff8165eaf5 RDI: 0000000000000006
> RBP: ffff8881ba547c28 R08: ffff88817c0b2400 R09: fffffbfff12b2254
> R10: fffffbfff12b2254 R11: ffffffff895912a3 R12: ffffffff8b0f67a0
> R13: ffff8881ba547bc0 R14: 0000000000000001 R15: ffff88817c0b2400
>  downgrade_write+0x76/0x270 kernel/locking/rwsem.c:147
>  __do_munmap+0xcd8/0xf80 mm/mmap.c:2812
>  __vm_munmap+0x138/0x1f0 mm/mmap.c:2837
>  __do_sys_munmap mm/mmap.c:2862 [inline]
>  __se_sys_munmap mm/mmap.c:2859 [inline]
>  __x64_sys_munmap+0x65/0x80 mm/mmap.c:2859
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x7f237e5ce417
> Code: f0 ff ff 73 01 c3 48 8d 0d 8a ad 20 00 31 d2 48 29 c2 89 11 48 
> 83 c8 ff eb eb 90 90 90 90 90 90 90 90 90 b8 0b 00 00 00 0f 05 <48> 3d 
> 01 f0 ff ff 73 01 c3 48 8d 0d 5d ad 20 00 31 d2 48 29 c2 89
> RSP: 002b:00007ffe61bed9d8 EFLAGS: 00000203 ORIG_RAX: 000000000000000b
> RAX: ffffffffffffffda RBX: 00007f237e7d91c8 RCX: 00007f237e5ce417
> RDX: 000000000224ff00 RSI: 00000000000033ef RDI: 00007f237e7d1000
> RBP: 00007ffe61bedb40 R08: 0000000000000001 R09: 0000000000000007
> R10: 00007f237e5c8a0b R11: 0000000000000203 R12: 00000000b1f97d00
> R13: 000004b2b1f97d00 R14: 000004b2afd5fdeb R15: 00007f237e7ce740
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with 
> syzbot.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING: locking bug in lock_downgrade
  2018-12-13  1:14 ` Yang Shi
@ 2018-12-13 19:46   ` Waiman Long
  2019-01-09 14:18     ` Tetsuo Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Waiman Long @ 2018-12-13 19:46 UTC (permalink / raw)
  To: Yang Shi, syzbot, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, vbabka, peterz, mingo,
	boqun.feng

On 12/12/2018 08:14 PM, Yang Shi wrote:
> Cc'ed Peter, Ingo and Waiman.
>
>
> It took me a few days to look into this warning, but I got lost in
> lockdep code.
>
>
> The problem is the commit dd2283f2605e ("mm: mmap: zap pages with read
> mmap_sem in munmap") does an optimization for munmap by downgrading
> write mmap_sem to read before zapping pages. But, lockdep reports
> downgrading a read lock.
>
>
> I'm pretty sure mmap_sem is held as write before downgrade_write() is
> called in the patch. And, there are 4 places which may downgrade a
> mmap_sem:
>
>     - munmap
>
>     - mremap
>
>     - brk
>
>     - clear_refs_write (fs/proc/task_mmu.c)
>
>
> The first three come from my patches, and they just do:
> down_write_killable() -> .. -> downgrade_write().
>
> But the last one is a little bit more complicated, it does down_read()
> ->.. -> up_read() ->.. -> down_write_killable() ->.. ->
> downgrade_write().
>
> And, the last one may be called from any process to touch the other
> processes' mmap_sem.
>
>
> By looking into lockdep code, I'm not sure if lockdep may get confused
> by such sequence or not?
>
>
> Any hint is appreciated.
>
>
> Regards,
>
> Yang 

The warning was printed because hlock->read was set when doing the
downgrade_write(). So it is either downgrade_write() was called a second
time or a read lock was held originally. It is hard to tell what is the
root cause without a reproducer.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING: locking bug in lock_downgrade
  2018-12-13 19:46   ` Waiman Long
@ 2019-01-09 14:18     ` Tetsuo Handa
  2019-01-09 22:46       ` Waiman Long
  0 siblings, 1 reply; 6+ messages in thread
From: Tetsuo Handa @ 2019-01-09 14:18 UTC (permalink / raw)
  To: peterz, mingo
  Cc: Waiman Long, Yang Shi, syzbot, akpm, kirill.shutemov,
	linux-kernel, linux-mm, linux, mhocko, rientjes, syzkaller-bugs,
	vbabka, boqun.feng

On 2018/12/14 4:46, Waiman Long wrote:
> On 12/12/2018 08:14 PM, Yang Shi wrote:
>> By looking into lockdep code, I'm not sure if lockdep may get confused
>> by such sequence or not?
>>
>>
>> Any hint is appreciated.
>>
>>
>> Regards,
>>
>> Yang 
> 
> The warning was printed because hlock->read was set when doing the
> downgrade_write(). So it is either downgrade_write() was called a second
> time or a read lock was held originally. It is hard to tell what is the
> root cause without a reproducer.
> 
> Cheers,
> Longman
> 

Comparing with output from

        struct rw_semaphore *sem = &current->mm->mmap_sem;

        down_write(sem);
        pr_warn("mmap_sem: count=%ld current=%px, owner=%px\n", atomic_long_read(&sem->count), current, READ_ONCE(sem->owner));
        /* mmap_sem: count=-4294967295 current=ffff88813095ca80, owner=ffff88813095ca80 */
        downgrade_write(sem);
        pr_warn("mmap_sem: count=%ld current=%px, owner=%px\n", atomic_long_read(&sem->count), current, READ_ONCE(sem->owner));
        /* mmap_sem: count=1 current=ffff88813095ca80, owner=ffff88813095ca83 */
        up_read(sem);
        pr_warn("mmap_sem: count=%ld current=%px, owner=%px\n", atomic_long_read(&sem->count), current, READ_ONCE(sem->owner));
        /* mmap_sem: count=0 current=ffff88813095ca80, owner=0000000000000003 */

what we got with debug printk() patch

  https://syzkaller.appspot.com/text?tag=CrashLog&x=169dbb9b400000

  [ 2580.337550][ T3645] mmap_sem: hlock->read=1 count=-4294967295 current=ffff888050e04140, owner=ffff888050e04140
  [ 2580.353526][ T3645] ------------[ cut here ]------------
  [ 2580.367859][ T3645] downgrading a read lock
  [ 2580.367935][ T3645] WARNING: CPU: 1 PID: 3645 at kernel/locking/lockdep.c:3572 lock_downgrade+0x35d/0xbe0
  [ 2580.382206][ T3645] Kernel panic - not syncing: panic_on_warn set ...

  https://syzkaller.appspot.com/text?tag=CrashLog&x=1542da4f400000

  [  386.342585][T16698] mmap_sem: hlock->read=1 count=-4294967295 current=ffff8880512ae180, owner=ffff8880512ae180
  [  386.348586][T16698] ------------[ cut here ]------------
  [  386.357203][T16698] downgrading a read lock
  [  386.357294][T16698] WARNING: CPU: 1 PID: 16698 at kernel/locking/lockdep.c:3572 lock_downgrade+0x35d/0xbe0
  [  386.372148][T16698] Kernel panic - not syncing: panic_on_warn set ...

indicates that lockdep is saying that "current->mm->mmap_sem is held for read"
while "struct rw_semaphore" is saying that "current->mm->mmap_sem is held for write".
Something made lockdep confused. Possibly a lockdep bug.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING: locking bug in lock_downgrade
  2019-01-09 14:18     ` Tetsuo Handa
@ 2019-01-09 22:46       ` Waiman Long
  2019-01-10  2:45         ` Tetsuo Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Waiman Long @ 2019-01-09 22:46 UTC (permalink / raw)
  To: Tetsuo Handa, peterz, mingo
  Cc: Yang Shi, syzbot, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, vbabka, boqun.feng

On 01/09/2019 09:18 AM, Tetsuo Handa wrote:
> On 2018/12/14 4:46, Waiman Long wrote:
>> On 12/12/2018 08:14 PM, Yang Shi wrote:
>>> By looking into lockdep code, I'm not sure if lockdep may get confused
>>> by such sequence or not?
>>>
>>>
>>> Any hint is appreciated.
>>>
>>>
>>> Regards,
>>>
>>> Yang 
>> The warning was printed because hlock->read was set when doing the
>> downgrade_write(). So it is either downgrade_write() was called a second
>> time or a read lock was held originally. It is hard to tell what is the
>> root cause without a reproducer.
>>
>> Cheers,
>> Longman
>>
> Comparing with output from
>
>         struct rw_semaphore *sem = &current->mm->mmap_sem;
>
>         down_write(sem);
>         pr_warn("mmap_sem: count=%ld current=%px, owner=%px\n", atomic_long_read(&sem->count), current, READ_ONCE(sem->owner));
>         /* mmap_sem: count=-4294967295 current=ffff88813095ca80, owner=ffff88813095ca80 */
>         downgrade_write(sem);
>         pr_warn("mmap_sem: count=%ld current=%px, owner=%px\n", atomic_long_read(&sem->count), current, READ_ONCE(sem->owner));
>         /* mmap_sem: count=1 current=ffff88813095ca80, owner=ffff88813095ca83 */
>         up_read(sem);
>         pr_warn("mmap_sem: count=%ld current=%px, owner=%px\n", atomic_long_read(&sem->count), current, READ_ONCE(sem->owner));
>         /* mmap_sem: count=0 current=ffff88813095ca80, owner=0000000000000003 */

The behavior is correct. The current code will leave the reader task
structure pointer in owner even if it is a read lock. You have to look
at bit 0 to know if the owner is a reader or writer.

> what we got with debug printk() patch
>
>   https://syzkaller.appspot.com/text?tag=CrashLog&x=169dbb9b400000
>
>   [ 2580.337550][ T3645] mmap_sem: hlock->read=1 count=-4294967295 current=ffff888050e04140, owner=ffff888050e04140
>   [ 2580.353526][ T3645] ------------[ cut here ]------------
>   [ 2580.367859][ T3645] downgrading a read lock
>   [ 2580.367935][ T3645] WARNING: CPU: 1 PID: 3645 at kernel/locking/lockdep.c:3572 lock_downgrade+0x35d/0xbe0
>   [ 2580.382206][ T3645] Kernel panic - not syncing: panic_on_warn set ...
>
>   https://syzkaller.appspot.com/text?tag=CrashLog&x=1542da4f400000
>
>   [  386.342585][T16698] mmap_sem: hlock->read=1 count=-4294967295 current=ffff8880512ae180, owner=ffff8880512ae180
>   [  386.348586][T16698] ------------[ cut here ]------------
>   [  386.357203][T16698] downgrading a read lock
>   [  386.357294][T16698] WARNING: CPU: 1 PID: 16698 at kernel/locking/lockdep.c:3572 lock_downgrade+0x35d/0xbe0
>   [  386.372148][T16698] Kernel panic - not syncing: panic_on_warn set ...
>
> indicates that lockdep is saying that "current->mm->mmap_sem is held for read"
> while "struct rw_semaphore" is saying that "current->mm->mmap_sem is held for write".
> Something made lockdep confused. Possibly a lockdep bug.
>
It could be a bug in lockdep regarding downgrade. Someone else has
reported similar problem before.

-Longman


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: WARNING: locking bug in lock_downgrade
  2019-01-09 22:46       ` Waiman Long
@ 2019-01-10  2:45         ` Tetsuo Handa
  0 siblings, 0 replies; 6+ messages in thread
From: Tetsuo Handa @ 2019-01-10  2:45 UTC (permalink / raw)
  To: Waiman Long
  Cc: Tetsuo Handa, peterz, mingo, Yang Shi, syzbot, akpm,
	kirill.shutemov, linux-kernel, linux-mm, linux, mhocko, rientjes,
	syzkaller-bugs, vbabka, boqun.feng

OK. I noticed that when downgrade_write() warning is printed, there is always
another lockdep warning in progress at the same moment. And I confirmed that
downgrade_write() is confused by __debug_locks_off() called by concurrently
triggered lockdep warning. In other words, since another lockdep warning calls
__debug_locks_off(), downgrade_write() is making decisions based on "no longer
maintained dependency".

----------------------------------------
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/rwsem.h>

static int __init test_init(void)
{
        struct rw_semaphore *sem = &current->mm->mmap_sem;
        static DECLARE_RWSEM(test_lock);

        down_read(sem);
        down_read(&test_lock);
        down_read(&test_lock);
        up_read(&test_lock);
        up_read(&test_lock);
        up_read(sem);
        return -EINVAL;
}

module_init(test_init);
MODULE_LICENSE("GPL");
----------------------------------------

----------------------------------------
[   40.822909][ T9174] test: loading out-of-tree module taints kernel.
[   40.827870][ T9174] 
[   40.828504][ T9174] ============================================
[   40.830253][ T9174] WARNING: possible recursive locking detected
[   40.831922][ T9174] 5.0.0-rc1-next-20190109 #269 Tainted: G           O     
[   40.833832][ T9174] --------------------------------------------
[   40.835461][ T9174] insmod/9174 is trying to acquire lock:
[   40.836871][ T9174] 000000007dbea5ef (test_lock){.+.+}, at: test_init+0x3c/0x1000 [test]
[   40.838936][ T9174] 
[   40.838936][ T9174] but task is already holding lock:
[   40.840827][ T9174] 000000007dbea5ef (test_lock){.+.+}, at: test_init+0x30/0x1000 [test]
[   40.842962][ T9174] 
[   40.842962][ T9174] other info that might help us debug this:
[   40.844979][ T9174]  Possible unsafe locking scenario:
[   40.844979][ T9174] 
[   40.847031][ T9174]        CPU0
[   40.847923][ T9174]        ----
[   40.848741][ T9174]   lock(test_lock);
[   40.849786][ T9174]   lock(test_lock);
[   40.850860][ T9174] 
[   40.850860][ T9174]  *** DEADLOCK ***
[   40.850860][ T9174] 
[   40.852892][ T9174]  May be due to missing lock nesting notation
[   40.852892][ T9174] 
[   40.855000][ T9174] 2 locks held by insmod/9174:
[   40.856187][ T9174]  #0: 000000008f7a8ad1 (&mm->mmap_sem){++++}, at: test_init+0x24/0x1000 [test]
[   40.858441][ T9174]  #1: 000000007dbea5ef (test_lock){.+.+}, at: test_init+0x30/0x1000 [test]
[   40.860784][ T9174] 
[   40.860784][ T9174] stack backtrace:
[   40.862253][ T9174] CPU: 1 PID: 9174 Comm: insmod Kdump: loaded Tainted: G           O      5.0.0-rc1-next-20190109 #269
[   40.865225][ T9174] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   40.868304][ T9174] Call Trace:
[   40.869127][ T9174]  dump_stack+0x67/0x95
[   40.870274][ T9174]  __lock_acquire+0x1073/0x1260
[   40.871555][ T9174]  lock_acquire+0xd3/0x210
[   40.872657][ T9174]  ? test_init+0x3c/0x1000 [test]
[   40.873912][ T9174]  down_read+0x47/0xa0
[   40.874927][ T9174]  ? test_init+0x3c/0x1000 [test]
[   40.876222][ T9174]  test_init+0x3c/0x1000 [test]
[   40.877432][ T9174]  ? 0xffffffffc03ff000
[   40.878854][ T9174]  do_one_initcall+0x72/0x395
[   40.880625][ T9174]  ? rcu_read_lock_sched_held+0x8c/0xa0
[   40.882965][ T9174]  ? kmem_cache_alloc_trace+0x2b5/0x360
[   40.884734][ T9174]  do_init_module+0x5b/0x1ef
[   40.886265][ T9174]  load_module+0x17ed/0x1d10
[   40.887800][ T9174]  ? __symbol_put+0x90/0x90
[   40.889315][ T9174]  ? kernel_read+0x2c/0x40
[   40.890976][ T9174]  __do_sys_finit_module+0xa9/0x100
[   40.892648][ T9174]  __x64_sys_finit_module+0x15/0x20
[   40.894315][ T9174]  do_syscall_64+0x4a/0x180
[   40.895848][ T9174]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   40.897756][ T9174] RIP: 0033:0x7f2ccd0c11c9
[   40.899229][ T9174] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 dc 2c 00 f7 d8 64 89 01 48
[   40.906285][ T9174] RSP: 002b:00007fff3f992348 EFLAGS: 00000206 ORIG_RAX: 0000000000000139
[   40.909468][ T9174] RAX: ffffffffffffffda RBX: 0000000000bcb1f0 RCX: 00007f2ccd0c11c9
[   40.912444][ T9174] RDX: 0000000000000000 RSI: 000000000041a94e RDI: 0000000000000003
[   40.915459][ T9174] RBP: 000000000041a94e R08: 0000000000000000 R09: 00007fff3f9924e8
[   40.917978][ T9174] R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000000000
[   40.920434][ T9174] R13: 0000000000bca130 R14: 0000000000000000 R15: 0000000000000000
[   41.305025][ T9174] mmap_sem: hlock->read=1 count=-4294967295 current=ffffa3f06d423840, owner=ffffa3f06d423840
[   41.308093][ T9174] ------------[ cut here ]------------
[   41.309878][ T9174] downgrading a read lock
[   41.309884][ T9174] WARNING: CPU: 1 PID: 9174 at kernel/locking/lockdep.c:3572 lock_downgrade+0xe5/0x240
[   41.314382][ T9174] Modules linked in: vmw_balloon pcspkr sg vmw_vmci i2c_piix4 ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables xfs libcrc32c sd_mod sr_mod cdrom serio_raw ata_generic pata_acpi vmwgfx drm_kms_helper ahci syscopyarea libahci sysfillrect sysimgblt e1000 fb_sys_fops mptspi ttm scsi_transport_spi mptscsih drm mptbase i2c_core ata_piix libata
[   41.332426][ T9174] CPU: 1 PID: 9174 Comm: insmod Kdump: loaded Tainted: G           O      5.0.0-rc1-next-20190109 #269
[   41.335600][ T9174] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   41.338867][ T9174] RIP: 0010:lock_downgrade+0xe5/0x240
[   41.340989][ T9174] Code: 48 85 c0 74 13 49 8d 56 90 48 05 b8 00 00 00 48 39 c2 0f 84 35 01 00 00 48 c7 c7 36 70 df 9e 31 c0 4c 89 4d c0 e8 cb c7 f9 ff <0f> 0b 8b 55 cc 4c 8b 4d c0 41 0f b6 41 32 4d 89 79 08 44 89 e6 48
[   41.346977][ T9174] RSP: 0018:ffffbbbf007d7e00 EFLAGS: 00010086
[   41.349340][ T9174] RAX: 0000000000000000 RBX: ffffa3f06d423840 RCX: ffffffffa0049040
[   41.351999][ T9174] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffff9e0ddd22
[   41.354449][ T9174] RBP: ffffbbbf007d7e40 R08: 0000000000000000 R09: 6572206120676e69
[   41.357085][ T9174] R10: 72206120676e6964 R11: 0000000000000001 R12: 0000000000000002
[   41.359517][ T9174] R13: 0000000000000246 R14: ffffa3f0737f0968 R15: ffffffff9e1e9008
[   41.362154][ T9174] FS:  00007f2ccdbee740(0000) GS:ffffa3f07a600000(0000) knlGS:0000000000000000
[   41.365009][ T9174] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   41.367247][ T9174] CR2: 00007f2ccd122250 CR3: 0000000131f6e004 CR4: 00000000003606e0
[   41.369726][ T9174] Call Trace:
[   41.371284][ T9174]  downgrade_write+0x15/0x90
[   41.372939][ T9174]  __do_munmap+0x378/0x3c0
[   41.374557][ T9174]  __vm_munmap+0x76/0xd0
[   41.376401][ T9174]  __x64_sys_munmap+0x27/0x30
[   41.378079][ T9174]  do_syscall_64+0x4a/0x180
[   41.379711][ T9174]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   41.381940][ T9174] RIP: 0033:0x7f2ccd0c13f7
[   41.383564][ T9174] Code: 64 89 02 48 83 c8 ff eb 9c 48 8b 15 93 da 2c 00 f7 d8 64 89 02 e9 6a ff ff ff 66 0f 1f 84 00 00 00 00 00 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 69 da 2c 00 f7 d8 64 89 01 48
[   41.389435][ T9174] RSP: 002b:00007fff3f992378 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
[   41.392384][ T9174] RAX: ffffffffffffffda RBX: 0000000000bcb300 RCX: 00007f2ccd0c13f7
[   41.394929][ T9174] RDX: 0000000000000000 RSI: 00000000000011c8 RDI: 00007f2ccdc06000
[   41.397432][ T9174] RBP: 0000000000000000 R08: 000000000040acd0 R09: 0000000000000000
[   41.400105][ T9174] R10: 00007fff3f991da0 R11: 0000000000000206 R12: 0000000000bca010
[   41.402693][ T9174] R13: 00007fff3f9924e0 R14: 0000000000000000 R15: 0000000000000000
[   41.405218][ T9174] irq event stamp: 4607
[   41.409414][ T9174] hardirqs last  enabled at (4607): [<ffffffff9e20ec47>] __slab_alloc.isra.84.constprop.90+0x5e/0x77
[   41.412696][ T9174] hardirqs last disabled at (4606): [<ffffffff9e20ec08>] __slab_alloc.isra.84.constprop.90+0x1f/0x77
[   41.416069][ T9174] softirqs last  enabled at (4528): [<ffffffff9e5fee8f>] peernet2id+0x4f/0x80
[   41.418737][ T9174] softirqs last disabled at (4526): [<ffffffff9e5fee71>] peernet2id+0x31/0x80
[   41.421542][ T9174] ---[ end trace 6db2652247cd6a11 ]---
----------------------------------------

If I do

 static inline int __debug_locks_off(void)
 {
-	return xchg(&debug_locks, 0);
+	return 1;
 }

then downgrade_write() warning is no longer printed by the reproducer (though
other lockdep warnings are printed because dependency is no longer correct).

----------------------------------------
[   34.554580][ T9173] test: loading out-of-tree module taints kernel.
[   34.559029][ T9173] 
[   34.559656][ T9173] ============================================
[   34.561416][ T9173] WARNING: possible recursive locking detected
[   34.563008][ T9173] 5.0.0-rc1-next-20190109+ #270 Tainted: G           O     
[   34.565045][ T9173] --------------------------------------------
[   34.566619][ T9173] insmod/9173 is trying to acquire lock:
[   34.568127][ T9173] 00000000cf012899 (test_lock){.+.+}, at: test_init+0x3c/0x1000 [test]
[   34.570594][ T9173] 
[   34.570594][ T9173] but task is already holding lock:
[   34.573474][ T9173] 00000000cf012899 (test_lock){.+.+}, at: test_init+0x30/0x1000 [test]
[   34.575650][ T9173] 
[   34.575650][ T9173] other info that might help us debug this:
[   34.577843][ T9173]  Possible unsafe locking scenario:
[   34.577843][ T9173] 
[   34.579813][ T9173]        CPU0
[   34.580707][ T9173]        ----
[   34.581576][ T9173]   lock(test_lock);
[   34.582604][ T9173]   lock(test_lock);
[   34.583630][ T9173] 
[   34.583630][ T9173]  *** DEADLOCK ***
[   34.583630][ T9173] 
[   34.585840][ T9173]  May be due to missing lock nesting notation
[   34.585840][ T9173] 
[   34.588127][ T9173] 2 locks held by insmod/9173:
[   34.589406][ T9173]  #0: 00000000d68054a9 (&mm->mmap_sem){++++}, at: test_init+0x24/0x1000 [test]
[   34.591822][ T9173]  #1: 00000000cf012899 (test_lock){.+.+}, at: test_init+0x30/0x1000 [test]
[   34.594205][ T9173] 
[   34.594205][ T9173] stack backtrace:
[   34.595789][ T9173] CPU: 1 PID: 9173 Comm: insmod Kdump: loaded Tainted: G           O      5.0.0-rc1-next-20190109+ #270
[   34.598773][ T9173] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   34.602109][ T9173] Call Trace:
[   34.603046][ T9173]  dump_stack+0x67/0x95
[   34.604783][ T9173]  __lock_acquire+0x1073/0x1260
[   34.606529][ T9173]  lock_acquire+0xd3/0x210
[   34.608203][ T9173]  ? test_init+0x3c/0x1000 [test]
[   34.609972][ T9173]  down_read+0x47/0xa0
[   34.611451][ T9173]  ? test_init+0x3c/0x1000 [test]
[   34.613228][ T9173]  test_init+0x3c/0x1000 [test]
[   34.614849][ T9173]  ? 0xffffffffc01d0000
[   34.616458][ T9173]  do_one_initcall+0x72/0x395
[   34.618000][ T9173]  ? rcu_read_lock_sched_held+0x8c/0xa0
[   34.619922][ T9173]  ? kmem_cache_alloc_trace+0x2b5/0x360
[   34.621843][ T9173]  do_init_module+0x5b/0x1ef
[   34.623527][ T9173]  load_module+0x17ed/0x1d10
[   34.625295][ T9173]  ? __symbol_put+0x90/0x90
[   34.627035][ T9173]  ? kernel_read+0x2c/0x40
[   34.628717][ T9173]  __do_sys_finit_module+0xa9/0x100
[   34.630384][ T9173]  __x64_sys_finit_module+0x15/0x20
[   34.632125][ T9173]  do_syscall_64+0x4a/0x180
[   34.633612][ T9173]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   34.635548][ T9173] RIP: 0033:0x7f5b50e1f1c9
[   34.637082][ T9173] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 dc 2c 00 f7 d8 64 89 01 48
[   34.643337][ T9173] RSP: 002b:00007ffcee219f68 EFLAGS: 00000202 ORIG_RAX: 0000000000000139
[   34.645825][ T9173] RAX: ffffffffffffffda RBX: 00000000011631f0 RCX: 00007f5b50e1f1c9
[   34.648340][ T9173] RDX: 0000000000000000 RSI: 000000000041a94e RDI: 0000000000000003
[   34.650841][ T9173] RBP: 000000000041a94e R08: 0000000000000000 R09: 00007ffcee21a108
[   34.653287][ T9173] R10: 0000000000000003 R11: 0000000000000202 R12: 0000000000000000
[   34.655978][ T9173] R13: 0000000001162130 R14: 0000000000000000 R15: 0000000000000000
[   34.659756][ T9173] 
[   34.661520][ T9173] =====================================
[   34.663790][ T9173] WARNING: bad unlock balance detected!
[   34.665636][ T9173] 5.0.0-rc1-next-20190109+ #270 Tainted: G           O     
[   34.667902][ T9173] -------------------------------------
[   34.669595][ T9173] insmod/9173 is trying to release lock (test_lock) at:
[   34.672164][ T9173] [<ffffffffc01d0054>] test_init+0x54/0x1000 [test]
[   34.674384][ T9173] but there are no more locks to release!
[   34.676195][ T9173] 
[   34.676195][ T9173] other info that might help us debug this:
[   34.679053][ T9173] 1 lock held by insmod/9173:
[   34.680569][ T9173]  #0: 00000000d68054a9 (&mm->mmap_sem){++++}, at: test_init+0x24/0x1000 [test]
[   34.683362][ T9173] 
[   34.683362][ T9173] stack backtrace:
[   34.685567][ T9173] CPU: 1 PID: 9173 Comm: insmod Kdump: loaded Tainted: G           O      5.0.0-rc1-next-20190109+ #270
[   34.689164][ T9173] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   34.692636][ T9173] Call Trace:
[   34.694044][ T9173]  dump_stack+0x67/0x95
[   34.695444][ T9173]  ? test_init+0x54/0x1000 [test]
[   34.697031][ T9173]  print_unlock_imbalance_bug+0xec/0x100
[   34.698899][ T9173]  ? test_init+0x54/0x1000 [test]
[   34.700489][ T9173]  lock_release+0x230/0x4c0
[   34.702119][ T9173]  up_read+0x1a/0xa0
[   34.703521][ T9173]  test_init+0x54/0x1000 [test]
[   34.705069][ T9173]  ? 0xffffffffc01d0000
[   34.706727][ T9173]  do_one_initcall+0x72/0x395
[   34.708233][ T9173]  ? rcu_read_lock_sched_held+0x8c/0xa0
[   34.710167][ T9173]  ? kmem_cache_alloc_trace+0x2b5/0x360
[   34.711885][ T9173]  do_init_module+0x5b/0x1ef
[   34.713538][ T9173]  load_module+0x17ed/0x1d10
[   34.715027][ T9173]  ? __symbol_put+0x90/0x90
[   34.716494][ T9173]  ? kernel_read+0x2c/0x40
[   34.718226][ T9173]  __do_sys_finit_module+0xa9/0x100
[   34.719924][ T9173]  __x64_sys_finit_module+0x15/0x20
[   34.721734][ T9173]  do_syscall_64+0x4a/0x180
[   34.723199][ T9173]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   34.725136][ T9173] RIP: 0033:0x7f5b50e1f1c9
[   34.726770][ T9173] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 dc 2c 00 f7 d8 64 89 01 48
[   34.732837][ T9173] RSP: 002b:00007ffcee219f68 EFLAGS: 00000202 ORIG_RAX: 0000000000000139
[   34.738479][ T9173] RAX: ffffffffffffffda RBX: 00000000011631f0 RCX: 00007f5b50e1f1c9
[   34.741078][ T9173] RDX: 0000000000000000 RSI: 000000000041a94e RDI: 0000000000000003
[   34.743495][ T9173] RBP: 000000000041a94e R08: 0000000000000000 R09: 00007ffcee21a108
[   34.746043][ T9173] R10: 0000000000000003 R11: 0000000000000202 R12: 0000000000000000
[   34.748450][ T9173] R13: 0000000001162130 R14: 0000000000000000 R15: 0000000000000000
----------------------------------------

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-01-10  2:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-26 20:38 WARNING: locking bug in lock_downgrade syzbot
2018-12-13  1:14 ` Yang Shi
2018-12-13 19:46   ` Waiman Long
2019-01-09 14:18     ` Tetsuo Handa
2019-01-09 22:46       ` Waiman Long
2019-01-10  2:45         ` Tetsuo Handa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.