linux-kernel.vger.kernel.org archive mirror
* possible deadlock in __wake_up_common_lock
@ 2019-01-02  8:51 syzbot
From: syzbot @ 2019-01-02  8:51 UTC (permalink / raw)
  To: aarcange, akpm, kirill.shutemov, linux-kernel, linux-mm, linux,
	mhocko, rientjes, syzkaller-bugs, vbabka, xieyisheng1,
	zhongjiang

Hello,

syzbot found the following crash on:

HEAD commit:    f346b0becb1b Merge branch 'akpm' (patches from Andrew)
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1510cefd400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c255c77ba370fe7c
dashboard link: https://syzkaller.appspot.com/bug?extid=93d94a001cfbce9e60e1
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
userspace arch: i386

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+93d94a001cfbce9e60e1@syzkaller.appspotmail.com


======================================================
WARNING: possible circular locking dependency detected
4.20.0+ #297 Not tainted
------------------------------------------------------
syz-executor0/8529 is trying to acquire lock:
000000005e7fb829 (&pgdat->kswapd_wait){....}, at: __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120

but task is already holding lock:
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:329 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk mm/page_alloc.c:2548 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist mm/page_alloc.c:3021 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist mm/page_alloc.c:3050 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue mm/page_alloc.c:3072 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (&(&zone->lock)->rlock){-.-.}:
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
        rmqueue mm/page_alloc.c:3082 [inline]
        get_page_from_freelist+0x9eb/0x52a0 mm/page_alloc.c:3491
        __alloc_pages_nodemask+0x4f3/0xde0 mm/page_alloc.c:4529
        __alloc_pages include/linux/gfp.h:473 [inline]
        alloc_page_interleave+0x25/0x1c0 mm/mempolicy.c:1988
        alloc_pages_current+0x1bf/0x210 mm/mempolicy.c:2104
        alloc_pages include/linux/gfp.h:509 [inline]
        depot_save_stack+0x3f1/0x470 lib/stackdepot.c:260
        save_stack+0xa9/0xd0 mm/kasan/common.c:79
        set_track mm/kasan/common.c:85 [inline]
        kasan_kmalloc+0xcb/0xd0 mm/kasan/common.c:482
        kasan_slab_alloc+0x12/0x20 mm/kasan/common.c:397
        kmem_cache_alloc+0x130/0x730 mm/slab.c:3541
        kmem_cache_zalloc include/linux/slab.h:731 [inline]
        fill_pool lib/debugobjects.c:134 [inline]
        __debug_object_init+0xbb8/0x1290 lib/debugobjects.c:379
        debug_object_init lib/debugobjects.c:431 [inline]
        debug_object_activate+0x323/0x600 lib/debugobjects.c:512
        debug_timer_activate kernel/time/timer.c:708 [inline]
        debug_activate kernel/time/timer.c:763 [inline]
        __mod_timer kernel/time/timer.c:1040 [inline]
        mod_timer kernel/time/timer.c:1101 [inline]
        add_timer+0x50e/0x1490 kernel/time/timer.c:1137
        __queue_delayed_work+0x249/0x380 kernel/workqueue.c:1533
        queue_delayed_work_on+0x1a2/0x1f0 kernel/workqueue.c:1558
        queue_delayed_work include/linux/workqueue.h:527 [inline]
        schedule_delayed_work include/linux/workqueue.h:628 [inline]
        start_dirtytime_writeback+0x4e/0x53 fs/fs-writeback.c:2043
        do_one_initcall+0x145/0x957 init/main.c:889
        do_initcall_level init/main.c:957 [inline]
        do_initcalls init/main.c:965 [inline]
        do_basic_setup init/main.c:983 [inline]
        kernel_init_freeable+0x4c1/0x5af init/main.c:1136
        kernel_init+0x11/0x1ae init/main.c:1056
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352

-> #3 (&base->lock){-.-.}:
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
        lock_timer_base+0xbb/0x2b0 kernel/time/timer.c:937
        __mod_timer kernel/time/timer.c:1009 [inline]
        mod_timer kernel/time/timer.c:1101 [inline]
        add_timer+0x895/0x1490 kernel/time/timer.c:1137
        __queue_delayed_work+0x249/0x380 kernel/workqueue.c:1533
        queue_delayed_work_on+0x1a2/0x1f0 kernel/workqueue.c:1558
        queue_delayed_work include/linux/workqueue.h:527 [inline]
        schedule_delayed_work include/linux/workqueue.h:628 [inline]
        psi_group_change kernel/sched/psi.c:485 [inline]
        psi_task_change+0x3f1/0x5f0 kernel/sched/psi.c:534
        psi_enqueue kernel/sched/stats.h:82 [inline]
        enqueue_task kernel/sched/core.c:727 [inline]
        activate_task+0x21a/0x430 kernel/sched/core.c:751
        wake_up_new_task+0x527/0xd20 kernel/sched/core.c:2423
        _do_fork+0x33b/0x11d0 kernel/fork.c:2247
        kernel_thread+0x34/0x40 kernel/fork.c:2281
        rest_init+0x28/0x372 init/main.c:409
        arch_call_rest_init+0xe/0x1b
        start_kernel+0x873/0x8ae init/main.c:741
        x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:470
        x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
        secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243

-> #2 (&rq->lock){-.-.}:
        __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
        _raw_spin_lock+0x2d/0x40 kernel/locking/spinlock.c:144
        rq_lock kernel/sched/sched.h:1149 [inline]
        task_fork_fair+0xb0/0x6d0 kernel/sched/fair.c:10083
        sched_fork+0x443/0xba0 kernel/sched/core.c:2359
        copy_process+0x25b9/0x8790 kernel/fork.c:1893
        _do_fork+0x1cb/0x11d0 kernel/fork.c:2222
        kernel_thread+0x34/0x40 kernel/fork.c:2281
        rest_init+0x28/0x372 init/main.c:409
        arch_call_rest_init+0xe/0x1b
        start_kernel+0x873/0x8ae init/main.c:741
        x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:470
        x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
        secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243

-> #1 (&p->pi_lock){-.-.}:
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
        try_to_wake_up+0xdc/0x1460 kernel/sched/core.c:1965
        default_wake_function+0x30/0x50 kernel/sched/core.c:3710
        autoremove_wake_function+0x80/0x370 kernel/sched/wait.c:375
        __wake_up_common+0x1d7/0x7d0 kernel/sched/wait.c:92
        __wake_up_common_lock+0x1c2/0x330 kernel/sched/wait.c:121
        __wake_up+0xe/0x10 kernel/sched/wait.c:145
        wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
        wake_all_kswapds+0x150/0x300 mm/page_alloc.c:3975
        __alloc_pages_slowpath+0x1ff1/0x2db0 mm/page_alloc.c:4246
        __alloc_pages_nodemask+0xa89/0xde0 mm/page_alloc.c:4549
        alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2106
        alloc_pages include/linux/gfp.h:509 [inline]
        __get_free_pages+0xc/0x40 mm/page_alloc.c:4573
        pte_alloc_one_kernel+0x15/0x20 arch/x86/mm/pgtable.c:28
        __pte_alloc_kernel+0x23/0x220 mm/memory.c:439
        vmap_pte_range mm/vmalloc.c:144 [inline]
        vmap_pmd_range mm/vmalloc.c:171 [inline]
        vmap_pud_range mm/vmalloc.c:188 [inline]
        vmap_p4d_range mm/vmalloc.c:205 [inline]
        vmap_page_range_noflush+0x878/0xa80 mm/vmalloc.c:230
        vmap_page_range mm/vmalloc.c:243 [inline]
        vm_map_ram+0x46c/0xf60 mm/vmalloc.c:1181
        ion_heap_clear_pages+0x2a/0x70 drivers/staging/android/ion/ion_heap.c:100
        ion_heap_sglist_zero+0x24f/0x2d0 drivers/staging/android/ion/ion_heap.c:121
        ion_heap_buffer_zero+0xf8/0x150 drivers/staging/android/ion/ion_heap.c:143
        ion_system_heap_free+0x227/0x290 drivers/staging/android/ion/ion_system_heap.c:163
        ion_buffer_destroy+0x15c/0x1c0 drivers/staging/android/ion/ion.c:119
        _ion_heap_freelist_drain+0x43e/0x6a0 drivers/staging/android/ion/ion_heap.c:199
        ion_heap_freelist_drain+0x1f/0x30 drivers/staging/android/ion/ion_heap.c:209
        ion_buffer_create drivers/staging/android/ion/ion.c:86 [inline]
        ion_alloc+0x487/0xa60 drivers/staging/android/ion/ion.c:409
        ion_ioctl+0x216/0x41e drivers/staging/android/ion/ion-ioctl.c:76
        __do_compat_sys_ioctl fs/compat_ioctl.c:1052 [inline]
        __se_compat_sys_ioctl fs/compat_ioctl.c:998 [inline]
        __ia32_compat_sys_ioctl+0x20e/0x630 fs/compat_ioctl.c:998
        do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline]
        do_fast_syscall_32+0x34d/0xfb2 arch/x86/entry/common.c:397
        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139

-> #0 (&pgdat->kswapd_wait){....}:
        lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3841
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
        __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
        __wake_up+0xe/0x10 kernel/sched/wait.c:145
        wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
        steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217
        __rmqueue_fallback mm/page_alloc.c:2502 [inline]
        __rmqueue mm/page_alloc.c:2528 [inline]
        rmqueue_bulk mm/page_alloc.c:2550 [inline]
        __rmqueue_pcplist mm/page_alloc.c:3021 [inline]
        rmqueue_pcplist mm/page_alloc.c:3050 [inline]
        rmqueue mm/page_alloc.c:3072 [inline]
        get_page_from_freelist+0x318c/0x52a0 mm/page_alloc.c:3491
        __alloc_pages_nodemask+0x4f3/0xde0 mm/page_alloc.c:4529
        alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2106
        alloc_pages include/linux/gfp.h:509 [inline]
        __get_free_pages+0xc/0x40 mm/page_alloc.c:4573
        tlb_next_batch mm/mmu_gather.c:29 [inline]
        __tlb_remove_page_size+0x2e5/0x500 mm/mmu_gather.c:133
        __tlb_remove_page include/asm-generic/tlb.h:187 [inline]
        zap_pte_range mm/memory.c:1093 [inline]
        zap_pmd_range mm/memory.c:1192 [inline]
        zap_pud_range mm/memory.c:1221 [inline]
        zap_p4d_range mm/memory.c:1242 [inline]
        unmap_page_range+0xf88/0x25b0 mm/memory.c:1263
        unmap_single_vma+0x19b/0x310 mm/memory.c:1308
        unmap_vmas+0x221/0x390 mm/memory.c:1339
        exit_mmap+0x2be/0x590 mm/mmap.c:3140
        __mmput kernel/fork.c:1051 [inline]
        mmput+0x247/0x610 kernel/fork.c:1072
        exit_mm kernel/exit.c:545 [inline]
        do_exit+0xdeb/0x2620 kernel/exit.c:854
        do_group_exit+0x177/0x440 kernel/exit.c:970
        get_signal+0x8b0/0x1980 kernel/signal.c:2517
        do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
        exit_to_usermode_loop+0x2e5/0x380 arch/x86/entry/common.c:162
        prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
        syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
        do_syscall_32_irqs_on arch/x86/entry/common.c:341 [inline]
        do_fast_syscall_32+0xcd5/0xfb2 arch/x86/entry/common.c:397
        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139

other info that might help us debug this:

Chain exists of:
   &pgdat->kswapd_wait --> &base->lock --> &(&zone->lock)->rlock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&(&zone->lock)->rlock);
                                lock(&base->lock);
                                lock(&(&zone->lock)->rlock);
   lock(&pgdat->kswapd_wait);

  *** DEADLOCK ***

2 locks held by syz-executor0/8529:
  #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at: spin_lock include/linux/spinlock.h:329 [inline]
  #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at: zap_pte_range mm/memory.c:1051 [inline]
  #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at: zap_pmd_range mm/memory.c:1192 [inline]
  #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at: zap_pud_range mm/memory.c:1221 [inline]
  #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at: zap_p4d_range mm/memory.c:1242 [inline]
  #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at: unmap_page_range+0x98e/0x25b0 mm/memory.c:1263
  #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:329 [inline]
  #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk mm/page_alloc.c:2548 [inline]
  #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist mm/page_alloc.c:3021 [inline]
  #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist mm/page_alloc.c:3050 [inline]
  #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue mm/page_alloc.c:3072 [inline]
  #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491

stack backtrace:
CPU: 0 PID: 8529 Comm: syz-executor0 Not tainted 4.20.0+ #297
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1d3/0x2c6 lib/dump_stack.c:113
  print_circular_bug.isra.34.cold.56+0x1bd/0x27d kernel/locking/lockdep.c:1224
  check_prev_add kernel/locking/lockdep.c:1866 [inline]
  check_prevs_add kernel/locking/lockdep.c:1979 [inline]
  validate_chain kernel/locking/lockdep.c:2350 [inline]
  __lock_acquire+0x3360/0x4c20 kernel/locking/lockdep.c:3338
  lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3841
  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
  _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
  __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
  __wake_up+0xe/0x10 kernel/sched/wait.c:145
  wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
  steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217
  __rmqueue_fallback mm/page_alloc.c:2502 [inline]
  __rmqueue mm/page_alloc.c:2528 [inline]
  rmqueue_bulk mm/page_alloc.c:2550 [inline]
  __rmqueue_pcplist mm/page_alloc.c:3021 [inline]
  rmqueue_pcplist mm/page_alloc.c:3050 [inline]
  rmqueue mm/page_alloc.c:3072 [inline]
  get_page_from_freelist+0x318c/0x52a0 mm/page_alloc.c:3491
  __alloc_pages_nodemask+0x4f3/0xde0 mm/page_alloc.c:4529
  alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2106
  alloc_pages include/linux/gfp.h:509 [inline]
  __get_free_pages+0xc/0x40 mm/page_alloc.c:4573
  tlb_next_batch mm/mmu_gather.c:29 [inline]
  __tlb_remove_page_size+0x2e5/0x500 mm/mmu_gather.c:133
  __tlb_remove_page include/asm-generic/tlb.h:187 [inline]
  zap_pte_range mm/memory.c:1093 [inline]
  zap_pmd_range mm/memory.c:1192 [inline]
  zap_pud_range mm/memory.c:1221 [inline]
  zap_p4d_range mm/memory.c:1242 [inline]
  unmap_page_range+0xf88/0x25b0 mm/memory.c:1263
  unmap_single_vma+0x19b/0x310 mm/memory.c:1308
  unmap_vmas+0x221/0x390 mm/memory.c:1339
  exit_mmap+0x2be/0x590 mm/mmap.c:3140
  __mmput kernel/fork.c:1051 [inline]
  mmput+0x247/0x610 kernel/fork.c:1072
  exit_mm kernel/exit.c:545 [inline]
  do_exit+0xdeb/0x2620 kernel/exit.c:854
  do_group_exit+0x177/0x440 kernel/exit.c:970
  get_signal+0x8b0/0x1980 kernel/signal.c:2517
  do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
  exit_to_usermode_loop+0x2e5/0x380 arch/x86/entry/common.c:162
  prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
  do_syscall_32_irqs_on arch/x86/entry/common.c:341 [inline]
  do_fast_syscall_32+0xcd5/0xfb2 arch/x86/entry/common.c:397
  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7fe3849
Code: Bad RIP value.
RSP: 002b:00000000f5f9d0cc EFLAGS: 00000296 ORIG_RAX: 0000000000000036
RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0184900
RDX: 0000000020000080 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
WARNING: CPU: 0 PID: 8908 at net/bridge/netfilter/ebtables.c:2086 ebt_size_mwt net/bridge/netfilter/ebtables.c:2086 [inline]
WARNING: CPU: 0 PID: 8908 at net/bridge/netfilter/ebtables.c:2086 size_entry_mwt net/bridge/netfilter/ebtables.c:2167 [inline]
WARNING: CPU: 0 PID: 8908 at net/bridge/netfilter/ebtables.c:2086 compat_copy_entries+0x1088/0x1500 net/bridge/netfilter/ebtables.c:2206


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in __wake_up_common_lock
  2019-01-02  8:51 possible deadlock in __wake_up_common_lock syzbot
@ 2019-01-02 12:51 ` Vlastimil Babka
  2019-01-02 18:06   ` Mel Gorman
                     ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Vlastimil Babka @ 2019-01-02 12:51 UTC (permalink / raw)
  To: syzbot, aarcange, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, xieyisheng1, zhongjiang,
	Mel Gorman, Peter Zijlstra, Ingo Molnar

On 1/2/19 9:51 AM, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    f346b0becb1b Merge branch 'akpm' (patches from Andrew)
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1510cefd400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c255c77ba370fe7c
> dashboard link: https://syzkaller.appspot.com/bug?extid=93d94a001cfbce9e60e1
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> userspace arch: i386
> 
> Unfortunately, I don't have any reproducer for this crash yet.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+93d94a001cfbce9e60e1@syzkaller.appspotmail.com
> 
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.20.0+ #297 Not tainted
> ------------------------------------------------------
> syz-executor0/8529 is trying to acquire lock:
> 000000005e7fb829 (&pgdat->kswapd_wait){....}, at:  
> __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120

From the backtrace at the end of the report, I see it's coming from

>   wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
>   steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217

This wakeup_kswapd call is new, added by Mel's commit 1c30844d2dfe ("mm: reclaim
small amounts of memory when an external fragmentation event occurs"), so CCing Mel.

> but task is already holding lock:
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock  
> include/linux/spinlock.h:329 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk  
> mm/page_alloc.c:2548 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist  
> mm/page_alloc.c:3021 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist  
> mm/page_alloc.c:3050 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue  
> mm/page_alloc.c:3072 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at:  
> get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491
> 
> which lock already depends on the new lock.

However, I don't understand why lockdep thinks this is a problem. IIRC, it
doesn't like that we are taking pgdat->kswapd_wait.lock while holding
zone->lock. That would mean it has learned that the opposite order also
exists, i.e. that somebody takes zone->lock while manipulating the wait
queue? I don't see where, but I admit I'm not good at reading lockdep
splats, so CCing PeterZ and Ingo as well. Keeping the rest of the mail for
reference.
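
For what it's worth, lockdep's complaint boils down to cycle detection over
the lock-ordering graph it builds at runtime: every observed "lock A held
while acquiring lock B" becomes a directed edge, and a splat fires when a new
edge closes a cycle. The chain in this report can be modeled with a minimal
sketch (plain Python with made-up lock names, nothing like the real kernel
implementation):

```python
# Lockdep-style reasoning in miniature: each observed "lock A held while
# acquiring lock B" becomes a directed edge A -> B, and a warning fires
# when a newly recorded edge closes a cycle. NOT kernel code.

def has_path(graph, src, dst, seen=None):
    """Depth-first search: can dst be reached from src?"""
    if seen is None:
        seen = set()
    if src == dst:
        return True
    seen.add(src)
    return any(has_path(graph, nxt, dst, seen)
               for nxt in graph.get(src, ()) if nxt not in seen)

def add_dependency(graph, held, acquired):
    """Record held -> acquired; return True if this closes a cycle."""
    cycle = has_path(graph, acquired, held)
    graph.setdefault(held, set()).add(acquired)
    return cycle

graph = {}
# Dependencies lockdep had already recorded (the #1..#4 stacks below):
add_dependency(graph, "kswapd_wait", "pi_lock")    # #1: wakeup under wait lock
add_dependency(graph, "pi_lock", "rq->lock")       # #2: scheduler internals
add_dependency(graph, "rq->lock", "base->lock")    # #3: psi timer from enqueue
add_dependency(graph, "base->lock", "zone->lock")  # #4: page alloc in timer path
# The new edge from this report: wakeup_kswapd() while holding zone->lock.
if add_dependency(graph, "zone->lock", "kswapd_wait"):
    print("possible circular locking dependency detected")
```

So lockdep only needs each pair of adjacent locks in the chain to have been
seen once, on any task, for the full cycle to be reported.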

> the existing dependency chain (in reverse order) is:
> 
> -> #4 (&(&zone->lock)->rlock){-.-.}:
>         __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>         _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
>         rmqueue mm/page_alloc.c:3082 [inline]
>         get_page_from_freelist+0x9eb/0x52a0 mm/page_alloc.c:3491
>         __alloc_pages_nodemask+0x4f3/0xde0 mm/page_alloc.c:4529
>         __alloc_pages include/linux/gfp.h:473 [inline]
>         alloc_page_interleave+0x25/0x1c0 mm/mempolicy.c:1988
>         alloc_pages_current+0x1bf/0x210 mm/mempolicy.c:2104
>         alloc_pages include/linux/gfp.h:509 [inline]
>         depot_save_stack+0x3f1/0x470 lib/stackdepot.c:260
>         save_stack+0xa9/0xd0 mm/kasan/common.c:79
>         set_track mm/kasan/common.c:85 [inline]
>         kasan_kmalloc+0xcb/0xd0 mm/kasan/common.c:482
>         kasan_slab_alloc+0x12/0x20 mm/kasan/common.c:397
>         kmem_cache_alloc+0x130/0x730 mm/slab.c:3541
>         kmem_cache_zalloc include/linux/slab.h:731 [inline]
>         fill_pool lib/debugobjects.c:134 [inline]
>         __debug_object_init+0xbb8/0x1290 lib/debugobjects.c:379
>         debug_object_init lib/debugobjects.c:431 [inline]
>         debug_object_activate+0x323/0x600 lib/debugobjects.c:512
>         debug_timer_activate kernel/time/timer.c:708 [inline]
>         debug_activate kernel/time/timer.c:763 [inline]
>         __mod_timer kernel/time/timer.c:1040 [inline]
>         mod_timer kernel/time/timer.c:1101 [inline]
>         add_timer+0x50e/0x1490 kernel/time/timer.c:1137
>         __queue_delayed_work+0x249/0x380 kernel/workqueue.c:1533
>         queue_delayed_work_on+0x1a2/0x1f0 kernel/workqueue.c:1558
>         queue_delayed_work include/linux/workqueue.h:527 [inline]
>         schedule_delayed_work include/linux/workqueue.h:628 [inline]
>         start_dirtytime_writeback+0x4e/0x53 fs/fs-writeback.c:2043
>         do_one_initcall+0x145/0x957 init/main.c:889
>         do_initcall_level init/main.c:957 [inline]
>         do_initcalls init/main.c:965 [inline]
>         do_basic_setup init/main.c:983 [inline]
>         kernel_init_freeable+0x4c1/0x5af init/main.c:1136
>         kernel_init+0x11/0x1ae init/main.c:1056
>         ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
> 
> -> #3 (&base->lock){-.-.}:
>         __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>         _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
>         lock_timer_base+0xbb/0x2b0 kernel/time/timer.c:937
>         __mod_timer kernel/time/timer.c:1009 [inline]
>         mod_timer kernel/time/timer.c:1101 [inline]
>         add_timer+0x895/0x1490 kernel/time/timer.c:1137
>         __queue_delayed_work+0x249/0x380 kernel/workqueue.c:1533
>         queue_delayed_work_on+0x1a2/0x1f0 kernel/workqueue.c:1558
>         queue_delayed_work include/linux/workqueue.h:527 [inline]
>         schedule_delayed_work include/linux/workqueue.h:628 [inline]
>         psi_group_change kernel/sched/psi.c:485 [inline]
>         psi_task_change+0x3f1/0x5f0 kernel/sched/psi.c:534
>         psi_enqueue kernel/sched/stats.h:82 [inline]
>         enqueue_task kernel/sched/core.c:727 [inline]
>         activate_task+0x21a/0x430 kernel/sched/core.c:751
>         wake_up_new_task+0x527/0xd20 kernel/sched/core.c:2423
>         _do_fork+0x33b/0x11d0 kernel/fork.c:2247
>         kernel_thread+0x34/0x40 kernel/fork.c:2281
>         rest_init+0x28/0x372 init/main.c:409
>         arch_call_rest_init+0xe/0x1b
>         start_kernel+0x873/0x8ae init/main.c:741
>         x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:470
>         x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
>         secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
> 
> -> #2 (&rq->lock){-.-.}:
>         __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>         _raw_spin_lock+0x2d/0x40 kernel/locking/spinlock.c:144
>         rq_lock kernel/sched/sched.h:1149 [inline]
>         task_fork_fair+0xb0/0x6d0 kernel/sched/fair.c:10083
>         sched_fork+0x443/0xba0 kernel/sched/core.c:2359
>         copy_process+0x25b9/0x8790 kernel/fork.c:1893
>         _do_fork+0x1cb/0x11d0 kernel/fork.c:2222
>         kernel_thread+0x34/0x40 kernel/fork.c:2281
>         rest_init+0x28/0x372 init/main.c:409
>         arch_call_rest_init+0xe/0x1b
>         start_kernel+0x873/0x8ae init/main.c:741
>         x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:470
>         x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
>         secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
> 
> -> #1 (&p->pi_lock){-.-.}:
>         __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>         _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
>         try_to_wake_up+0xdc/0x1460 kernel/sched/core.c:1965
>         default_wake_function+0x30/0x50 kernel/sched/core.c:3710
>         autoremove_wake_function+0x80/0x370 kernel/sched/wait.c:375
>         __wake_up_common+0x1d7/0x7d0 kernel/sched/wait.c:92
>         __wake_up_common_lock+0x1c2/0x330 kernel/sched/wait.c:121
>         __wake_up+0xe/0x10 kernel/sched/wait.c:145
>         wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
>         wake_all_kswapds+0x150/0x300 mm/page_alloc.c:3975
>         __alloc_pages_slowpath+0x1ff1/0x2db0 mm/page_alloc.c:4246
>         __alloc_pages_nodemask+0xa89/0xde0 mm/page_alloc.c:4549
>         alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2106
>         alloc_pages include/linux/gfp.h:509 [inline]
>         __get_free_pages+0xc/0x40 mm/page_alloc.c:4573
>         pte_alloc_one_kernel+0x15/0x20 arch/x86/mm/pgtable.c:28
>         __pte_alloc_kernel+0x23/0x220 mm/memory.c:439
>         vmap_pte_range mm/vmalloc.c:144 [inline]
>         vmap_pmd_range mm/vmalloc.c:171 [inline]
>         vmap_pud_range mm/vmalloc.c:188 [inline]
>         vmap_p4d_range mm/vmalloc.c:205 [inline]
>         vmap_page_range_noflush+0x878/0xa80 mm/vmalloc.c:230
>         vmap_page_range mm/vmalloc.c:243 [inline]
>         vm_map_ram+0x46c/0xf60 mm/vmalloc.c:1181
>         ion_heap_clear_pages+0x2a/0x70  
> drivers/staging/android/ion/ion_heap.c:100
>         ion_heap_sglist_zero+0x24f/0x2d0  
> drivers/staging/android/ion/ion_heap.c:121
>         ion_heap_buffer_zero+0xf8/0x150  
> drivers/staging/android/ion/ion_heap.c:143
>         ion_system_heap_free+0x227/0x290  
> drivers/staging/android/ion/ion_system_heap.c:163
>         ion_buffer_destroy+0x15c/0x1c0 drivers/staging/android/ion/ion.c:119
>         _ion_heap_freelist_drain+0x43e/0x6a0  
> drivers/staging/android/ion/ion_heap.c:199
>         ion_heap_freelist_drain+0x1f/0x30  
> drivers/staging/android/ion/ion_heap.c:209
>         ion_buffer_create drivers/staging/android/ion/ion.c:86 [inline]
>         ion_alloc+0x487/0xa60 drivers/staging/android/ion/ion.c:409
>         ion_ioctl+0x216/0x41e drivers/staging/android/ion/ion-ioctl.c:76
>         __do_compat_sys_ioctl fs/compat_ioctl.c:1052 [inline]
>         __se_compat_sys_ioctl fs/compat_ioctl.c:998 [inline]
>         __ia32_compat_sys_ioctl+0x20e/0x630 fs/compat_ioctl.c:998
>         do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline]
>         do_fast_syscall_32+0x34d/0xfb2 arch/x86/entry/common.c:397
>         entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> 
> -> #0 (&pgdat->kswapd_wait){....}:
>         lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3841
>         __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>         _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
>         __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
>         __wake_up+0xe/0x10 kernel/sched/wait.c:145
>         wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
>         steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217
>         __rmqueue_fallback mm/page_alloc.c:2502 [inline]
>         __rmqueue mm/page_alloc.c:2528 [inline]
>         rmqueue_bulk mm/page_alloc.c:2550 [inline]
>         __rmqueue_pcplist mm/page_alloc.c:3021 [inline]
>         rmqueue_pcplist mm/page_alloc.c:3050 [inline]
>         rmqueue mm/page_alloc.c:3072 [inline]
>         get_page_from_freelist+0x318c/0x52a0 mm/page_alloc.c:3491
>         __alloc_pages_nodemask+0x4f3/0xde0 mm/page_alloc.c:4529
>         alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2106
>         alloc_pages include/linux/gfp.h:509 [inline]
>         __get_free_pages+0xc/0x40 mm/page_alloc.c:4573
>         tlb_next_batch mm/mmu_gather.c:29 [inline]
>         __tlb_remove_page_size+0x2e5/0x500 mm/mmu_gather.c:133
>         __tlb_remove_page include/asm-generic/tlb.h:187 [inline]
>         zap_pte_range mm/memory.c:1093 [inline]
>         zap_pmd_range mm/memory.c:1192 [inline]
>         zap_pud_range mm/memory.c:1221 [inline]
>         zap_p4d_range mm/memory.c:1242 [inline]
>         unmap_page_range+0xf88/0x25b0 mm/memory.c:1263
>         unmap_single_vma+0x19b/0x310 mm/memory.c:1308
>         unmap_vmas+0x221/0x390 mm/memory.c:1339
>         exit_mmap+0x2be/0x590 mm/mmap.c:3140
>         __mmput kernel/fork.c:1051 [inline]
>         mmput+0x247/0x610 kernel/fork.c:1072
>         exit_mm kernel/exit.c:545 [inline]
>         do_exit+0xdeb/0x2620 kernel/exit.c:854
>         do_group_exit+0x177/0x440 kernel/exit.c:970
>         get_signal+0x8b0/0x1980 kernel/signal.c:2517
>         do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
>         exit_to_usermode_loop+0x2e5/0x380 arch/x86/entry/common.c:162
>         prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
>         syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
>         do_syscall_32_irqs_on arch/x86/entry/common.c:341 [inline]
>         do_fast_syscall_32+0xcd5/0xfb2 arch/x86/entry/common.c:397
>         entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> 
> other info that might help us debug this:
> 
> Chain exists of:
>    &pgdat->kswapd_wait --> &base->lock --> &(&zone->lock)->rlock
> 
>   Possible unsafe locking scenario:
> 
>         CPU0                    CPU1
>         ----                    ----
>    lock(&(&zone->lock)->rlock);
>                                 lock(&base->lock);
>                                 lock(&(&zone->lock)->rlock);
>    lock(&pgdat->kswapd_wait);
> 
>   *** DEADLOCK ***
> 
> 2 locks held by syz-executor0/8529:
>   #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at: spin_lock  
> include/linux/spinlock.h:329 [inline]
>   #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at:  
> zap_pte_range mm/memory.c:1051 [inline]
>   #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at:  
> zap_pmd_range mm/memory.c:1192 [inline]
>   #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at:  
> zap_pud_range mm/memory.c:1221 [inline]
>   #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at:  
> zap_p4d_range mm/memory.c:1242 [inline]
>   #0: 000000001be7b4ca (&(ptlock_ptr(page))->rlock#2){+.+.}, at:  
> unmap_page_range+0x98e/0x25b0 mm/memory.c:1263
>   #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock  
> include/linux/spinlock.h:329 [inline]
>   #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk  
> mm/page_alloc.c:2548 [inline]
>   #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist  
> mm/page_alloc.c:3021 [inline]
>   #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist  
> mm/page_alloc.c:3050 [inline]
>   #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue  
> mm/page_alloc.c:3072 [inline]
>   #1: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at:  
> get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491
> 
> stack backtrace:
> CPU: 0 PID: 8529 Comm: syz-executor0 Not tainted 4.20.0+ #297
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
> Google 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0x1d3/0x2c6 lib/dump_stack.c:113
>   print_circular_bug.isra.34.cold.56+0x1bd/0x27d  
> kernel/locking/lockdep.c:1224
>   check_prev_add kernel/locking/lockdep.c:1866 [inline]
>   check_prevs_add kernel/locking/lockdep.c:1979 [inline]
>   validate_chain kernel/locking/lockdep.c:2350 [inline]
>   __lock_acquire+0x3360/0x4c20 kernel/locking/lockdep.c:3338
>   lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3841
>   __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>   _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
>   __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
>   __wake_up+0xe/0x10 kernel/sched/wait.c:145
>   wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
>   steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217
>   __rmqueue_fallback mm/page_alloc.c:2502 [inline]
>   __rmqueue mm/page_alloc.c:2528 [inline]
>   rmqueue_bulk mm/page_alloc.c:2550 [inline]
>   __rmqueue_pcplist mm/page_alloc.c:3021 [inline]
>   rmqueue_pcplist mm/page_alloc.c:3050 [inline]
>   rmqueue mm/page_alloc.c:3072 [inline]
>   get_page_from_freelist+0x318c/0x52a0 mm/page_alloc.c:3491
>   __alloc_pages_nodemask+0x4f3/0xde0 mm/page_alloc.c:4529
>   alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2106
>   alloc_pages include/linux/gfp.h:509 [inline]
>   __get_free_pages+0xc/0x40 mm/page_alloc.c:4573
>   tlb_next_batch mm/mmu_gather.c:29 [inline]
>   __tlb_remove_page_size+0x2e5/0x500 mm/mmu_gather.c:133
>   __tlb_remove_page include/asm-generic/tlb.h:187 [inline]
>   zap_pte_range mm/memory.c:1093 [inline]
>   zap_pmd_range mm/memory.c:1192 [inline]
>   zap_pud_range mm/memory.c:1221 [inline]
>   zap_p4d_range mm/memory.c:1242 [inline]
>   unmap_page_range+0xf88/0x25b0 mm/memory.c:1263
>   unmap_single_vma+0x19b/0x310 mm/memory.c:1308
>   unmap_vmas+0x221/0x390 mm/memory.c:1339
>   exit_mmap+0x2be/0x590 mm/mmap.c:3140
>   __mmput kernel/fork.c:1051 [inline]
>   mmput+0x247/0x610 kernel/fork.c:1072
>   exit_mm kernel/exit.c:545 [inline]
>   do_exit+0xdeb/0x2620 kernel/exit.c:854
>   do_group_exit+0x177/0x440 kernel/exit.c:970
>   get_signal+0x8b0/0x1980 kernel/signal.c:2517
>   do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
>   exit_to_usermode_loop+0x2e5/0x380 arch/x86/entry/common.c:162
>   prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
>   syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
>   do_syscall_32_irqs_on arch/x86/entry/common.c:341 [inline]
>   do_fast_syscall_32+0xcd5/0xfb2 arch/x86/entry/common.c:397
>   entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> RIP: 0023:0xf7fe3849
> Code: Bad RIP value.
> RSP: 002b:00000000f5f9d0cc EFLAGS: 00000296 ORIG_RAX: 0000000000000036
> RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0184900
> RDX: 0000000020000080 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> syz-executor0 (8529) used greatest stack depth: 10424 bytes left
> kobject: 'loop1' (0000000003dfbc9f): fill_kobj_path: path  
> = '/devices/virtual/block/loop1'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'loop0' (000000002925f66c): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'loop0' (000000002925f66c): fill_kobj_path: path  
> = '/devices/virtual/block/loop0'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'loop5' (00000000c7588ca8): kobject_uevent_env
> kobject: 'loop5' (00000000c7588ca8): fill_kobj_path: path  
> = '/devices/virtual/block/loop5'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'loop2' (00000000c253515f): kobject_uevent_env
> kobject: 'loop2' (00000000c253515f): fill_kobj_path: path  
> = '/devices/virtual/block/loop2'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'loop4' (00000000ebe25695): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'loop4' (00000000ebe25695): fill_kobj_path: path  
> = '/devices/virtual/block/loop4'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'loop1' (0000000003dfbc9f): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'loop1' (0000000003dfbc9f): fill_kobj_path: path  
> = '/devices/virtual/block/loop1'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'loop3' (0000000061a5b8df): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'loop3' (0000000061a5b8df): fill_kobj_path: path  
> = '/devices/virtual/block/loop3'
> kobject: 'loop0' (000000002925f66c): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'loop0' (000000002925f66c): fill_kobj_path: path  
> = '/devices/virtual/block/loop0'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'loop3' (0000000061a5b8df): kobject_uevent_env
> kobject: 'loop3' (0000000061a5b8df): fill_kobj_path: path  
> = '/devices/virtual/block/loop3'
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'kvm' (00000000eddbbf94): kobject_uevent_env
> kobject: 'kvm' (00000000eddbbf94): fill_kobj_path: path  
> = '/devices/virtual/misc/kvm'
> kobject: 'loop1' (0000000003dfbc9f): kobject_uevent_env
> kobject: 'loop1' (0000000003dfbc9f): fill_kobj_path: path  
> = '/devices/virtual/block/loop1'
> kobject: 'loop2' (00000000c253515f): kobject_uevent_env
> kobject: 'loop2' (00000000c253515f): fill_kobj_path: path  
> = '/devices/virtual/block/loop2'
> WARNING: CPU: 0 PID: 8908 at net/bridge/netfilter/ebtables.c:2086  
> ebt_size_mwt net/bridge/netfilter/ebtables.c:2086 [inline]
> WARNING: CPU: 0 PID: 8908 at net/bridge/netfilter/ebtables.c:2086  
> size_entry_mwt net/bridge/netfilter/ebtables.c:2167 [inline]
> WARNING: CPU: 0 PID: 8908 at net/bridge/netfilter/ebtables.c:2086  
> compat_copy_entries+0x1088/0x1500 net/bridge/netfilter/ebtables.c:2206
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
> syzbot.
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in __wake_up_common_lock
  2019-01-02 12:51 ` Vlastimil Babka
@ 2019-01-02 18:06   ` Mel Gorman
  2019-01-02 18:19     ` Qian Cai
  2019-01-02 18:29     ` Dmitry Vyukov
  2019-01-07  9:52   ` Peter Zijlstra
  2019-01-08 13:08   ` Peter Zijlstra
  2 siblings, 2 replies; 15+ messages in thread
From: Mel Gorman @ 2019-01-02 18:06 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: syzbot, aarcange, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, xieyisheng1, zhongjiang,
	Peter Zijlstra, Ingo Molnar

On Wed, Jan 02, 2019 at 01:51:01PM +0100, Vlastimil Babka wrote:
> On 1/2/19 9:51 AM, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:    f346b0becb1b Merge branch 'akpm' (patches from Andrew)
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1510cefd400000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c255c77ba370fe7c
> > dashboard link: https://syzkaller.appspot.com/bug?extid=93d94a001cfbce9e60e1
> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> > userspace arch: i386
> > 
> > Unfortunately, I don't have any reproducer for this crash yet.
> > 
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+93d94a001cfbce9e60e1@syzkaller.appspotmail.com
> > 
> > 
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.20.0+ #297 Not tainted
> > ------------------------------------------------------
> > syz-executor0/8529 is trying to acquire lock:
> > 000000005e7fb829 (&pgdat->kswapd_wait){....}, at:  
> > __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
> 
> From the backtrace at the end of report I see it's coming from
> 
> >   wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
> >   steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217
> 
> This wakeup_kswapd is new due to Mel's 1c30844d2dfe ("mm: reclaim small
> amounts of memory when an external fragmentation event occurs") so CC Mel.
> 

New year new bugs :(

> > but task is already holding lock:
> > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock  
> > include/linux/spinlock.h:329 [inline]
> > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk  
> > mm/page_alloc.c:2548 [inline]
> > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist  
> > mm/page_alloc.c:3021 [inline]
> > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist  
> > mm/page_alloc.c:3050 [inline]
> > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue  
> > mm/page_alloc.c:3072 [inline]
> > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at:  
> > get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491
> > 
> > which lock already depends on the new lock.
> 
> However, I don't understand why lockdep thinks it's a problem. IIRC it
> doesn't like that we are locking pgdat->kswapd_wait.lock while holding
> zone->lock. That means it has learned that the opposite order also
> exists, e.g. somebody would take zone->lock while manipulating the wait
> queue? I don't see where but I admit I'm not good at reading lockdep
> splats, so CCing Peterz and Ingo as well. Keeping rest of mail for
> reference.
> 

I'm not sure I'm reading the output correctly because I'm having trouble
seeing the exact pattern that allows lockdep to conclude the lock ordering
is problematic.

I think it's hung up on the fact that mod_timer can allocate debug
objects for KASAN and somehow concludes that the waking of kswapd is
problematic because potentially a lock ordering exists that would trip.
I don't see how it's actually possible though due to either a lack of
imagination or maybe lockdep is being cautious as something could change
in the future that allows the lockup.

There are a few options I guess in order of preference.

1. Drop zone->lock for the call. It's not necessary to keep track of
   the IRQ flags as callers into that path already do things like treat
   IRQ disabling and the spin lock separately.

2. Use another alloc_flag in steal_suitable_fallback that is set when a
   wakeup is required but do the actual wakeup in rmqueue() after the
   zone locks are dropped and the allocation request is completed.

3. Always wakeup kswapd if watermarks are boosted. I like this the least
   because it means doing wakeups that are unrelated to fragmentation
   that occurred in the current context.
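For illustration, option 2 boils down to recording the need for a wakeup while
zone->lock is held and issuing the wakeup only after the lock is released. A
minimal userspace model of that deferral pattern follows; every name in it is
invented for the sketch and none of this is the kernel's actual code:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of the "defer the wakeup" idea. All names invented. */
static bool zone_lock_held;         /* stands in for zone->lock */
static int kswapd_wakeups;          /* counts modeled wakeup_kswapd() calls */

static void wakeup_kswapd_model(void)
{
    /* The lockdep report is about reaching this point while zone->lock
     * is held; the deferred scheme guarantees it never is. */
    assert(!zone_lock_held);
    kswapd_wakeups++;
}

/* Allocation slow path: note the fragmentation event under the lock,
 * perform the wakeup only after the lock is released. */
static void rmqueue_model(void)
{
    bool need_kswapd_wake = false;  /* plays the role of a new ALLOC_* flag */

    zone_lock_held = true;          /* spin_lock(&zone->lock) */
    /* steal_suitable_fallback() equivalent: only record the need */
    need_kswapd_wake = true;
    zone_lock_held = false;         /* spin_unlock(&zone->lock) */

    if (need_kswapd_wake)
        wakeup_kswapd_model();
}
```

The point of the sketch is that the wait-queue lock is then only ever taken
with zone->lock already dropped, so the reported ordering cannot occur.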

Any particular preference?

While I recognise there is no test case available, how often does this
trigger in syzbot? It would be nice to have some confirmation that any
patch is really fixing the problem.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in __wake_up_common_lock
  2019-01-02 18:06   ` Mel Gorman
@ 2019-01-02 18:19     ` Qian Cai
  2019-01-03  1:28       ` Tetsuo Handa
  2019-01-02 18:29     ` Dmitry Vyukov
  1 sibling, 1 reply; 15+ messages in thread
From: Qian Cai @ 2019-01-02 18:19 UTC (permalink / raw)
  To: Mel Gorman, Vlastimil Babka
  Cc: syzbot, aarcange, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, xieyisheng1, zhongjiang,
	Peter Zijlstra, Ingo Molnar

On 1/2/19 1:06 PM, Mel Gorman wrote:

> While I recognise there is no test case available, how often does this
> trigger in syzbot? It would be nice to have some confirmation that any
> patch is really fixing the problem.

I think I did manage to trigger this every time by running an mmap() workload
causing swapping and a low-memory situation [1].

[1]
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c

[  507.192079] ======================================================
[  507.198294] WARNING: possible circular locking dependency detected
[  507.204510] 4.20.0+ #27 Not tainted
[  507.208018] ------------------------------------------------------
[  507.214233] oom01/7666 is trying to acquire lock:
[  507.218965] 00000000bc163d02 (&p->pi_lock){-.-.}, at: try_to_wake_up+0x10a/0xe80
[  507.226415]
[  507.226415] but task is already holding lock:
[  507.232280] 0000000064eb4795 (&pgdat->kswapd_wait){....}, at:
__wake_up_common_lock+0x112/0x1c0
[  507.241036]
[  507.241036] which lock already depends on the new lock.
[  507.241036]
[  507.249260]
[  507.249260] the existing dependency chain (in reverse order) is:
[  507.256787]
[  507.256787] -> #3 (&pgdat->kswapd_wait){....}:
[  507.262748]        lock_acquire+0x1b3/0x3c0
[  507.266960]        _raw_spin_lock_irqsave+0x35/0x50
[  507.271867]        __wake_up_common_lock+0x112/0x1c0
[  507.276863]        wakeup_kswapd+0x3d0/0x560
[  507.281159]        steal_suitable_fallback+0x40b/0x4e0
[  507.286330]        rmqueue_bulk.constprop.26+0xa36/0x1090
[  507.291760]        get_page_from_freelist+0xb79/0x28f0
[  507.296930]        __alloc_pages_nodemask+0x453/0x21f0
[  507.302099]        alloc_pages_vma+0x87/0x280
[  507.306482]        do_anonymous_page+0x443/0xb80
[  507.311128]        __handle_mm_fault+0xbb8/0xc80
[  507.315773]        handle_mm_fault+0x3ae/0x68b
[  507.320243]        __do_page_fault+0x329/0x6d0
[  507.324712]        do_page_fault+0x119/0x53c
[  507.329008]        page_fault+0x1b/0x20
[  507.332863]
[  507.332863] -> #2 (&(&zone->lock)->rlock){-.-.}:
[  507.338997]        lock_acquire+0x1b3/0x3c0
[  507.343205]        _raw_spin_lock_irqsave+0x35/0x50
[  507.348111]        get_page_from_freelist+0x108f/0x28f0
[  507.353368]        __alloc_pages_nodemask+0x453/0x21f0
[  507.358538]        alloc_page_interleave+0x6a/0x1b0
[  507.363446]        allocate_slab+0x319/0xa20
[  507.367742]        new_slab+0x41/0x60
[  507.371427]        ___slab_alloc+0x509/0x8a0
[  507.375721]        __slab_alloc+0x3a/0x70
[  507.379754]        kmem_cache_alloc+0x29c/0x310
[  507.384312]        __debug_object_init+0x984/0x9b0
[  507.389130]        hrtimer_init+0x9b/0x310
[  507.393250]        init_dl_task_timer+0x1c/0x40
[  507.397808]        __sched_fork+0x187/0x290
[  507.402015]        init_idle+0xa1/0x3a0
[  507.405875]        fork_idle+0x122/0x150
[  507.409823]        idle_threads_init+0xea/0x17a
[  507.414379]        smp_init+0x16/0xf2
[  507.418064]        kernel_init_freeable+0x31f/0x7ae
[  507.422971]        kernel_init+0xc/0x127
[  507.426916]        ret_from_fork+0x3a/0x50
[  507.431034]
[  507.431034] -> #1 (&rq->lock){-.-.}:
[  507.436119]        lock_acquire+0x1b3/0x3c0
[  507.440326]        _raw_spin_lock+0x2c/0x40
[  507.444535]        task_fork_fair+0x93/0x310
[  507.448830]        sched_fork+0x194/0x380
[  507.452863]        copy_process+0x1446/0x41f0
[  507.457247]        _do_fork+0x16a/0xac0
[  507.461107]        kernel_thread+0x25/0x30
[  507.465226]        rest_init+0x28/0x319
[  507.469085]        start_kernel+0x634/0x674
[  507.473296]        secondary_startup_64+0xb6/0xc0
[  507.478026]
[  507.478026] -> #0 (&p->pi_lock){-.-.}:
[  507.483286]        __lock_acquire+0x46d/0x860
[  507.487670]        lock_acquire+0x1b3/0x3c0
[  507.491879]        _raw_spin_lock_irqsave+0x35/0x50
[  507.496787]        try_to_wake_up+0x10a/0xe80
[  507.501170]        autoremove_wake_function+0x7e/0x1a0
[  507.506338]        __wake_up_common+0x12d/0x380
[  507.510895]        __wake_up_common_lock+0x149/0x1c0
[  507.515889]        wakeup_kswapd+0x3d0/0x560
[  507.520184]        steal_suitable_fallback+0x40b/0x4e0
[  507.525354]        rmqueue_bulk.constprop.26+0xa36/0x1090
[  507.530786]        get_page_from_freelist+0xb79/0x28f0
[  507.535955]        __alloc_pages_nodemask+0x453/0x21f0
[  507.541124]        alloc_pages_vma+0x87/0x280
[  507.545506]        do_anonymous_page+0x443/0xb80
[  507.550152]        __handle_mm_fault+0xbb8/0xc80
[  507.554797]        handle_mm_fault+0x3ae/0x68b
[  507.559267]        __do_page_fault+0x329/0x6d0
[  507.563738]        do_page_fault+0x119/0x53c
[  507.568034]        page_fault+0x1b/0x20
[  507.571890]
[  507.571890] other info that might help us debug this:
[  507.571890]
[  507.579938] Chain exists of:
[  507.579938]   &p->pi_lock --> &(&zone->lock)->rlock --> &pgdat->kswapd_wait
[  507.579938]
[  507.591311]  Possible unsafe locking scenario:
[  507.591311]
[  507.597265]        CPU0                    CPU1
[  507.601821]        ----                    ----
[  507.606375]   lock(&pgdat->kswapd_wait);
[  507.610321]                                lock(&(&zone->lock)->rlock);
[  507.616973]                                lock(&pgdat->kswapd_wait);
[  507.623452]   lock(&p->pi_lock);
[  507.626698]
[  507.626698]  *** DEADLOCK ***
[  507.626698]
[  507.632652] 3 locks held by oom01/7666:
[  507.636509]  #0: 000000000ed9e0f8 (&mm->mmap_sem){++++}, at:
__do_page_fault+0x236/0x6d0
[  507.644653]  #1: 00000000592a7e32 (&(&zone->lock)->rlock){-.-.}, at:
rmqueue_bulk.constprop.26+0x16f/0x1090
[  507.654453]  #2: 0000000064eb4795 (&pgdat->kswapd_wait){....}, at:
__wake_up_common_lock+0x112/0x1c0
[  507.663644]
[  507.663644] stack backtrace:
[  507.668027] CPU: 75 PID: 7666 Comm: oom01 Kdump: loaded Not tainted 4.20.0+ #27
[  507.675378] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10,
BIOS U30 06/20/2018
[  507.683953] Call Trace:
[  507.686416]  dump_stack+0xd1/0x160
[  507.689840]  ? dump_stack_print_info.cold.0+0x1b/0x1b
[  507.694923]  ? print_stack_trace+0x8f/0xa0
[  507.699044]  print_circular_bug.isra.10.cold.34+0x20f/0x297
[  507.704651]  ? print_circular_bug_header+0x50/0x50
[  507.709473]  check_prev_add.constprop.19+0x7ad/0xad0
[  507.714468]  ? check_usage+0x3e0/0x3e0
[  507.718241]  ? graph_lock+0xef/0x190
[  507.721838]  ? usage_match+0x27/0x40
[  507.725435]  validate_chain.isra.14+0xbd5/0x16c0
[  507.730082]  ? check_prev_add.constprop.19+0xad0/0xad0
[  507.735252]  ? stack_access_ok+0x35/0x80
[  507.739200]  ? deref_stack_reg+0xa2/0xf0
[  507.743148]  ? __read_once_size_nocheck.constprop.4+0x10/0x10
[  507.748929]  ? debug_lockdep_rcu_enabled.part.0+0x16/0x30
[  507.754362]  ? ftrace_ops_trampoline+0x131/0le_mm_fault+0xbb8/0xc80
[  508.142595]  ? handle_mm_fault+0x3ae/0x68b
[  508.146716]  ? __do_page_fault+0x329/0x6d0
[  508.150836]  ? trace_hardirqs_off+0x9d/0x230
[  508.155132]  ? trace_hardirqs_on_caller+0x230/0x230
[  508.160038]  ? pageset_set_high_and_batch+0x180/0x180
[  508.165122]  get_page_from_freelist+0xb79/0x28f0
[  508.169772]  ? __isolate_free_page+0x430/0x430
[  508.174242]  ? print_irqtrace_events+0x110/0x110
[  508.178885]  ? __isolate_free_page+0x430/0x430
[  508.183355]  ? free_unref_page_list+0x3e6/0x570
[  508.187914]  ? mark_held_locks+0x8b/0xb0
[  508.191861]  ? free_unref_page_list+0x3e6/0x570
[  508.196418]  ? free_unref_page_list+0x3e6/0x570
[  508.200976]  ? lockdep_hardirqs_on+0x1a4/0x290
[  508.205445]  ? trace_hardirqs_on+0x9d/0x230
[  508.209654]  ? ftrace_destroy_function_files+0x50/0x50
[  508.214823]  ? validate_chain.isra.14+0x16c/0x16c0
[  508.219642]  ? check_chain_key+0x13b/0x200
[  508.223766]  ? page_mapping+0x2be/0x460
[  508.227627]  ? page_evictable+0x1de/0x320
[  508.231660]  ? __page_frag_cache_drain+0x180ad0
[  508.619426]  ? lock_downgrade+0x360/0x360
[  508.623458]  ? rwlock_bug.part.0+0x60/0x60
[  508.627580]  ? do_raw_spin_unlock+0x157/0x220
[  508.631963]  ? do_raw_spin_trylock+0x180/0x180
[  508.636434]  ? do_raw_spin_lock+0x137/0x1f0
[  508.640641]  ? mark_lock+0x11c/0xd80
[  508.644238]  alloc_pages_vma+0x87/0x280
[  508.648097]  do_anonymous_page+0x443/0xb80
[  508.652219]  ? mark_lock+0x11c/0xd80
[  508.655815]  ? mark_lock+0x11c/0xd80
[  508.659412]  ? finish_fault+0xf0/0xf0
[  508.663096]  ? print_irqtrace_events+0x110/0x110
[  508.667741]  ? check_flags.part.18+0x220/0x220
[  508.672213]  ? do_raw_spin_unlock+0x157/0x220
[  508.676598]  ? do_raw_spin_trylock+0x180/0x180
[  508.681070]  ? rwlock_bug.part.0+0x60/0x60
[  508.685191]  ? check_chain_key+0x13b/0x200
[  508.689313]  ? __lock_acquire+0x4c0/0x860
[  508.693347]  ? check_chain_key+0x13b/0x200
[  508.697469]  ? handle_mm_fault+0x315/0x68b
[  508.701590]  __handle_mm_fault+0xbb8/0xc80
[  508.705711]  ? handle_mm_fault+0x4c3/0x68b
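The cycle in the splat above is the classic AB-BA pattern: lockdep accumulates
"held A while acquiring B" edges across unrelated runs and complains when both
directions have ever been seen, even though no single execution deadlocked. A
toy order-recorder (invented code, only meant to illustrate that accumulation)
might look like:

```c
#include <stdbool.h>
#include <string.h>

/* Toy lockdep: records "A was held while acquiring B" edges and flags an
 * inversion when both directions have ever been observed. This detects
 * only direct two-lock cycles; the real lockdep walks longer chains. */
#define MAX_EDGES 32

struct edge { const char *from, *to; };
static struct edge edges[MAX_EDGES];
static int nedges;

static bool has_edge(const char *from, const char *to)
{
    for (int i = 0; i < nedges; i++)
        if (!strcmp(edges[i].from, from) && !strcmp(edges[i].to, to))
            return true;
    return false;
}

/* Record that 'held' was held while acquiring 'taking'; returns true if
 * the reverse edge already exists, i.e. a potential inversion. */
static bool record_acquire(const char *held, const char *taking)
{
    if (!has_edge(held, taking) && nedges < MAX_EDGES) {
        edges[nedges].from = held;
        edges[nedges].to = taking;
        nedges++;
    }
    return has_edge(taking, held);
}
```

In the splat, the boot-time chain (zone->lock taken during a debug-objects
allocation reachable under rq->lock/pi_lock) supplies one direction, and the
page-fault path (kswapd_wait taken under zone->lock) supplies the other.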

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in __wake_up_common_lock
  2019-01-02 18:06   ` Mel Gorman
  2019-01-02 18:19     ` Qian Cai
@ 2019-01-02 18:29     ` Dmitry Vyukov
  2019-01-03 16:37       ` Mel Gorman
  1 sibling, 1 reply; 15+ messages in thread
From: Dmitry Vyukov @ 2019-01-02 18:29 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Vlastimil Babka, syzbot, Andrea Arcangeli, Andrew Morton,
	Kirill A. Shutemov, LKML, Linux-MM, linux, Michal Hocko,
	David Rientjes, syzkaller-bugs, xieyisheng1, zhong jiang,
	Peter Zijlstra, Ingo Molnar

On Wed, Jan 2, 2019 at 7:06 PM Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Wed, Jan 02, 2019 at 01:51:01PM +0100, Vlastimil Babka wrote:
> > On 1/2/19 9:51 AM, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    f346b0becb1b Merge branch 'akpm' (patches from Andrew)
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1510cefd400000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c255c77ba370fe7c
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=93d94a001cfbce9e60e1
> > > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> > > userspace arch: i386
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+93d94a001cfbce9e60e1@syzkaller.appspotmail.com
> > >
> > >
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 4.20.0+ #297 Not tainted
> > > ------------------------------------------------------
> > > syz-executor0/8529 is trying to acquire lock:
> > > 000000005e7fb829 (&pgdat->kswapd_wait){....}, at:
> > > __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
> >
> > From the backtrace at the end of report I see it's coming from
> >
> > >   wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
> > >   steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217
> >
> > This wakeup_kswapd is new due to Mel's 1c30844d2dfe ("mm: reclaim small
> > amounts of memory when an external fragmentation event occurs") so CC Mel.
> >
>
> New year new bugs :(

Old too :(
https://syzkaller.appspot.com/#upstream-open

> > > but task is already holding lock:
> > > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock
> > > include/linux/spinlock.h:329 [inline]
> > > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk
> > > mm/page_alloc.c:2548 [inline]
> > > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist
> > > mm/page_alloc.c:3021 [inline]
> > > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist
> > > mm/page_alloc.c:3050 [inline]
> > > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue
> > > mm/page_alloc.c:3072 [inline]
> > > 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at:
> > > get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491
> > >
> > > which lock already depends on the new lock.
> >
> > However, I don't understand why lockdep thinks it's a problem. IIRC it
> > doesn't like that we are locking pgdat->kswapd_wait.lock while holding
> > zone->lock. That means it has learned that the opposite order also
> > exists, e.g. somebody would take zone->lock while manipulating the wait
> > queue? I don't see where but I admit I'm not good at reading lockdep
> > splats, so CCing Peterz and Ingo as well. Keeping rest of mail for
> > reference.
> >
>
> I'm not sure I'm reading the output correctly because I'm having trouble
> seeing the exact pattern that allows lockdep to conclude the lock ordering
> is problematic.
>
> I think it's hung up on the fact that mod_timer can allocate debug
> objects for KASAN and somehow concludes that the waking of kswapd is
> problematic because potentially a lock ordering exists that would trip.
> I don't see how it's actually possible though due to either a lack of
> imagination or maybe lockdep is being cautious as something could change
> in the future that allows the lockup.
>
> There are a few options I guess in order of preference.
>
> 1. Drop zone->lock for the call. It's not necessary to keep track of
>    the IRQ flags as callers into that path already do things like treat
>    IRQ disabling and the spin lock separately.
>
> 2. Use another alloc_flag in steal_suitable_fallback that is set when a
>    wakeup is required but do the actual wakeup in rmqueue() after the
>    zone locks are dropped and the allocation request is completed
>
> 3. Always wakeup kswapd if watermarks are boosted. I like this the least
>    because it means doing wakeups that are unrelated to fragmentation
>    that occurred in the current context.
>
> Any particular preference?
>
> While I recognise there is no test case available, how often does this
> trigger in syzbot? It would be nice to have some confirmation that any
> patch is really fixing the problem.

This info is always available over the "dashboard link" in the report:
https://syzkaller.appspot.com/bug?extid=93d94a001cfbce9e60e1

In this case it's 1. I don't know why. Lock inversions are easier to
trigger in some sense as information accumulates globally. Maybe one
of these stacks is hard to trigger, or maybe all these stacks are
rarely triggered on one machine. While the info accumulates globally,
none of the machines are actually run for any prolonged time: they all
crash right away on hundreds of known bugs.

So good that Qian can reproduce this.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in __wake_up_common_lock
  2019-01-02 18:19     ` Qian Cai
@ 2019-01-03  1:28       ` Tetsuo Handa
  2019-01-03  3:27         ` Qian Cai
  0 siblings, 1 reply; 15+ messages in thread
From: Tetsuo Handa @ 2019-01-03  1:28 UTC (permalink / raw)
  To: Qian Cai, Mel Gorman, Vlastimil Babka
  Cc: syzbot, aarcange, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, xieyisheng1, zhongjiang,
	Peter Zijlstra, Ingo Molnar

On 2019/01/03 3:19, Qian Cai wrote:
> On 1/2/19 1:06 PM, Mel Gorman wrote:
> 
>> While I recognise there is no test case available, how often does this
>> trigger in syzbot? It would be nice to have some confirmation that any
>> patch is really fixing the problem.
> 
>> I think I did manage to trigger this every time by running an mmap() workload
>> causing swapping and a low-memory situation [1].
> 
> [1]
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c

wakeup_kswapd() is called because tlb_next_batch() is doing GFP_NOWAIT
allocation. But since tlb_next_batch() can tolerate allocation failure,
does the change below in tlb_next_batch() help?

#define GFP_NOWAIT      (__GFP_KSWAPD_RECLAIM)

-	batch = (void *)__get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
+	batch = (void *)__get_free_pages(__GFP_NOWARN, 0);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in __wake_up_common_lock
  2019-01-03  1:28       ` Tetsuo Handa
@ 2019-01-03  3:27         ` Qian Cai
  0 siblings, 0 replies; 15+ messages in thread
From: Qian Cai @ 2019-01-03  3:27 UTC (permalink / raw)
  To: Tetsuo Handa, Mel Gorman, Vlastimil Babka
  Cc: syzbot, aarcange, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, xieyisheng1, zhongjiang,
	Peter Zijlstra, Ingo Molnar

On 1/2/19 8:28 PM, Tetsuo Handa wrote:
> On 2019/01/03 3:19, Qian Cai wrote:
>> On 1/2/19 1:06 PM, Mel Gorman wrote:
>>
>>> While I recognise there is no test case available, how often does this
>>> trigger in syzbot? It would be nice to have some confirmation that any
>>> patch is really fixing the problem.
>>
>> I think I did manage to trigger this every time by running an mmap() workload
>> causing swapping and a low-memory situation [1].
>>
>> [1]
>> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c
> 
> wakeup_kswapd() is called because tlb_next_batch() is doing GFP_NOWAIT
> allocation. But since tlb_next_batch() can tolerate allocation failure,
> does below change in tlb_next_batch() help?
> 
> #define GFP_NOWAIT      (__GFP_KSWAPD_RECLAIM)
> 
> -	batch = (void *)__get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
> +	batch = (void *)__get_free_pages(__GFP_NOWARN, 0);

No. In the oom01 case, it comes from:

do_anonymous_page
  __alloc_zeroed_user_highpage
    alloc_page_vma(GFP_HIGHUSER ...

GFP_HIGHUSER -> GFP_USER -> __GFP_RECLAIM -> ___GFP_KSWAPD_RECLAIM


Then it hits this new code in steal_suitable_fallback(), added by commit
1c30844d2dfe (mm: reclaim small amounts of memory when an external
fragmentation event occurs):

 /*
  * Boost watermarks to increase reclaim pressure to reduce
  * the likelihood of future fallbacks. Wake kswapd now as
  * the node may be balanced overall and kswapd will not
  * wake naturally.
  */
  boost_watermark(zone);
  if (alloc_flags & ALLOC_KSWAPD)
  	wakeup_kswapd(zone, 0, 0, zone_idx(zone));
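The flag relationships driving this can be sketched with toy values; the bit
values below are invented for illustration (the real masks also carry more
bits), but the containment mirrors the point: GFP_HIGHUSER includes the
kswapd-reclaim bit, so ALLOC_KSWAPD is set for an ordinary anonymous fault:

```c
/* Toy gfp model: bit values invented; only the containment relationships
 * are meant to mirror the kernel's definitions. */
#define ___GFP_KSWAPD_RECLAIM  0x1u
#define ___GFP_DIRECT_RECLAIM  0x2u
#define __GFP_IO               0x4u
#define __GFP_FS               0x8u
#define __GFP_HIGHMEM          0x10u

#define __GFP_RECLAIM (___GFP_KSWAPD_RECLAIM | ___GFP_DIRECT_RECLAIM)
#define GFP_USER      (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_HIGHUSER  (GFP_USER | __GFP_HIGHMEM)
#define GFP_NOWAIT    (___GFP_KSWAPD_RECLAIM)

#define ALLOC_KSWAPD  0x100u

/* Models the idea that alloc_flags gains ALLOC_KSWAPD whenever the gfp
 * mask allows waking kswapd. */
static unsigned int gfp_to_alloc_flags_model(unsigned int gfp)
{
    unsigned int alloc_flags = 0;

    if (gfp & ___GFP_KSWAPD_RECLAIM)
        alloc_flags |= ALLOC_KSWAPD;
    return alloc_flags;
}
```

In this model, stripping ___GFP_KSWAPD_RECLAIM from one GFP_NOWAIT caller
leaves the GFP_HIGHUSER fault path unaffected, which is why the oom01 splat
would persist.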

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in __wake_up_common_lock
  2019-01-02 18:29     ` Dmitry Vyukov
@ 2019-01-03 16:37       ` Mel Gorman
  2019-01-03 19:40         ` Qian Cai
  0 siblings, 1 reply; 15+ messages in thread
From: Mel Gorman @ 2019-01-03 16:37 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Vlastimil Babka, syzbot, Andrea Arcangeli, Andrew Morton,
	Kirill A. Shutemov, LKML, Linux-MM, linux, Michal Hocko,
	David Rientjes, syzkaller-bugs, xieyisheng1, zhong jiang,
	Peter Zijlstra, Ingo Molnar, Qian Cai

On Wed, Jan 02, 2019 at 07:29:43PM +0100, Dmitry Vyukov wrote:
> > > This wakeup_kswapd is new due to Mel's 1c30844d2dfe ("mm: reclaim small
> > > amounts of memory when an external fragmentation event occurs") so CC Mel.
> > >
> >
> > New year new bugs :(
> 
> Old too :(
> https://syzkaller.appspot.com/#upstream-open
> 

Well, that can ruin a day! Let's see if we can knock one off the list.

> > While I recognise there is no test case available, how often does this
> > trigger in syzbot as it would be nice to have some confirmation any
> > patch is really fixing the problem.
> 
> This info is always available over the "dashboard link" in the report:
> https://syzkaller.appspot.com/bug?extid=93d94a001cfbce9e60e1
> 

Noted for future reference.

> In this case it's 1. I don't know why. Lock inversions are easier to
> trigger in some sense as information accumulates globally. Maybe one
> of these stacks is hard to trigger, or maybe all these stacks are
> rarely triggered on one machine. While the info accumulates globally,
> non of the machines are actually run for any prolonged time: they all
> crash right away on hundreds of known bugs.
> 
> So good that Qian can reproduce this.

I think this might simply be hard to reproduce. I tried for hours on two
separate machines and failed. Nevertheless this should still fix it and
hopefully syzbot picks this up automatically when cc'd. If I hear
nothing, I'll send the patch unconditionally (and cc syzbot). Hopefully
Qian can give it a whirl too.

Thanks

--8<--
mm, page_alloc: Do not wake kswapd with zone lock held

syzbot reported the following and it was confirmed by Qian Cai that a
similar bug was visible from a different context.

======================================================
WARNING: possible circular locking dependency detected
4.20.0+ #297 Not tainted
------------------------------------------------------
syz-executor0/8529 is trying to acquire lock:
000000005e7fb829 (&pgdat->kswapd_wait){....}, at:
__wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120

but task is already holding lock:
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock
include/linux/spinlock.h:329 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk
mm/page_alloc.c:2548 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist
mm/page_alloc.c:3021 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist
mm/page_alloc.c:3050 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue
mm/page_alloc.c:3072 [inline]
000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at:
get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491

It appears to be a false positive in that the only way the lock
ordering should be inverted is if kswapd is waking itself and the
wakeup allocates debugging objects which should already be allocated
if it's kswapd doing the waking. Nevertheless, the possibility exists
and so it's best to avoid the problem.

This patch flags a zone as needing a kswapd wakeup using the, surprisingly,
unused zone flag field. The flag is read without the lock held to
do the wakeup. It's possible that the flag-setting context is not
the same as the flag-clearing context, or for small races to occur.
However, each race possibility is harmless and there is no visible
degradation in fragmentation treatment.

While zone->flags could have remained unused, there is potential
for moving some existing fields into the flags field instead, particularly
read-mostly ones like zone->initialized and zone->contiguous.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/mmzone.h | 6 ++++++
 mm/page_alloc.c        | 8 +++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index cc4a507d7ca4..842f9189537b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -520,6 +520,12 @@ enum pgdat_flags {
 	PGDAT_RECLAIM_LOCKED,		/* prevents concurrent reclaim */
 };
 
+enum zone_flags {
+	ZONE_BOOSTED_WATERMARK,		/* zone recently boosted watermarks.
+					 * Cleared when kswapd is woken.
+					 */
+};
+
 static inline unsigned long zone_managed_pages(struct zone *zone)
 {
 	return (unsigned long)atomic_long_read(&zone->managed_pages);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cde5dac6229a..d295c9bc01a8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2214,7 +2214,7 @@ static void steal_suitable_fallback(struct zone *zone, struct page *page,
 	 */
 	boost_watermark(zone);
 	if (alloc_flags & ALLOC_KSWAPD)
-		wakeup_kswapd(zone, 0, 0, zone_idx(zone));
+		set_bit(ZONE_BOOSTED_WATERMARK, &zone->flags);
 
 	/* We are not allowed to try stealing from the whole block */
 	if (!whole_block)
@@ -3102,6 +3102,12 @@ struct page *rmqueue(struct zone *preferred_zone,
 	local_irq_restore(flags);
 
 out:
+	/* Separate test+clear to avoid unnecessary atomics */
+	if (test_bit(ZONE_BOOSTED_WATERMARK, &zone->flags)) {
+		clear_bit(ZONE_BOOSTED_WATERMARK, &zone->flags);
+		wakeup_kswapd(zone, 0, 0, zone_idx(zone));
+	}
+
 	VM_BUG_ON_PAGE(page && bad_range(zone, page), page);
 	return page;


* Re: possible deadlock in __wake_up_common_lock
  2019-01-03 16:37       ` Mel Gorman
@ 2019-01-03 19:40         ` Qian Cai
  2019-01-03 22:54           ` Mel Gorman
  0 siblings, 1 reply; 15+ messages in thread
From: Qian Cai @ 2019-01-03 19:40 UTC (permalink / raw)
  To: Mel Gorman, Dmitry Vyukov
  Cc: Vlastimil Babka, syzbot, Andrea Arcangeli, Andrew Morton,
	Kirill A. Shutemov, LKML, Linux-MM, linux, Michal Hocko,
	David Rientjes, syzkaller-bugs, xieyisheng1, zhong jiang,
	Peter Zijlstra, Ingo Molnar

On 1/3/19 11:37 AM, Mel Gorman wrote:
> On Wed, Jan 02, 2019 at 07:29:43PM +0100, Dmitry Vyukov wrote:
>>>> This wakeup_kswapd is new due to Mel's 1c30844d2dfe ("mm: reclaim small
>>>> amounts of memory when an external fragmentation event occurs") so CC Mel.
>>>>
>>>
>>> New year new bugs :(
>>
>> Old too :(
>> https://syzkaller.appspot.com/#upstream-open
>>
> 
> Well, that can ruin a day! Let's see if we can knock one off the list.
> 
>>> While I recognise there is no test case available, how often does this
>>> trigger in syzbot as it would be nice to have some confirmation any
>>> patch is really fixing the problem.
>>
>> This info is always available over the "dashboard link" in the report:
>> https://syzkaller.appspot.com/bug?extid=93d94a001cfbce9e60e1
>>
> 
> Noted for future reference.
> 
>> In this case it's 1. I don't know why. Lock inversions are easier to
>> trigger in some sense as information accumulates globally. Maybe one
>> of these stacks is hard to trigger, or maybe all these stacks are
>> rarely triggered on one machine. While the info accumulates globally,
>> none of the machines are actually run for any prolonged time: they all
>> crash right away on hundreds of known bugs.
>>
>> So good that Qian can reproduce this.
> 
> I think this might simply be hard to reproduce. I tried for hours on two
> separate machines and failed. Nevertheless this should still fix it and
> hopefully syzbot picks this up automatically when cc'd. If I hear
> nothing, I'll send the patch unconditionally (and cc syzbot). Hopefully
> Qian can give it a whirl too.
> 
> Thanks
> 
> --8<--
> mm, page_alloc: Do not wake kswapd with zone lock held
> 
> syzbot reported the following and it was confirmed by Qian Cai that a
> similar bug was visible from a different context.
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.20.0+ #297 Not tainted
> ------------------------------------------------------
> syz-executor0/8529 is trying to acquire lock:
> 000000005e7fb829 (&pgdat->kswapd_wait){....}, at:
> __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
> 
> but task is already holding lock:
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock
> include/linux/spinlock.h:329 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk
> mm/page_alloc.c:2548 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist
> mm/page_alloc.c:3021 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist
> mm/page_alloc.c:3050 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue
> mm/page_alloc.c:3072 [inline]
> 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at:
> get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491
> 
> It appears to be a false positive in that the only way the lock
> ordering should be inverted is if kswapd is waking itself and the
> wakeup allocates debugging objects which should already be allocated
> if it's kswapd doing the waking. Nevertheless, the possibility exists
> and so it's best to avoid the problem.
> 
> This patch flags a zone as needing a kswapd wakeup using the, surprisingly,
> unused zone flag field. The flag is read without the lock held to
> do the wakeup. It's possible that the flag-setting context is not
> the same as the flag-clearing context, or for small races to occur.
> However, each race possibility is harmless and there is no visible
> degradation in fragmentation treatment.
> 
> While zone->flags could have remained unused, there is potential
> for moving some existing fields into the flags field instead, particularly
> read-mostly ones like zone->initialized and zone->contiguous.
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Tested-by: Qian Cai <cai@lca.pw>


* Re: possible deadlock in __wake_up_common_lock
  2019-01-03 19:40         ` Qian Cai
@ 2019-01-03 22:54           ` Mel Gorman
  0 siblings, 0 replies; 15+ messages in thread
From: Mel Gorman @ 2019-01-03 22:54 UTC (permalink / raw)
  To: Qian Cai
  Cc: Dmitry Vyukov, Vlastimil Babka, syzbot, Andrea Arcangeli,
	Andrew Morton, Kirill A. Shutemov, LKML, Linux-MM, linux,
	Michal Hocko, David Rientjes, syzkaller-bugs, xieyisheng1,
	zhong jiang, Peter Zijlstra, Ingo Molnar

On Thu, Jan 03, 2019 at 02:40:35PM -0500, Qian Cai wrote:
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> 
> Tested-by: Qian Cai <cai@lca.pw>

Thanks!

-- 
Mel Gorman
SUSE Labs


* Re: possible deadlock in __wake_up_common_lock
  2019-01-02 12:51 ` Vlastimil Babka
  2019-01-02 18:06   ` Mel Gorman
@ 2019-01-07  9:52   ` Peter Zijlstra
  2019-01-07 20:46     ` Johannes Weiner
  2019-01-08 13:08   ` Peter Zijlstra
  2 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2019-01-07  9:52 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: syzbot, aarcange, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, xieyisheng1, zhongjiang,
	Mel Gorman, Ingo Molnar, hannes

On Wed, Jan 02, 2019 at 01:51:01PM +0100, Vlastimil Babka wrote:
> > -> #3 (&base->lock){-.-.}:
> >         __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> >         _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
> >         lock_timer_base+0xbb/0x2b0 kernel/time/timer.c:937
> >         __mod_timer kernel/time/timer.c:1009 [inline]
> >         mod_timer kernel/time/timer.c:1101 [inline]
> >         add_timer+0x895/0x1490 kernel/time/timer.c:1137
> >         __queue_delayed_work+0x249/0x380 kernel/workqueue.c:1533
> >         queue_delayed_work_on+0x1a2/0x1f0 kernel/workqueue.c:1558
> >         queue_delayed_work include/linux/workqueue.h:527 [inline]
> >         schedule_delayed_work include/linux/workqueue.h:628 [inline]
> >         psi_group_change kernel/sched/psi.c:485 [inline]
> >         psi_task_change+0x3f1/0x5f0 kernel/sched/psi.c:534
> >         psi_enqueue kernel/sched/stats.h:82 [inline]
> >         enqueue_task kernel/sched/core.c:727 [inline]
> >         activate_task+0x21a/0x430 kernel/sched/core.c:751
> >         wake_up_new_task+0x527/0xd20 kernel/sched/core.c:2423
> >         _do_fork+0x33b/0x11d0 kernel/fork.c:2247
> >         kernel_thread+0x34/0x40 kernel/fork.c:2281
> >         rest_init+0x28/0x372 init/main.c:409
> >         arch_call_rest_init+0xe/0x1b
> >         start_kernel+0x873/0x8ae init/main.c:741
> >         x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:470
> >         x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
> >         secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243

That thing is fairly new; I don't think we used to have this dependency
prior to PSI.

Johannes, can we move that mod_timer out from under rq->lock? At worst
we can use an irq_work to self-ipi.


* Re: possible deadlock in __wake_up_common_lock
  2019-01-07  9:52   ` Peter Zijlstra
@ 2019-01-07 20:46     ` Johannes Weiner
  2019-01-07 21:29       ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Johannes Weiner @ 2019-01-07 20:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vlastimil Babka, syzbot, aarcange, akpm, kirill.shutemov,
	linux-kernel, linux-mm, linux, mhocko, rientjes, syzkaller-bugs,
	xieyisheng1, zhongjiang, Mel Gorman, Ingo Molnar

On Mon, Jan 07, 2019 at 10:52:17AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 02, 2019 at 01:51:01PM +0100, Vlastimil Babka wrote:
> > > -> #3 (&base->lock){-.-.}:
> > >         __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> > >         _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
> > >         lock_timer_base+0xbb/0x2b0 kernel/time/timer.c:937
> > >         __mod_timer kernel/time/timer.c:1009 [inline]
> > >         mod_timer kernel/time/timer.c:1101 [inline]
> > >         add_timer+0x895/0x1490 kernel/time/timer.c:1137
> > >         __queue_delayed_work+0x249/0x380 kernel/workqueue.c:1533
> > >         queue_delayed_work_on+0x1a2/0x1f0 kernel/workqueue.c:1558
> > >         queue_delayed_work include/linux/workqueue.h:527 [inline]
> > >         schedule_delayed_work include/linux/workqueue.h:628 [inline]
> > >         psi_group_change kernel/sched/psi.c:485 [inline]
> > >         psi_task_change+0x3f1/0x5f0 kernel/sched/psi.c:534
> > >         psi_enqueue kernel/sched/stats.h:82 [inline]
> > >         enqueue_task kernel/sched/core.c:727 [inline]
> > >         activate_task+0x21a/0x430 kernel/sched/core.c:751
> > >         wake_up_new_task+0x527/0xd20 kernel/sched/core.c:2423
> > >         _do_fork+0x33b/0x11d0 kernel/fork.c:2247
> > >         kernel_thread+0x34/0x40 kernel/fork.c:2281
> > >         rest_init+0x28/0x372 init/main.c:409
> > >         arch_call_rest_init+0xe/0x1b
> > >         start_kernel+0x873/0x8ae init/main.c:741
> > >         x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:470
> > >         x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
> > >         secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
> 
> That thing is fairly new; I don't think we used to have this dependency
> prior to PSI.
> 
> Johannes, can we move that mod_timer out from under rq->lock? At worst
> we can use an irq_work to self-ipi.

Hm, so the splat says this:

wakeups take the pi lock
pi lock holders take the rq lock
rq lock holders take the timer base lock (thanks psi)
timer base lock holders take the zone lock (thanks kasan)
problem: now a zone lock holder wakes up kswapd

right? And we can break the chain from the VM or from psi.

I cannot say one is clearly cleaner than the other, though. With kasan
allocating from inside the basic timer code, those locks leak out from
kernel/* and contaminate the VM locking anyway.

Do you think the rq->lock -> base->lock ordering is likely to cause
issues elsewhere?

Something like this below seems to pass the smoke test. If we want to
go ahead with that, I'd test it properly and send it with a sign-off.

diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
index 2cf422db5d18..42e287139c31 100644
--- a/include/linux/psi_types.h
+++ b/include/linux/psi_types.h
@@ -1,6 +1,7 @@
 #ifndef _LINUX_PSI_TYPES_H
 #define _LINUX_PSI_TYPES_H
 
+#include <linux/irq_work.h>
 #include <linux/seqlock.h>
 #include <linux/types.h>
 
@@ -77,6 +78,7 @@ struct psi_group {
 	u64 last_update;
 	u64 next_update;
 	struct delayed_work clock_work;
+	struct irq_work clock_reviver;
 
 	/* Total stall times and sampled pressure averages */
 	u64 total[NR_PSI_STATES - 1];
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index f39958321293..9654de009250 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -165,6 +165,7 @@ static struct psi_group psi_system = {
 };
 
 static void psi_update_work(struct work_struct *work);
+static void psi_revive_clock(struct irq_work *work);
 
 static void group_init(struct psi_group *group)
 {
@@ -177,6 +178,7 @@ static void group_init(struct psi_group *group)
 	group->last_update = now;
 	group->next_update = now + psi_period;
 	INIT_DELAYED_WORK(&group->clock_work, psi_update_work);
+	init_irq_work(&group->clock_reviver, psi_revive_clock);
 	mutex_init(&group->stat_lock);
 }
 
@@ -399,6 +401,14 @@ static void psi_update_work(struct work_struct *work)
 	}
 }
 
+static void psi_revive_clock(struct irq_work *work)
+{
+	struct psi_group *group;
+
+	group = container_of(work, struct psi_group, clock_reviver);
+	schedule_delayed_work(&group->clock_work, PSI_FREQ);
+}
+
 static void record_times(struct psi_group_cpu *groupc, int cpu,
 			 bool memstall_tick)
 {
@@ -484,8 +494,14 @@ static void psi_group_change(struct psi_group *group, int cpu,
 
 	write_seqcount_end(&groupc->seq);
 
+	/*
+	 * We cannot modify workqueues or timers with the rq lock held
+	 * here. If the clock has stopped due to a lack of activity in
+	 * the past and needs reviving, go through an IPI to wake it
+	 * back up. In most cases, the work should already be pending.
+	 */
 	if (!delayed_work_pending(&group->clock_work))
-		schedule_delayed_work(&group->clock_work, PSI_FREQ);
+		irq_work_queue(&group->clock_reviver);
 }
 
 static struct psi_group *iterate_groups(struct task_struct *task, void **iter)


* Re: possible deadlock in __wake_up_common_lock
  2019-01-07 20:46     ` Johannes Weiner
@ 2019-01-07 21:29       ` Peter Zijlstra
  2019-01-07 21:33         ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2019-01-07 21:29 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Vlastimil Babka, syzbot, aarcange, akpm, kirill.shutemov,
	linux-kernel, linux-mm, linux, mhocko, rientjes, syzkaller-bugs,
	xieyisheng1, zhongjiang, Mel Gorman, Ingo Molnar

On Mon, Jan 07, 2019 at 03:46:27PM -0500, Johannes Weiner wrote:
> Hm, so the splat says this:
> 
> wakeups take the pi lock
> pi lock holders take the rq lock
> rq lock holders take the timer base lock (thanks psi)
> timer base lock holders take the zone lock (thanks kasan)
> problem: now a zone lock holder wakes up kswapd
> 
> right? And we can break the chain from the VM or from psi.

Yep. And since PSI is the latest addition to that chain, I figured we
ought maybe not do that. But I've not looked at a computer in 2 weeks,
so what do I know ;-)

> I cannot say one is clearly cleaner than the other, though. With kasan
> allocating from inside the basic timer code, those locks leak out from
> kernel/* and contaminate the VM locking anyway.
> 
> Do you think the rq->lock -> base->lock ordering is likely to cause
> issues elsewhere?

Not sure; we nest the hrtimer base lock under rq->lock (at the time I
fixed hrtimers to not hold its base lock over the timer function
callback, just like regular timers already did) and that has worked
fine.

So maybe we should look at the kasan thing.. dunno.


* Re: possible deadlock in __wake_up_common_lock
  2019-01-07 21:29       ` Peter Zijlstra
@ 2019-01-07 21:33         ` Peter Zijlstra
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2019-01-07 21:33 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Vlastimil Babka, syzbot, aarcange, akpm, kirill.shutemov,
	linux-kernel, linux-mm, linux, mhocko, rientjes, syzkaller-bugs,
	xieyisheng1, zhongjiang, Mel Gorman, Ingo Molnar

On Mon, Jan 07, 2019 at 10:29:21PM +0100, Peter Zijlstra wrote:
> On Mon, Jan 07, 2019 at 03:46:27PM -0500, Johannes Weiner wrote:
> > Hm, so the splat says this:
> > 
> > wakeups take the pi lock
> > pi lock holders take the rq lock
> > rq lock holders take the timer base lock (thanks psi)
> > timer base lock holders take the zone lock (thanks kasan)

That's not kasan, that's debugobjects, and that would be equally true
for the hrtimer usage we already have in the scheduler.

With that, I'm not entirely sure we're responsible for this splat.. I'll
try and have another look tomorrow.


* Re: possible deadlock in __wake_up_common_lock
  2019-01-02 12:51 ` Vlastimil Babka
  2019-01-02 18:06   ` Mel Gorman
  2019-01-07  9:52   ` Peter Zijlstra
@ 2019-01-08 13:08   ` Peter Zijlstra
  2 siblings, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2019-01-08 13:08 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: syzbot, aarcange, akpm, kirill.shutemov, linux-kernel, linux-mm,
	linux, mhocko, rientjes, syzkaller-bugs, xieyisheng1, zhongjiang,
	Mel Gorman, Ingo Molnar, Thomas Gleixner, hannes

On Wed, Jan 02, 2019 at 01:51:01PM +0100, Vlastimil Babka wrote:

> > syz-executor0/8529 is trying to acquire lock:
> > 000000005e7fb829 (&pgdat->kswapd_wait){....}, at:  
> > __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
> 
> From the backtrace at the end of report I see it's coming from
> 
> >   wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
> >   steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217
> 
> This wakeup_kswapd is new due to Mel's 1c30844d2dfe ("mm: reclaim small
> amounts of memory when an external fragmentation event occurs") so CC Mel.

Right; and I see Mel already has a fix for that.

> > the existing dependency chain (in reverse order) is:
> > 
> > -> #4 (&(&zone->lock)->rlock){-.-.}:
> >         __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> >         _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
> >         rmqueue mm/page_alloc.c:3082 [inline]
> >         get_page_from_freelist+0x9eb/0x52a0 mm/page_alloc.c:3491
> >         __alloc_pages_nodemask+0x4f3/0xde0 mm/page_alloc.c:4529
> >         __alloc_pages include/linux/gfp.h:473 [inline]
> >         alloc_page_interleave+0x25/0x1c0 mm/mempolicy.c:1988
> >         alloc_pages_current+0x1bf/0x210 mm/mempolicy.c:2104
> >         alloc_pages include/linux/gfp.h:509 [inline]
> >         depot_save_stack+0x3f1/0x470 lib/stackdepot.c:260
> >         save_stack+0xa9/0xd0 mm/kasan/common.c:79
> >         set_track mm/kasan/common.c:85 [inline]
> >         kasan_kmalloc+0xcb/0xd0 mm/kasan/common.c:482
> >         kasan_slab_alloc+0x12/0x20 mm/kasan/common.c:397
> >         kmem_cache_alloc+0x130/0x730 mm/slab.c:3541
> >         kmem_cache_zalloc include/linux/slab.h:731 [inline]
> >         fill_pool lib/debugobjects.c:134 [inline]
> >         __debug_object_init+0xbb8/0x1290 lib/debugobjects.c:379
> >         debug_object_init lib/debugobjects.c:431 [inline]
> >         debug_object_activate+0x323/0x600 lib/debugobjects.c:512
> >         debug_timer_activate kernel/time/timer.c:708 [inline]
> >         debug_activate kernel/time/timer.c:763 [inline]
> >         __mod_timer kernel/time/timer.c:1040 [inline]
> >         mod_timer kernel/time/timer.c:1101 [inline]
> >         add_timer+0x50e/0x1490 kernel/time/timer.c:1137
> >         __queue_delayed_work+0x249/0x380 kernel/workqueue.c:1533
> >         queue_delayed_work_on+0x1a2/0x1f0 kernel/workqueue.c:1558
> >         queue_delayed_work include/linux/workqueue.h:527 [inline]
> >         schedule_delayed_work include/linux/workqueue.h:628 [inline]
> >         start_dirtytime_writeback+0x4e/0x53 fs/fs-writeback.c:2043
> >         do_one_initcall+0x145/0x957 init/main.c:889
> >         do_initcall_level init/main.c:957 [inline]
> >         do_initcalls init/main.c:965 [inline]
> >         do_basic_setup init/main.c:983 [inline]
> >         kernel_init_freeable+0x4c1/0x5af init/main.c:1136
> >         kernel_init+0x11/0x1ae init/main.c:1056
> >         ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
> > 
> > -> #3 (&base->lock){-.-.}:

However I really, _really_ hate that dependency. We really should not
get memory allocations under rq->lock.

We seem to avoid this for the existing hrtimer usage, because of
hrtimer_init() doing: debug_init() -> debug_hrtimer_init() ->
debug_object_init().

But that isn't done for the (PSI) schedule_delayed_work() thing for some
raisin; even though: group_init() does INIT_DELAYED_WORK() ->
__INIT_DELAYED_WORK() -> __init_timer() -> init_timer_key() ->
debug_init() -> debug_timer_init() -> debug_object_init().

But _somehow_ that isn't doing it.

Now debug_object_activate() has this case:

	if (descr->is_static_object && descr->is_static_object(addr)) {
		debug_object_init()

which does an debug_object_init() for static allocations, which brings
us to:

  static DEFINE_PER_CPU(struct psi_group_cpu, system_group_pcpu);
  static struct psi_group psi_system = {

But that _should_ get initialized by psi_init(), which is called from
sched_init() which _should_ be waaay before do_basic_setup().

Something goes wobbly.. but I'm not seeing it.


end of thread, other threads:[~2019-01-08 13:09 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-02  8:51 possible deadlock in __wake_up_common_lock syzbot
2019-01-02 12:51 ` Vlastimil Babka
2019-01-02 18:06   ` Mel Gorman
2019-01-02 18:19     ` Qian Cai
2019-01-03  1:28       ` Tetsuo Handa
2019-01-03  3:27         ` Qian Cai
2019-01-02 18:29     ` Dmitry Vyukov
2019-01-03 16:37       ` Mel Gorman
2019-01-03 19:40         ` Qian Cai
2019-01-03 22:54           ` Mel Gorman
2019-01-07  9:52   ` Peter Zijlstra
2019-01-07 20:46     ` Johannes Weiner
2019-01-07 21:29       ` Peter Zijlstra
2019-01-07 21:33         ` Peter Zijlstra
2019-01-08 13:08   ` Peter Zijlstra
