linux-mm.kvack.org archive mirror
* possible deadlock in shmem_fallocate (4)
@ 2019-12-26 21:25 syzbot
  2020-03-07 21:43 ` [ashmem] " Eric Biggers
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: syzbot @ 2019-12-26 21:25 UTC (permalink / raw)
  To: akpm, hughd, linux-kernel, linux-mm, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    46cf053e Linux 5.5-rc3
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=162124aee00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=ed9d672709340e35
dashboard link: https://syzkaller.appspot.com/bug?extid=7a0d9d0b26efefe61780
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+7a0d9d0b26efefe61780@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
5.5.0-rc3-syzkaller #0 Not tainted
------------------------------------------------------
kswapd0/1852 is trying to acquire lock:
ffff888098919cd8 (&sb->s_type->i_mutex_key#13){+.+.}, at: inode_lock include/linux/fs.h:791 [inline]
ffff888098919cd8 (&sb->s_type->i_mutex_key#13){+.+.}, at: shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735

but task is already holding lock:
ffffffff89a41e00 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:4922

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}:
        __fs_reclaim_acquire mm/page_alloc.c:4084 [inline]
        fs_reclaim_acquire.part.0+0x24/0x30 mm/page_alloc.c:4095
        fs_reclaim_acquire mm/page_alloc.c:4695 [inline]
        prepare_alloc_pages mm/page_alloc.c:4692 [inline]
        __alloc_pages_nodemask+0x52d/0x910 mm/page_alloc.c:4744
        alloc_pages_vma+0xdd/0x620 mm/mempolicy.c:2170
        shmem_alloc_page+0xc0/0x180 mm/shmem.c:1499
        shmem_alloc_and_acct_page+0x165/0x990 mm/shmem.c:1524
        shmem_getpage_gfp+0x56d/0x29a0 mm/shmem.c:1838
        shmem_getpage mm/shmem.c:154 [inline]
        shmem_write_begin+0x105/0x1e0 mm/shmem.c:2487
        generic_perform_write+0x23b/0x540 mm/filemap.c:3309
        __generic_file_write_iter+0x25e/0x630 mm/filemap.c:3438
        generic_file_write_iter+0x420/0x68e mm/filemap.c:3470
        call_write_iter include/linux/fs.h:1902 [inline]
        new_sync_write+0x4d3/0x770 fs/read_write.c:483
        __vfs_write+0xe1/0x110 fs/read_write.c:496
        vfs_write+0x268/0x5d0 fs/read_write.c:558
        ksys_write+0x14f/0x290 fs/read_write.c:611
        __do_sys_write fs/read_write.c:623 [inline]
        __se_sys_write fs/read_write.c:620 [inline]
        __x64_sys_write+0x73/0xb0 fs/read_write.c:620
        do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
        entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&sb->s_type->i_mutex_key#13){+.+.}:
        check_prev_add kernel/locking/lockdep.c:2476 [inline]
        check_prevs_add kernel/locking/lockdep.c:2581 [inline]
        validate_chain kernel/locking/lockdep.c:2971 [inline]
        __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3955
        lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
        down_write+0x93/0x150 kernel/locking/rwsem.c:1534
        inode_lock include/linux/fs.h:791 [inline]
        shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735
        ashmem_shrink_scan drivers/staging/android/ashmem.c:462 [inline]
        ashmem_shrink_scan+0x370/0x510 drivers/staging/android/ashmem.c:437
        do_shrink_slab+0x40f/0xad0 mm/vmscan.c:526
        shrink_slab mm/vmscan.c:687 [inline]
        shrink_slab+0x19a/0x680 mm/vmscan.c:660
        shrink_node_memcgs mm/vmscan.c:2687 [inline]
        shrink_node+0x46a/0x1ad0 mm/vmscan.c:2791
        kswapd_shrink_node mm/vmscan.c:3539 [inline]
        balance_pgdat+0x7c8/0x11f0 mm/vmscan.c:3697
        kswapd+0x5c3/0xf30 mm/vmscan.c:3948
        kthread+0x361/0x430 kernel/kthread.c:255
        ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(fs_reclaim);
                                lock(&sb->s_type->i_mutex_key#13);
                                lock(fs_reclaim);
   lock(&sb->s_type->i_mutex_key#13);

  *** DEADLOCK ***

2 locks held by kswapd0/1852:
  #0: ffffffff89a41e00 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:4922
  #1: ffffffff89a1f948 (shrinker_rwsem){++++}, at: shrink_slab mm/vmscan.c:677 [inline]
  #1: ffffffff89a1f948 (shrinker_rwsem){++++}, at: shrink_slab+0xe6/0x680 mm/vmscan.c:660

stack backtrace:
CPU: 0 PID: 1852 Comm: kswapd0 Not tainted 5.5.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x197/0x210 lib/dump_stack.c:118
  print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1685
  check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1809
  check_prev_add kernel/locking/lockdep.c:2476 [inline]
  check_prevs_add kernel/locking/lockdep.c:2581 [inline]
  validate_chain kernel/locking/lockdep.c:2971 [inline]
  __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3955
  lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
  down_write+0x93/0x150 kernel/locking/rwsem.c:1534
  inode_lock include/linux/fs.h:791 [inline]
  shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735
  ashmem_shrink_scan drivers/staging/android/ashmem.c:462 [inline]
  ashmem_shrink_scan+0x370/0x510 drivers/staging/android/ashmem.c:437
  do_shrink_slab+0x40f/0xad0 mm/vmscan.c:526
  shrink_slab mm/vmscan.c:687 [inline]
  shrink_slab+0x19a/0x680 mm/vmscan.c:660
  shrink_node_memcgs mm/vmscan.c:2687 [inline]
  shrink_node+0x46a/0x1ad0 mm/vmscan.c:2791
  kswapd_shrink_node mm/vmscan.c:3539 [inline]
  balance_pgdat+0x7c8/0x11f0 mm/vmscan.c:3697
  kswapd+0x5c3/0xf30 mm/vmscan.c:3948
  kthread+0x361/0x430 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.



* Re: [ashmem] possible deadlock in shmem_fallocate (4)
  2019-12-26 21:25 possible deadlock in shmem_fallocate (4) syzbot
@ 2020-03-07 21:43 ` Eric Biggers
  2020-03-08 15:05 ` Hillf Danton
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Eric Biggers @ 2020-03-07 21:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, devel
  Cc: syzbot, akpm, hughd, linux-kernel, linux-mm, syzkaller-bugs

ashmem is calling shmem_fallocate() during memory reclaim, which can deadlock on
inode_lock().  +Cc maintainers of drivers/staging/android/ashmem.c.

On Thu, Dec 26, 2019 at 01:25:09PM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    46cf053e Linux 5.5-rc3
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=162124aee00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=ed9d672709340e35
> dashboard link: https://syzkaller.appspot.com/bug?extid=7a0d9d0b26efefe61780
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Unfortunately, I don't have any reproducer for this crash yet.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+7a0d9d0b26efefe61780@syzkaller.appspotmail.com
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.5.0-rc3-syzkaller #0 Not tainted
> ------------------------------------------------------
> kswapd0/1852 is trying to acquire lock:
> ffff888098919cd8 (&sb->s_type->i_mutex_key#13){+.+.}, at: inode_lock
> include/linux/fs.h:791 [inline]
> ffff888098919cd8 (&sb->s_type->i_mutex_key#13){+.+.}, at:
> shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735
> 
> but task is already holding lock:
> ffffffff89a41e00 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
> mm/page_alloc.c:4922
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (fs_reclaim){+.+.}:
>        __fs_reclaim_acquire mm/page_alloc.c:4084 [inline]
>        fs_reclaim_acquire.part.0+0x24/0x30 mm/page_alloc.c:4095
>        fs_reclaim_acquire mm/page_alloc.c:4695 [inline]
>        prepare_alloc_pages mm/page_alloc.c:4692 [inline]
>        __alloc_pages_nodemask+0x52d/0x910 mm/page_alloc.c:4744
>        alloc_pages_vma+0xdd/0x620 mm/mempolicy.c:2170
>        shmem_alloc_page+0xc0/0x180 mm/shmem.c:1499
>        shmem_alloc_and_acct_page+0x165/0x990 mm/shmem.c:1524
>        shmem_getpage_gfp+0x56d/0x29a0 mm/shmem.c:1838
>        shmem_getpage mm/shmem.c:154 [inline]
>        shmem_write_begin+0x105/0x1e0 mm/shmem.c:2487
>        generic_perform_write+0x23b/0x540 mm/filemap.c:3309
>        __generic_file_write_iter+0x25e/0x630 mm/filemap.c:3438
>        generic_file_write_iter+0x420/0x68e mm/filemap.c:3470
>        call_write_iter include/linux/fs.h:1902 [inline]
>        new_sync_write+0x4d3/0x770 fs/read_write.c:483
>        __vfs_write+0xe1/0x110 fs/read_write.c:496
>        vfs_write+0x268/0x5d0 fs/read_write.c:558
>        ksys_write+0x14f/0x290 fs/read_write.c:611
>        __do_sys_write fs/read_write.c:623 [inline]
>        __se_sys_write fs/read_write.c:620 [inline]
>        __x64_sys_write+0x73/0xb0 fs/read_write.c:620
>        do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
>        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> -> #0 (&sb->s_type->i_mutex_key#13){+.+.}:
>        check_prev_add kernel/locking/lockdep.c:2476 [inline]
>        check_prevs_add kernel/locking/lockdep.c:2581 [inline]
>        validate_chain kernel/locking/lockdep.c:2971 [inline]
>        __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3955
>        lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
>        down_write+0x93/0x150 kernel/locking/rwsem.c:1534
>        inode_lock include/linux/fs.h:791 [inline]
>        shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735
>        ashmem_shrink_scan drivers/staging/android/ashmem.c:462 [inline]
>        ashmem_shrink_scan+0x370/0x510 drivers/staging/android/ashmem.c:437
>        do_shrink_slab+0x40f/0xad0 mm/vmscan.c:526
>        shrink_slab mm/vmscan.c:687 [inline]
>        shrink_slab+0x19a/0x680 mm/vmscan.c:660
>        shrink_node_memcgs mm/vmscan.c:2687 [inline]
>        shrink_node+0x46a/0x1ad0 mm/vmscan.c:2791
>        kswapd_shrink_node mm/vmscan.c:3539 [inline]
>        balance_pgdat+0x7c8/0x11f0 mm/vmscan.c:3697
>        kswapd+0x5c3/0xf30 mm/vmscan.c:3948
>        kthread+0x361/0x430 kernel/kthread.c:255
>        ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(fs_reclaim);
>                                lock(&sb->s_type->i_mutex_key#13);
>                                lock(fs_reclaim);
>   lock(&sb->s_type->i_mutex_key#13);
> 
>  *** DEADLOCK ***
> 
> 2 locks held by kswapd0/1852:
>  #0: ffffffff89a41e00 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
> mm/page_alloc.c:4922
>  #1: ffffffff89a1f948 (shrinker_rwsem){++++}, at: shrink_slab
> mm/vmscan.c:677 [inline]
>  #1: ffffffff89a1f948 (shrinker_rwsem){++++}, at: shrink_slab+0xe6/0x680
> mm/vmscan.c:660
> 
> stack backtrace:
> CPU: 0 PID: 1852 Comm: kswapd0 Not tainted 5.5.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x197/0x210 lib/dump_stack.c:118
>  print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1685
>  check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1809
>  check_prev_add kernel/locking/lockdep.c:2476 [inline]
>  check_prevs_add kernel/locking/lockdep.c:2581 [inline]
>  validate_chain kernel/locking/lockdep.c:2971 [inline]
>  __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3955
>  lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
>  down_write+0x93/0x150 kernel/locking/rwsem.c:1534
>  inode_lock include/linux/fs.h:791 [inline]
>  shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735
>  ashmem_shrink_scan drivers/staging/android/ashmem.c:462 [inline]
>  ashmem_shrink_scan+0x370/0x510 drivers/staging/android/ashmem.c:437
>  do_shrink_slab+0x40f/0xad0 mm/vmscan.c:526
>  shrink_slab mm/vmscan.c:687 [inline]
>  shrink_slab+0x19a/0x680 mm/vmscan.c:660
>  shrink_node_memcgs mm/vmscan.c:2687 [inline]
>  shrink_node+0x46a/0x1ad0 mm/vmscan.c:2791
>  kswapd_shrink_node mm/vmscan.c:3539 [inline]
>  balance_pgdat+0x7c8/0x11f0 mm/vmscan.c:3697
>  kswapd+0x5c3/0xf30 mm/vmscan.c:3948
>  kthread+0x361/0x430 kernel/kthread.c:255
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> -- 



* Re: possible deadlock in shmem_fallocate (4)
  2019-12-26 21:25 possible deadlock in shmem_fallocate (4) syzbot
  2020-03-07 21:43 ` [ashmem] " Eric Biggers
@ 2020-03-08 15:05 ` Hillf Danton
  2020-07-14  0:32 ` syzbot
  2020-07-14  3:07 ` syzbot
  3 siblings, 0 replies; 17+ messages in thread
From: Hillf Danton @ 2020-03-08 15:05 UTC (permalink / raw)
  To: syzbot; +Cc: akpm, hughd, linux-kernel, linux-mm, syzkaller-bugs


Thu, 26 Dec 2019 13:25:09 -0800
> syzbot found the following crash on:
> 
> HEAD commit:    46cf053e Linux 5.5-rc3
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=162124aee00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=ed9d672709340e35
> dashboard link: https://syzkaller.appspot.com/bug?extid=7a0d9d0b26efefe61780
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Unfortunately, I don't have any reproducer for this crash yet.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+7a0d9d0b26efefe61780@syzkaller.appspotmail.com
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.5.0-rc3-syzkaller #0 Not tainted
> ------------------------------------------------------
> kswapd0/1852 is trying to acquire lock:
> ffff888098919cd8 (&sb->s_type->i_mutex_key#13){+.+.}, at: inode_lock  
> include/linux/fs.h:791 [inline]
> ffff888098919cd8 (&sb->s_type->i_mutex_key#13){+.+.}, at:  
> shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735
> 
> but task is already holding lock:
> ffffffff89a41e00 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30  
> mm/page_alloc.c:4922
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (fs_reclaim){+.+.}:
>         __fs_reclaim_acquire mm/page_alloc.c:4084 [inline]
>         fs_reclaim_acquire.part.0+0x24/0x30 mm/page_alloc.c:4095
>         fs_reclaim_acquire mm/page_alloc.c:4695 [inline]
>         prepare_alloc_pages mm/page_alloc.c:4692 [inline]
>         __alloc_pages_nodemask+0x52d/0x910 mm/page_alloc.c:4744
>         alloc_pages_vma+0xdd/0x620 mm/mempolicy.c:2170
>         shmem_alloc_page+0xc0/0x180 mm/shmem.c:1499
>         shmem_alloc_and_acct_page+0x165/0x990 mm/shmem.c:1524
>         shmem_getpage_gfp+0x56d/0x29a0 mm/shmem.c:1838
>         shmem_getpage mm/shmem.c:154 [inline]
>         shmem_write_begin+0x105/0x1e0 mm/shmem.c:2487
>         generic_perform_write+0x23b/0x540 mm/filemap.c:3309
>         __generic_file_write_iter+0x25e/0x630 mm/filemap.c:3438
>         generic_file_write_iter+0x420/0x68e mm/filemap.c:3470
>         call_write_iter include/linux/fs.h:1902 [inline]
>         new_sync_write+0x4d3/0x770 fs/read_write.c:483
>         __vfs_write+0xe1/0x110 fs/read_write.c:496
>         vfs_write+0x268/0x5d0 fs/read_write.c:558
>         ksys_write+0x14f/0x290 fs/read_write.c:611
>         __do_sys_write fs/read_write.c:623 [inline]
>         __se_sys_write fs/read_write.c:620 [inline]
>         __x64_sys_write+0x73/0xb0 fs/read_write.c:620
>         do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
>         entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> -> #0 (&sb->s_type->i_mutex_key#13){+.+.}:
>         check_prev_add kernel/locking/lockdep.c:2476 [inline]
>         check_prevs_add kernel/locking/lockdep.c:2581 [inline]
>         validate_chain kernel/locking/lockdep.c:2971 [inline]
>         __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3955
>         lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
>         down_write+0x93/0x150 kernel/locking/rwsem.c:1534
>         inode_lock include/linux/fs.h:791 [inline]
>         shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735
>         ashmem_shrink_scan drivers/staging/android/ashmem.c:462 [inline]
>         ashmem_shrink_scan+0x370/0x510 drivers/staging/android/ashmem.c:437
>         do_shrink_slab+0x40f/0xad0 mm/vmscan.c:526
>         shrink_slab mm/vmscan.c:687 [inline]
>         shrink_slab+0x19a/0x680 mm/vmscan.c:660
>         shrink_node_memcgs mm/vmscan.c:2687 [inline]
>         shrink_node+0x46a/0x1ad0 mm/vmscan.c:2791
>         kswapd_shrink_node mm/vmscan.c:3539 [inline]
>         balance_pgdat+0x7c8/0x11f0 mm/vmscan.c:3697
>         kswapd+0x5c3/0xf30 mm/vmscan.c:3948
>         kthread+0x361/0x430 kernel/kthread.c:255
>         ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> 
> other info that might help us debug this:
> 
>   Possible unsafe locking scenario:
> 
>         CPU0                    CPU1
>         ----                    ----
>    lock(fs_reclaim);
>                                 lock(&sb->s_type->i_mutex_key#13);
>                                 lock(fs_reclaim);
>    lock(&sb->s_type->i_mutex_key#13);
> 
>   *** DEADLOCK ***
> 
> 2 locks held by kswapd0/1852:
>   #0: ffffffff89a41e00 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30  mm/page_alloc.c:4922
>   #1: ffffffff89a1f948 (shrinker_rwsem){++++}, at: shrink_slab   mm/vmscan.c:677 [inline]
>   #1: ffffffff89a1f948 (shrinker_rwsem){++++}, at: shrink_slab+0xe6/0x680   mm/vmscan.c:660
> 
> stack backtrace:
> CPU: 0 PID: 1852 Comm: kswapd0 Not tainted 5.5.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
> Google 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0x197/0x210 lib/dump_stack.c:118
>   print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1685
>   check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1809
>   check_prev_add kernel/locking/lockdep.c:2476 [inline]
>   check_prevs_add kernel/locking/lockdep.c:2581 [inline]
>   validate_chain kernel/locking/lockdep.c:2971 [inline]
>   __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3955
>   lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
>   down_write+0x93/0x150 kernel/locking/rwsem.c:1534
>   inode_lock include/linux/fs.h:791 [inline]
>   shmem_fallocate+0x15a/0xd40 mm/shmem.c:2735
>   ashmem_shrink_scan drivers/staging/android/ashmem.c:462 [inline]
>   ashmem_shrink_scan+0x370/0x510 drivers/staging/android/ashmem.c:437
>   do_shrink_slab+0x40f/0xad0 mm/vmscan.c:526
>   shrink_slab mm/vmscan.c:687 [inline]
>   shrink_slab+0x19a/0x680 mm/vmscan.c:660
>   shrink_node_memcgs mm/vmscan.c:2687 [inline]
>   shrink_node+0x46a/0x1ad0 mm/vmscan.c:2791
>   kswapd_shrink_node mm/vmscan.c:3539 [inline]
>   balance_pgdat+0x7c8/0x11f0 mm/vmscan.c:3697
>   kswapd+0x5c3/0xf30 mm/vmscan.c:3948
>   kthread+0x361/0x430 kernel/kthread.c:255
>   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Move the hole punch out of the page-reclaim context into a workqueue to avoid the deadlock.

--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -24,6 +24,7 @@
 #include <linux/bitops.h>
 #include <linux/mutex.h>
 #include <linux/shmem_fs.h>
+#include <linux/workqueue.h>
 #include "ashmem.h"
 
 #define ASHMEM_NAME_PREFIX "dev/ashmem/"
@@ -70,6 +71,7 @@ struct ashmem_range {
 	size_t pgstart;
 	size_t pgend;
 	unsigned int purged;
+	struct work_struct work;
 };
 
 /* LRU list of unpinned pages, protected by ashmem_mutex */
@@ -201,6 +203,7 @@ static void range_del(struct ashmem_rang
 	list_del(&range->unpinned);
 	if (range_on_lru(range))
 		lru_del(range);
+	flush_work(&range->work);
 	kmem_cache_free(ashmem_range_cachep, range);
 }
 
@@ -419,6 +422,21 @@ out:
 	return ret;
 }
 
+static void ashmem_shrink_workfn(struct work_struct *__work)
+{
+	struct ashmem_range *range = container_of(__work, 
+					struct ashmem_range, work);
+	loff_t start = range->pgstart * PAGE_SIZE;
+	loff_t end = (range->pgend + 1) * PAGE_SIZE;
+	struct file *f = range->asma->file;
+
+	f->f_op->fallocate(f, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+				start, end - start);
+	fput(f);
+	if (atomic_dec_and_test(&ashmem_shrink_inflight))
+		wake_up_all(&ashmem_shrink_wait);
+}
+
 /*
  * ashmem_shrink - our cache shrinker, called from mm/vmscan.c
  *
@@ -448,8 +466,6 @@ ashmem_shrink_scan(struct shrinker *shri
 	while (!list_empty(&ashmem_lru_list)) {
 		struct ashmem_range *range =
 			list_first_entry(&ashmem_lru_list, typeof(*range), lru);
-		loff_t start = range->pgstart * PAGE_SIZE;
-		loff_t end = (range->pgend + 1) * PAGE_SIZE;
 		struct file *f = range->asma->file;
 
 		get_file(f);
@@ -459,12 +475,9 @@ ashmem_shrink_scan(struct shrinker *shri
 
 		freed += range_size(range);
 		mutex_unlock(&ashmem_mutex);
-		f->f_op->fallocate(f,
-				   FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-				   start, end - start);
-		fput(f);
-		if (atomic_dec_and_test(&ashmem_shrink_inflight))
-			wake_up_all(&ashmem_shrink_wait);
+
+		queue_work(system_unbound_wq, &range->work);
+
 		if (!mutex_trylock(&ashmem_mutex))
 			goto out;
 		if (--sc->nr_to_scan <= 0)
@@ -729,6 +742,7 @@ static int ashmem_pin_unpin(struct ashme
 		range = kmem_cache_zalloc(ashmem_range_cachep, GFP_KERNEL);
 		if (!range)
 			return -ENOMEM;
+		INIT_WORK(&range->work, ashmem_shrink_workfn);
 	}
 
 	mutex_lock(&ashmem_mutex);




* Re: possible deadlock in shmem_fallocate (4)
  2019-12-26 21:25 possible deadlock in shmem_fallocate (4) syzbot
  2020-03-07 21:43 ` [ashmem] " Eric Biggers
  2020-03-08 15:05 ` Hillf Danton
@ 2020-07-14  0:32 ` syzbot
  2020-07-14  3:32   ` Hillf Danton
  2020-07-14  3:07 ` syzbot
  3 siblings, 1 reply; 17+ messages in thread
From: syzbot @ 2020-07-14  0:32 UTC (permalink / raw)
  To: akpm, arve, christian, devel, ebiggers, gregkh, hdanton, hughd,
	joel, linux-kernel, linux-mm, maco, syzkaller-bugs, tkjos

syzbot has found a reproducer for the following crash on:

HEAD commit:    11ba4688 Linux 5.8-rc5
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13f1bf47100000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a160d1053fc89af5
dashboard link: https://syzkaller.appspot.com/bug?extid=7a0d9d0b26efefe61780
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1181004f100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+7a0d9d0b26efefe61780@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
5.8.0-rc5-syzkaller #0 Not tainted
------------------------------------------------------
khugepaged/1157 is trying to acquire lock:
ffff88809272e910 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:800 [inline]
ffff88809272e910 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}, at: shmem_fallocate+0x153/0xd90 mm/shmem.c:2707

but task is already holding lock:
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4202 [inline]
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4198 [inline]
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __perform_reclaim mm/page_alloc.c:4227 [inline]
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x1554/0x2780 mm/page_alloc.c:4650

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
       __fs_reclaim_acquire mm/page_alloc.c:4183 [inline]
       fs_reclaim_acquire mm/page_alloc.c:4194 [inline]
       prepare_alloc_pages mm/page_alloc.c:4780 [inline]
       __alloc_pages_nodemask+0x3d1/0x930 mm/page_alloc.c:4832
       alloc_pages_vma+0xdd/0x720 mm/mempolicy.c:2255
       shmem_alloc_page+0x11f/0x1f0 mm/shmem.c:1502
       shmem_alloc_and_acct_page+0x161/0x8a0 mm/shmem.c:1527
       shmem_getpage_gfp+0x511/0x2450 mm/shmem.c:1823
       shmem_getpage mm/shmem.c:153 [inline]
       shmem_write_begin+0xf9/0x1d0 mm/shmem.c:2459
       generic_perform_write+0x20a/0x4f0 mm/filemap.c:3318
       __generic_file_write_iter+0x24b/0x610 mm/filemap.c:3447
       generic_file_write_iter+0x3a6/0x5c0 mm/filemap.c:3479
       call_write_iter include/linux/fs.h:1908 [inline]
       new_sync_write+0x422/0x650 fs/read_write.c:503
       vfs_write+0x59d/0x6b0 fs/read_write.c:578
       ksys_write+0x12d/0x250 fs/read_write.c:631
       do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:2496 [inline]
       check_prevs_add kernel/locking/lockdep.c:2601 [inline]
       validate_chain kernel/locking/lockdep.c:3218 [inline]
       __lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
       lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
       down_write+0x8d/0x150 kernel/locking/rwsem.c:1531
       inode_lock include/linux/fs.h:800 [inline]
       shmem_fallocate+0x153/0xd90 mm/shmem.c:2707
       ashmem_shrink_scan.part.0+0x2e9/0x490 drivers/staging/android/ashmem.c:490
       ashmem_shrink_scan+0x6c/0xa0 drivers/staging/android/ashmem.c:473
       do_shrink_slab+0x3c6/0xab0 mm/vmscan.c:518
       shrink_slab+0x16f/0x5c0 mm/vmscan.c:679
       shrink_node_memcgs mm/vmscan.c:2658 [inline]
       shrink_node+0x519/0x1b60 mm/vmscan.c:2770
       shrink_zones mm/vmscan.c:2973 [inline]
       do_try_to_free_pages+0x38b/0x1340 mm/vmscan.c:3026
       try_to_free_pages+0x29a/0x8b0 mm/vmscan.c:3265
       __perform_reclaim mm/page_alloc.c:4223 [inline]
       __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
       __alloc_pages_slowpath.constprop.0+0x949/0x2780 mm/page_alloc.c:4650
       __alloc_pages_nodemask+0x68f/0x930 mm/page_alloc.c:4863
       __alloc_pages include/linux/gfp.h:509 [inline]
       __alloc_pages_node include/linux/gfp.h:522 [inline]
       khugepaged_alloc_page+0xa0/0x170 mm/khugepaged.c:867
       collapse_huge_page mm/khugepaged.c:1056 [inline]
       khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
       khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
       khugepaged_do_scan mm/khugepaged.c:2193 [inline]
       khugepaged+0x3093/0x5a10 mm/khugepaged.c:2238
       kthread+0x3b5/0x4a0 kernel/kthread.c:291
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&sb->s_type->i_mutex_key#15);
                               lock(fs_reclaim);
  lock(&sb->s_type->i_mutex_key#15);

 *** DEADLOCK ***

2 locks held by khugepaged/1157:
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4202 [inline]
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4198 [inline]
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __perform_reclaim mm/page_alloc.c:4227 [inline]
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x1554/0x2780 mm/page_alloc.c:4650
 #1: ffffffff89c46a90 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0xc7/0x5c0 mm/vmscan.c:669

stack backtrace:
CPU: 0 PID: 1157 Comm: khugepaged Not tainted 5.8.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827
 check_prev_add kernel/locking/lockdep.c:2496 [inline]
 check_prevs_add kernel/locking/lockdep.c:2601 [inline]
 validate_chain kernel/locking/lockdep.c:3218 [inline]
 __lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
 lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
 down_write+0x8d/0x150 kernel/locking/rwsem.c:1531
 inode_lock include/linux/fs.h:800 [inline]
 shmem_fallocate+0x153/0xd90 mm/shmem.c:2707
 ashmem_shrink_scan.part.0+0x2e9/0x490 drivers/staging/android/ashmem.c:490
 ashmem_shrink_scan+0x6c/0xa0 drivers/staging/android/ashmem.c:473
 do_shrink_slab+0x3c6/0xab0 mm/vmscan.c:518
 shrink_slab+0x16f/0x5c0 mm/vmscan.c:679
 shrink_node_memcgs mm/vmscan.c:2658 [inline]
 shrink_node+0x519/0x1b60 mm/vmscan.c:2770
 shrink_zones mm/vmscan.c:2973 [inline]
 do_try_to_free_pages+0x38b/0x1340 mm/vmscan.c:3026
 try_to_free_pages+0x29a/0x8b0 mm/vmscan.c:3265
 __perform_reclaim mm/page_alloc.c:4223 [inline]
 __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
 __alloc_pages_slowpath.constprop.0+0x949/0x2780 mm/page_alloc.c:4650
 __alloc_pages_nodemask+0x68f/0x930 mm/page_alloc.c:4863
 __alloc_pages include/linux/gfp.h:509 [inline]
 __alloc_pages_node include/linux/gfp.h:522 [inline]
 khugepaged_alloc_page+0xa0/0x170 mm/khugepaged.c:867
 collapse_huge_page mm/khugepaged.c:1056 [inline]
 khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
 khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
 khugepaged_do_scan mm/khugepaged.c:2193 [inline]
 khugepaged+0x3093/0x5a10 mm/khugepaged.c:2238
 kthread+0x3b5/0x4a0 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293




* Re: possible deadlock in shmem_fallocate (4)
  2019-12-26 21:25 possible deadlock in shmem_fallocate (4) syzbot
                   ` (2 preceding siblings ...)
  2020-07-14  0:32 ` syzbot
@ 2020-07-14  3:07 ` syzbot
  3 siblings, 0 replies; 17+ messages in thread
From: syzbot @ 2020-07-14  3:07 UTC (permalink / raw)
  To: akpm, arve, christian, devel, ebiggers, gregkh, hdanton, hughd,
	joel, linux-kernel, linux-mm, maco, syzkaller-bugs, tkjos

syzbot has found a reproducer for the following crash on:

HEAD commit:    11ba4688 Linux 5.8-rc5
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=175391fb100000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a160d1053fc89af5
dashboard link: https://syzkaller.appspot.com/bug?extid=7a0d9d0b26efefe61780
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=104d9c77100000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1343e95d100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+7a0d9d0b26efefe61780@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
5.8.0-rc5-syzkaller #0 Not tainted
------------------------------------------------------
khugepaged/1158 is trying to acquire lock:
ffff8882071865b0 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:800 [inline]
ffff8882071865b0 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}, at: shmem_fallocate+0x153/0xd90 mm/shmem.c:2707

but task is already holding lock:
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4202 [inline]
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4198 [inline]
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __perform_reclaim mm/page_alloc.c:4227 [inline]
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x1554/0x2780 mm/page_alloc.c:4650

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
       __fs_reclaim_acquire mm/page_alloc.c:4183 [inline]
       fs_reclaim_acquire mm/page_alloc.c:4194 [inline]
       prepare_alloc_pages mm/page_alloc.c:4780 [inline]
       __alloc_pages_nodemask+0x3d1/0x930 mm/page_alloc.c:4832
       alloc_pages_vma+0xdd/0x720 mm/mempolicy.c:2255
       shmem_alloc_page+0x11f/0x1f0 mm/shmem.c:1502
       shmem_alloc_and_acct_page+0x161/0x8a0 mm/shmem.c:1527
       shmem_getpage_gfp+0x511/0x2450 mm/shmem.c:1823
       shmem_getpage mm/shmem.c:153 [inline]
       shmem_write_begin+0xf9/0x1d0 mm/shmem.c:2459
       generic_perform_write+0x20a/0x4f0 mm/filemap.c:3318
       __generic_file_write_iter+0x24b/0x610 mm/filemap.c:3447
       generic_file_write_iter+0x3a6/0x5c0 mm/filemap.c:3479
       call_write_iter include/linux/fs.h:1908 [inline]
       new_sync_write+0x422/0x650 fs/read_write.c:503
       vfs_write+0x59d/0x6b0 fs/read_write.c:578
       ksys_write+0x12d/0x250 fs/read_write.c:631
       do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:2496 [inline]
       check_prevs_add kernel/locking/lockdep.c:2601 [inline]
       validate_chain kernel/locking/lockdep.c:3218 [inline]
       __lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
       lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
       down_write+0x8d/0x150 kernel/locking/rwsem.c:1531
       inode_lock include/linux/fs.h:800 [inline]
       shmem_fallocate+0x153/0xd90 mm/shmem.c:2707
       ashmem_shrink_scan.part.0+0x2e9/0x490 drivers/staging/android/ashmem.c:490
       ashmem_shrink_scan+0x6c/0xa0 drivers/staging/android/ashmem.c:473
       do_shrink_slab+0x3c6/0xab0 mm/vmscan.c:518
       shrink_slab+0x16f/0x5c0 mm/vmscan.c:679
       shrink_node_memcgs mm/vmscan.c:2658 [inline]
       shrink_node+0x519/0x1b60 mm/vmscan.c:2770
       shrink_zones mm/vmscan.c:2973 [inline]
       do_try_to_free_pages+0x38b/0x1340 mm/vmscan.c:3026
       try_to_free_pages+0x29a/0x8b0 mm/vmscan.c:3265
       __perform_reclaim mm/page_alloc.c:4223 [inline]
       __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
       __alloc_pages_slowpath.constprop.0+0x949/0x2780 mm/page_alloc.c:4650
       __alloc_pages_nodemask+0x68f/0x930 mm/page_alloc.c:4863
       __alloc_pages include/linux/gfp.h:509 [inline]
       __alloc_pages_node include/linux/gfp.h:522 [inline]
       khugepaged_alloc_page+0xa0/0x170 mm/khugepaged.c:867
       collapse_huge_page mm/khugepaged.c:1056 [inline]
       khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
       khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
       khugepaged_do_scan mm/khugepaged.c:2193 [inline]
       khugepaged+0x3093/0x5a10 mm/khugepaged.c:2238
       kthread+0x3b5/0x4a0 kernel/kthread.c:291
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&sb->s_type->i_mutex_key#15);
                               lock(fs_reclaim);
  lock(&sb->s_type->i_mutex_key#15);

 *** DEADLOCK ***

2 locks held by khugepaged/1158:
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4202 [inline]
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4198 [inline]
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __perform_reclaim mm/page_alloc.c:4227 [inline]
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
 #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x1554/0x2780 mm/page_alloc.c:4650
 #1: ffffffff89c46a90 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0xc7/0x5c0 mm/vmscan.c:669

stack backtrace:
CPU: 1 PID: 1158 Comm: khugepaged Not tainted 5.8.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827
 check_prev_add kernel/locking/lockdep.c:2496 [inline]
 check_prevs_add kernel/locking/lockdep.c:2601 [inline]
 validate_chain kernel/locking/lockdep.c:3218 [inline]
 __lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
 lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
 down_write+0x8d/0x150 kernel/locking/rwsem.c:1531
 inode_lock include/linux/fs.h:800 [inline]
 shmem_fallocate+0x153/0xd90 mm/shmem.c:2707
 ashmem_shrink_scan.part.0+0x2e9/0x490 drivers/staging/android/ashmem.c:490
 ashmem_shrink_scan+0x6c/0xa0 drivers/staging/android/ashmem.c:473
 do_shrink_slab+0x3c6/0xab0 mm/vmscan.c:518
 shrink_slab+0x16f/0x5c0 mm/vmscan.c:679
 shrink_node_memcgs mm/vmscan.c:2658 [inline]
 shrink_node+0x519/0x1b60 mm/vmscan.c:2770
 shrink_zones mm/vmscan.c:2973 [inline]
 do_try_to_free_pages+0x38b/0x1340 mm/vmscan.c:3026
 try_to_free_pages+0x29a/0x8b0 mm/vmscan.c:3265
 __perform_reclaim mm/page_alloc.c:4223 [inline]
 __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
 __alloc_pages_slowpath.constprop.0+0x949/0x2780 mm/page_alloc.c:4650
 __alloc_pages_nodemask+0x68f/0x930 mm/page_alloc.c:4863
 __alloc_pages include/linux/gfp.h:509 [inline]
 __alloc_pages_node include/linux/gfp.h:522 [inline]
 khugepaged_alloc_page+0xa0/0x170 mm/khugepaged.c:867
 collapse_huge_page mm/khugepaged.c:1056 [inline]
 khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
 khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
 khugepaged_do_scan mm/khugepaged.c:2193 [inline]
 khugepaged+0x3093/0x5a10 mm/khugepaged.c:2238
 kthread+0x3b5/0x4a0 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
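
Stripped of the stack traces, the report above is a plain ABBA lock-order
inversion: the write path takes the shmem inode lock and then enters
fs-reclaim (#1), while khugepaged enters fs-reclaim and then, via the ashmem
shrinker calling shmem_fallocate(), takes the inode lock (#0). The two-lock
cycle check lockdep performs can be sketched in a few lines of userspace C
(the names below are hypothetical stand-ins, not kernel code):

```c
#include <assert.h>

/* Toy model of the two lock classes in the report above. We only track
 * the ordering facts lockdep records: observed[a][b] means "b was taken
 * while a was held". A two-lock cycle exists once both orders are seen. */
enum lock_id { FS_RECLAIM, INODE_LOCK, NLOCKS };

static int observed[NLOCKS][NLOCKS];

static void record(enum lock_id held, enum lock_id wanted)
{
	observed[held][wanted] = 1;
}

static int circular(void)
{
	return observed[FS_RECLAIM][INODE_LOCK] &&
	       observed[INODE_LOCK][FS_RECLAIM];
}
```

The #1 chain (shmem_write_begin allocating under inode_lock) records
INODE_LOCK -> FS_RECLAIM; the #0 chain (reclaim reaching shmem_fallocate
through the shrinker) records FS_RECLAIM -> INODE_LOCK, completing the cycle.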




* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14  0:32 ` syzbot
@ 2020-07-14  3:32   ` Hillf Danton
  2020-07-14  3:41     ` Eric Biggers
  0 siblings, 1 reply; 17+ messages in thread
From: Hillf Danton @ 2020-07-14  3:32 UTC (permalink / raw)
  To: syzbot
  Cc: akpm, arve, christian, devel, ebiggers, gregkh, hughd, joel,
	linux-kernel, linux-mm, maco, syzkaller-bugs, tkjos,
	Markus Elfring, Hillf Danton


On Mon, 13 Jul 2020 17:32:19 -0700, syzbot wrote:
> syzbot has found a reproducer for the following crash on:
> 
> HEAD commit:    11ba4688 Linux 5.8-rc5
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13f1bf47100000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a160d1053fc89af5
> dashboard link: https://syzkaller.appspot.com/bug?extid=7a0d9d0b26efefe61780
> compiler:       gcc (GCC) 10.1.0-syz 20200507
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1181004f100000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+7a0d9d0b26efefe61780@syzkaller.appspotmail.com
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.8.0-rc5-syzkaller #0 Not tainted
> ------------------------------------------------------
> khugepaged/1157 is trying to acquire lock:
> ffff88809272e910 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:800 [inline]
> ffff88809272e910 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}, at: shmem_fallocate+0x153/0xd90 mm/shmem.c:2707
> 
> but task is already holding lock:
> ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4202 [inline]
> ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4198 [inline]
> ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __perform_reclaim mm/page_alloc.c:4227 [inline]
> ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
> ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x1554/0x2780 mm/page_alloc.c:4650
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (fs_reclaim){+.+.}-{0:0}:
>        __fs_reclaim_acquire mm/page_alloc.c:4183 [inline]
>        fs_reclaim_acquire mm/page_alloc.c:4194 [inline]
>        prepare_alloc_pages mm/page_alloc.c:4780 [inline]
>        __alloc_pages_nodemask+0x3d1/0x930 mm/page_alloc.c:4832
>        alloc_pages_vma+0xdd/0x720 mm/mempolicy.c:2255
>        shmem_alloc_page+0x11f/0x1f0 mm/shmem.c:1502
>        shmem_alloc_and_acct_page+0x161/0x8a0 mm/shmem.c:1527
>        shmem_getpage_gfp+0x511/0x2450 mm/shmem.c:1823
>        shmem_getpage mm/shmem.c:153 [inline]
>        shmem_write_begin+0xf9/0x1d0 mm/shmem.c:2459
>        generic_perform_write+0x20a/0x4f0 mm/filemap.c:3318
>        __generic_file_write_iter+0x24b/0x610 mm/filemap.c:3447
>        generic_file_write_iter+0x3a6/0x5c0 mm/filemap.c:3479
>        call_write_iter include/linux/fs.h:1908 [inline]
>        new_sync_write+0x422/0x650 fs/read_write.c:503
>        vfs_write+0x59d/0x6b0 fs/read_write.c:578
>        ksys_write+0x12d/0x250 fs/read_write.c:631
>        do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
>        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> -> #0 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
>        check_prev_add kernel/locking/lockdep.c:2496 [inline]
>        check_prevs_add kernel/locking/lockdep.c:2601 [inline]
>        validate_chain kernel/locking/lockdep.c:3218 [inline]
>        __lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
>        lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
>        down_write+0x8d/0x150 kernel/locking/rwsem.c:1531
>        inode_lock include/linux/fs.h:800 [inline]
>        shmem_fallocate+0x153/0xd90 mm/shmem.c:2707
>        ashmem_shrink_scan.part.0+0x2e9/0x490 drivers/staging/android/ashmem.c:490
>        ashmem_shrink_scan+0x6c/0xa0 drivers/staging/android/ashmem.c:473
>        do_shrink_slab+0x3c6/0xab0 mm/vmscan.c:518
>        shrink_slab+0x16f/0x5c0 mm/vmscan.c:679
>        shrink_node_memcgs mm/vmscan.c:2658 [inline]
>        shrink_node+0x519/0x1b60 mm/vmscan.c:2770
>        shrink_zones mm/vmscan.c:2973 [inline]
>        do_try_to_free_pages+0x38b/0x1340 mm/vmscan.c:3026
>        try_to_free_pages+0x29a/0x8b0 mm/vmscan.c:3265
>        __perform_reclaim mm/page_alloc.c:4223 [inline]
>        __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
>        __alloc_pages_slowpath.constprop.0+0x949/0x2780 mm/page_alloc.c:4650
>        __alloc_pages_nodemask+0x68f/0x930 mm/page_alloc.c:4863
>        __alloc_pages include/linux/gfp.h:509 [inline]
>        __alloc_pages_node include/linux/gfp.h:522 [inline]
>        khugepaged_alloc_page+0xa0/0x170 mm/khugepaged.c:867
>        collapse_huge_page mm/khugepaged.c:1056 [inline]
>        khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
>        khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
>        khugepaged_do_scan mm/khugepaged.c:2193 [inline]
>        khugepaged+0x3093/0x5a10 mm/khugepaged.c:2238
>        kthread+0x3b5/0x4a0 kernel/kthread.c:291
>        ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(fs_reclaim);
>                                lock(&sb->s_type->i_mutex_key#15);
>                                lock(fs_reclaim);
>   lock(&sb->s_type->i_mutex_key#15);
> 
>  *** DEADLOCK ***
> 
> 2 locks held by khugepaged/1157:
>  #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4202 [inline]
>  #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4198 [inline]
>  #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __perform_reclaim mm/page_alloc.c:4227 [inline]
>  #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
>  #0: ffffffff89c6c260 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x1554/0x2780 mm/page_alloc.c:4650
>  #1: ffffffff89c46a90 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0xc7/0x5c0 mm/vmscan.c:669
> 
> stack backtrace:
> CPU: 0 PID: 1157 Comm: khugepaged Not tainted 5.8.0-rc5-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x18f/0x20d lib/dump_stack.c:118
>  check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827
>  check_prev_add kernel/locking/lockdep.c:2496 [inline]
>  check_prevs_add kernel/locking/lockdep.c:2601 [inline]
>  validate_chain kernel/locking/lockdep.c:3218 [inline]
>  __lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
>  lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
>  down_write+0x8d/0x150 kernel/locking/rwsem.c:1531
>  inode_lock include/linux/fs.h:800 [inline]
>  shmem_fallocate+0x153/0xd90 mm/shmem.c:2707
>  ashmem_shrink_scan.part.0+0x2e9/0x490 drivers/staging/android/ashmem.c:490
>  ashmem_shrink_scan+0x6c/0xa0 drivers/staging/android/ashmem.c:473
>  do_shrink_slab+0x3c6/0xab0 mm/vmscan.c:518
>  shrink_slab+0x16f/0x5c0 mm/vmscan.c:679
>  shrink_node_memcgs mm/vmscan.c:2658 [inline]
>  shrink_node+0x519/0x1b60 mm/vmscan.c:2770
>  shrink_zones mm/vmscan.c:2973 [inline]
>  do_try_to_free_pages+0x38b/0x1340 mm/vmscan.c:3026
>  try_to_free_pages+0x29a/0x8b0 mm/vmscan.c:3265
>  __perform_reclaim mm/page_alloc.c:4223 [inline]
>  __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
>  __alloc_pages_slowpath.constprop.0+0x949/0x2780 mm/page_alloc.c:4650
>  __alloc_pages_nodemask+0x68f/0x930 mm/page_alloc.c:4863
>  __alloc_pages include/linux/gfp.h:509 [inline]
>  __alloc_pages_node include/linux/gfp.h:522 [inline]
>  khugepaged_alloc_page+0xa0/0x170 mm/khugepaged.c:867
>  collapse_huge_page mm/khugepaged.c:1056 [inline]
>  khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
>  khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
>  khugepaged_do_scan mm/khugepaged.c:2193 [inline]
>  khugepaged+0x3093/0x5a10 mm/khugepaged.c:2238
>  kthread+0x3b5/0x4a0 kernel/kthread.c:291
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
when the new flag is set. The overall upside is that the current gfp
mask is left untouched, in the khugepaged context or otherwise.

--- a/include/uapi/linux/falloc.h
+++ b/include/uapi/linux/falloc.h
@@ -77,4 +77,6 @@
  */
 #define FALLOC_FL_UNSHARE_RANGE		0x40
 
+#define FALLOC_FL_NOBLOCK		0x80
+
 #endif /* _UAPI_FALLOC_H_ */
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2725,10 +2725,15 @@ static long shmem_fallocate(struct file
 	pgoff_t start, index, end;
 	int error;
 
-	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
+	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
+						FALLOC_FL_NOBLOCK))
 		return -EOPNOTSUPP;
 
-	inode_lock(inode);
+	if (mode & FALLOC_FL_NOBLOCK) {
+		if (!inode_trylock(inode))
+			return -EBUSY;
+	} else
+		inode_lock(inode);
 
 	if (mode & FALLOC_FL_PUNCH_HOLE) {
 		struct address_space *mapping = file->f_mapping;
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -459,8 +459,9 @@ ashmem_shrink_scan(struct shrinker *shri
 
 		freed += range_size(range);
 		mutex_unlock(&ashmem_mutex);
-		f->f_op->fallocate(f,
-				   FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+		f->f_op->fallocate(f, FALLOC_FL_PUNCH_HOLE |
+					FALLOC_FL_KEEP_SIZE |
+					FALLOC_FL_NOBLOCK,
 				   start, end - start);
 		fput(f);
 		if (atomic_dec_and_test(&ashmem_shrink_inflight))
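
The effect of the trylock above can be sketched with a userspace model of
the patched shmem_fallocate() prologue. The FALLOC_FL values match the uapi
header; the inode rwsem is modeled as a plain flag, so this illustrates the
control flow only, not the real locking:

```c
#include <assert.h>
#include <errno.h>

#define FALLOC_FL_KEEP_SIZE	0x01
#define FALLOC_FL_PUNCH_HOLE	0x02
#define FALLOC_FL_NOBLOCK	0x80	/* the flag proposed above */

static int inode_locked;		/* toy stand-in for the inode rwsem */

/* Kernel-style trylock: nonzero on success, zero if already held. */
static int inode_trylock_model(void)
{
	if (inode_locked)
		return 0;
	inode_locked = 1;
	return 1;
}

/* Mirrors the locking prologue of the patched shmem_fallocate(). */
static int fallocate_prologue(int mode)
{
	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
						FALLOC_FL_NOBLOCK))
		return -EOPNOTSUPP;

	if (mode & FALLOC_FL_NOBLOCK) {
		if (!inode_trylock_model())
			return -EBUSY;	/* back off instead of sleeping */
	} else {
		inode_locked = 1;	/* a blocking inode_lock() would wait here */
	}
	inode_locked = 0;		/* inode_unlock() at the end of the real function */
	return 0;
}
```

With the inode already held, as in the reported scenario, the NOBLOCK
shrinker path gets -EBUSY instead of blocking inside reclaim.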




* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14  3:32   ` Hillf Danton
@ 2020-07-14  3:41     ` Eric Biggers
  2020-07-14  5:32       ` Hillf Danton
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Biggers @ 2020-07-14  3:41 UTC (permalink / raw)
  To: Hillf Danton
  Cc: syzbot, akpm, arve, christian, devel, gregkh, hughd, joel,
	linux-kernel, linux-mm, maco, syzkaller-bugs, tkjos,
	Markus Elfring

On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> 
> Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> when the new flag is set. The overall upside is that the current gfp
> mask is left untouched, in the khugepaged context or otherwise.
> 
> --- a/include/uapi/linux/falloc.h
> +++ b/include/uapi/linux/falloc.h
> @@ -77,4 +77,6 @@
>   */
>  #define FALLOC_FL_UNSHARE_RANGE		0x40
>  
> +#define FALLOC_FL_NOBLOCK		0x80
> +

You can't add a new UAPI flag to fix a kernel-internal problem like this.

- Eric



* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14  3:41     ` Eric Biggers
@ 2020-07-14  5:32       ` Hillf Danton
  2020-07-14  8:26         ` Michal Hocko
  0 siblings, 1 reply; 17+ messages in thread
From: Hillf Danton @ 2020-07-14  5:32 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Hillf Danton, syzbot, akpm, arve, christian, devel, gregkh,
	hughd, joel, linux-kernel, linux-mm, maco, syzkaller-bugs, tkjos,
	Markus Elfring


On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > 
> > Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> > when the new flag is set. The overall upside is that the current gfp
> > mask is left untouched, in the khugepaged context or otherwise.
> > 
> > --- a/include/uapi/linux/falloc.h
> > +++ b/include/uapi/linux/falloc.h
> > @@ -77,4 +77,6 @@
> >   */
> >  #define FALLOC_FL_UNSHARE_RANGE		0x40
> >  
> > +#define FALLOC_FL_NOBLOCK		0x80
> > +
> 
> You can't add a new UAPI flag to fix a kernel-internal problem like this.

Sounds fair, see below.

What the report indicates is a missing PF_MEMALLOC_NOFS. It is checked
on the ashmem side and set as an exception before entering the
filesystem. On the shmem side, no more than a best effort is made for
the intended exception.

--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -437,6 +437,7 @@ static unsigned long
 ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	unsigned long freed = 0;
+	bool nofs;
 
 	/* We might recurse into filesystem code, so bail out if necessary */
 	if (!(sc->gfp_mask & __GFP_FS))
@@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
 	if (!mutex_trylock(&ashmem_mutex))
 		return -1;
 
+	/* enter filesystem with caution: nonblock on locking */
+	nofs = current->flags & PF_MEMALLOC_NOFS;
+	if (!nofs)
+		current->flags |= PF_MEMALLOC_NOFS;
+
 	while (!list_empty(&ashmem_lru_list)) {
 		struct ashmem_range *range =
 			list_first_entry(&ashmem_lru_list, typeof(*range), lru);
@@ -472,6 +478,8 @@ ashmem_shrink_scan(struct shrinker *shri
 	}
 	mutex_unlock(&ashmem_mutex);
 out:
+	if (!nofs)
+		current->flags &= ~PF_MEMALLOC_NOFS;
 	return freed;
 }
 
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2728,7 +2728,12 @@ static long shmem_fallocate(struct file
 	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
 		return -EOPNOTSUPP;
 
-	inode_lock(inode);
+	if (current->flags & PF_MEMALLOC_NOFS) {
+		/* this exception needs a best effort and no more */
+		if (!inode_trylock(inode))
+			return -EBUSY;
+	} else
+		inode_lock(inode);
 
 	if (mode & FALLOC_FL_PUNCH_HOLE) {
 		struct address_space *mapping = file->f_mapping;
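
For reference, the kernel already provides memalloc_nofs_save() and
memalloc_nofs_restore() in <linux/sched/mm.h> for exactly this
save-and-restore of PF_MEMALLOC_NOFS; open-coding the flag twiddling as
above works, but the helpers also keep nested scopes correct. Their behavior,
modeled on a plain flags word in userspace (the bit value is taken from
v5.8's <linux/sched.h>; treat it as an assumption of this sketch):

```c
#include <assert.h>

#define PF_MEMALLOC_NOFS 0x00040000u	/* value as in include/linux/sched.h (v5.8) */

static unsigned int task_flags;		/* stand-in for current->flags */

/* Like memalloc_nofs_save(): set the bit and return its previous state,
 * so nested NOFS sections restore exactly what their caller had. */
static unsigned int nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Like memalloc_nofs_restore(): put the saved bit state back. */
static void nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}
```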




* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14  5:32       ` Hillf Danton
@ 2020-07-14  8:26         ` Michal Hocko
  2020-07-14 14:08           ` Hillf Danton
  0 siblings, 1 reply; 17+ messages in thread
From: Michal Hocko @ 2020-07-14  8:26 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Eric Biggers, syzbot, akpm, arve, christian, devel, gregkh,
	hughd, joel, linux-kernel, linux-mm, maco, syzkaller-bugs, tkjos,
	Markus Elfring

On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> 
> On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > 
> > > Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> > > when the new flag is set. The overall upside is that the current gfp
> > > mask is left untouched, in the khugepaged context or otherwise.
> > > 
> > > --- a/include/uapi/linux/falloc.h
> > > +++ b/include/uapi/linux/falloc.h
> > > @@ -77,4 +77,6 @@
> > >   */
> > >  #define FALLOC_FL_UNSHARE_RANGE		0x40
> > >  
> > > +#define FALLOC_FL_NOBLOCK		0x80
> > > +
> > 
> > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> 
> Sounds fair, see below.
> 
> What the report indicates is a missing PF_MEMALLOC_NOFS. It is checked
> on the ashmem side and set as an exception before entering the
> filesystem. On the shmem side, no more than a best effort is made for
> the intended exception.
> 
> --- a/drivers/staging/android/ashmem.c
> +++ b/drivers/staging/android/ashmem.c
> @@ -437,6 +437,7 @@ static unsigned long
>  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	unsigned long freed = 0;
> +	bool nofs;
>  
>  	/* We might recurse into filesystem code, so bail out if necessary */
>  	if (!(sc->gfp_mask & __GFP_FS))
> @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
>  	if (!mutex_trylock(&ashmem_mutex))
>  		return -1;
>  
> +	/* enter filesystem with caution: nonblock on locking */
> +	nofs = current->flags & PF_MEMALLOC_NOFS;
> +	if (!nofs)
> +		current->flags |= PF_MEMALLOC_NOFS;
> +
>  	while (!list_empty(&ashmem_lru_list)) {
>  		struct ashmem_range *range =
>  			list_first_entry(&ashmem_lru_list, typeof(*range), lru);

I do not think this is an appropriate fix. First of all is this a real
deadlock or a lockdep false positive? Is it possible that ashmem just
needs to properly annotate its shmem inodes? Or is it possible that
the internal backing shmem file is visible to the userspace so the write
path would be possible?

If this a real problem then the proper fix would be to set internal
shmem mapping's gfp_mask to drop __GFP_FS.
-- 
Michal Hocko
SUSE Labs
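
The mask-based fix suggested above hinges on the shrinker's existing
early-out: ashmem_shrink_scan() already bails when __GFP_FS is absent from
sc->gfp_mask, and clearing the bit in the internal mapping's gfp_mask would
keep allocations for the backing file from entering fs-reclaim in the first
place. The bit logic, sketched with a placeholder value for the
kernel-internal __GFP_FS constant:

```c
#include <assert.h>

#define GFP_FS_BIT 0x40u	/* placeholder; the real __GFP_FS value is kernel-internal */

/* Mirrors the guard at the top of ashmem_shrink_scan(): a shrinker that
 * may recurse into filesystem code must do nothing for !__GFP_FS reclaim. */
static long shrink_scan_model(unsigned int gfp_mask)
{
	if (!(gfp_mask & GFP_FS_BIT))
		return 0;	/* bail out, as the real shrinker does */
	return 1;		/* pretend we punched holes and freed pages */
}

/* The proposal: drop __GFP_FS from the internal shmem mapping's
 * gfp_mask so its page allocations never start fs-reclaim at all. */
static unsigned int mapping_mask_nofs(unsigned int gfp)
{
	return gfp & ~GFP_FS_BIT;
}
```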



* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14  8:26         ` Michal Hocko
@ 2020-07-14 14:08           ` Hillf Danton
  2020-07-14 14:18             ` Michal Hocko
  0 siblings, 1 reply; 17+ messages in thread
From: Hillf Danton @ 2020-07-14 14:08 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, Eric Biggers, syzbot, akpm, arve, christian, devel,
	gregkh, hughd, joel, linux-kernel, linux-mm, maco,
	syzkaller-bugs, tkjos, Markus Elfring


On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> > 
> > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > > 
> > > > Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> > > > when the new flag is set. The overall upside is that the current gfp
> > > > mask is left untouched, in the khugepaged context or otherwise.
> > > > 
> > > > --- a/include/uapi/linux/falloc.h
> > > > +++ b/include/uapi/linux/falloc.h
> > > > @@ -77,4 +77,6 @@
> > > >   */
> > > >  #define FALLOC_FL_UNSHARE_RANGE		0x40
> > > >  
> > > > +#define FALLOC_FL_NOBLOCK		0x80
> > > > +
> > > 
> > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> > 
> > Sounds fair, see below.
> > 
> > What the report indicates is a missing PF_MEMALLOC_NOFS. It is checked
> > on the ashmem side and set as an exception before entering the
> > filesystem. On the shmem side, no more than a best effort is made for
> > the intended exception.
> > 
> > --- a/drivers/staging/android/ashmem.c
> > +++ b/drivers/staging/android/ashmem.c
> > @@ -437,6 +437,7 @@ static unsigned long
> >  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> >  {
> >  	unsigned long freed = 0;
> > +	bool nofs;
> >  
> >  	/* We might recurse into filesystem code, so bail out if necessary */
> >  	if (!(sc->gfp_mask & __GFP_FS))
> > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> >  	if (!mutex_trylock(&ashmem_mutex))
> >  		return -1;
> >  
> > +	/* enter filesystem with caution: nonblock on locking */
> > +	nofs = current->flags & PF_MEMALLOC_NOFS;
> > +	if (!nofs)
> > +		current->flags |= PF_MEMALLOC_NOFS;
> > +
> >  	while (!list_empty(&ashmem_lru_list)) {
> >  		struct ashmem_range *range =
> >  			list_first_entry(&ashmem_lru_list, typeof(*range), lru);
> 
> I do not think this is an appropriate fix. First of all is this a real
> deadlock or a lockdep false positive? Is it possible that ashmem just

The warning matters and we can do something to quiesce it.

> needs to properly annotate its shmem inodes? Or is it possible that
> the internal backing shmem file is visible to userspace, so that the
> write path would be possible?
> 
> If this a real problem then the proper fix would be to set internal
> shmem mapping's gfp_mask to drop __GFP_FS.

Thanks for the tip, see below.

Can you expand a bit on how it helps direct reclaimers like khugepaged
in the syzbot report wrt deadlock? TBH I have a difficult time following
it after staring at the chart below for quite a while.

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&sb->s_type->i_mutex_key#15);
                               lock(fs_reclaim);

  lock(&sb->s_type->i_mutex_key#15);


--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -381,6 +381,7 @@ static int ashmem_mmap(struct file *file
 	if (!asma->file) {
 		char *name = ASHMEM_NAME_DEF;
 		struct file *vmfile;
+		gfp_t gfp;
 
 		if (asma->name[ASHMEM_NAME_PREFIX_LEN] != '\0')
 			name = asma->name;
@@ -392,6 +393,10 @@ static int ashmem_mmap(struct file *file
 			goto out;
 		}
 		vmfile->f_mode |= FMODE_LSEEK;
+		gfp = mapping_gfp_mask(vmfile->f_mapping);
+		if (gfp & __GFP_FS)
+			mapping_set_gfp_mask(vmfile->f_mapping,
+						gfp & ~__GFP_FS);
 		asma->file = vmfile;
 	}
 	get_file(asma->file);




* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14 14:08           ` Hillf Danton
@ 2020-07-14 14:18             ` Michal Hocko
  2020-07-14 15:46               ` Todd Kjos
  0 siblings, 1 reply; 17+ messages in thread
From: Michal Hocko @ 2020-07-14 14:18 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Eric Biggers, syzbot, akpm, arve, christian, devel, gregkh,
	hughd, joel, linux-kernel, linux-mm, maco, syzkaller-bugs, tkjos,
	Markus Elfring

On Tue 14-07-20 22:08:59, Hillf Danton wrote:
> 
> On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> > On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> > > 
> > > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > > > 
> > > > > Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> > > > > when the new flag is set. The overall upside is that the current gfp
> > > > > mask is left untouched, in the khugepaged context or otherwise.
> > > > > 
> > > > > --- a/include/uapi/linux/falloc.h
> > > > > +++ b/include/uapi/linux/falloc.h
> > > > > @@ -77,4 +77,6 @@
> > > > >   */
> > > > >  #define FALLOC_FL_UNSHARE_RANGE		0x40
> > > > >  
> > > > > +#define FALLOC_FL_NOBLOCK		0x80
> > > > > +
> > > > 
> > > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> > > 
> > > Sounds fair, see below.
> > > 
> > > What the report indicates is a missing PF_MEMALLOC_NOFS. It is checked
> > > on the ashmem side and set as an exception before entering the
> > > filesystem. On the shmem side, no more than a best effort is made for
> > > the intended exception.
> > > 
> > > --- a/drivers/staging/android/ashmem.c
> > > +++ b/drivers/staging/android/ashmem.c
> > > @@ -437,6 +437,7 @@ static unsigned long
> > >  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > >  {
> > >  	unsigned long freed = 0;
> > > +	bool nofs;
> > >  
> > >  	/* We might recurse into filesystem code, so bail out if necessary */
> > >  	if (!(sc->gfp_mask & __GFP_FS))
> > > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> > >  	if (!mutex_trylock(&ashmem_mutex))
> > >  		return -1;
> > >  
> > > +	/* enter filesystem with caution: nonblock on locking */
> > > +	nofs = current->flags & PF_MEMALLOC_NOFS;
> > > +	if (!nofs)
> > > +		current->flags |= PF_MEMALLOC_NOFS;
> > > +
> > >  	while (!list_empty(&ashmem_lru_list)) {
> > >  		struct ashmem_range *range =
> > >  			list_first_entry(&ashmem_lru_list, typeof(*range), lru);
> > 
> > I do not think this is an appropriate fix. First of all is this a real
> > deadlock or a lockdep false positive? Is it possible that ashmem just
> 
> The warning matters and we can do something to quiesce it.

The underlying issue should be fixed rather than _something_ done to
silence it.
 
> > needs to properly annotate its shmem inodes? Or is it possible that
> > the internal backing shmem file is visible to userspace, so that the
> > write path would be possible?
> > 
> > If this a real problem then the proper fix would be to set internal
> > shmem mapping's gfp_mask to drop __GFP_FS.
> 
> Thanks for the tip, see below.
> 
> Can you expand a bit on how it helps direct reclaimers like khugepaged
> in the syzbot report wrt deadlock?

I do not understand your question.

> TBH I have a difficult time following it after staring at the chart
> below for quite a while.

Yes, lockdep reports are quite hard to follow and they tend to confuse
one hell out of me. But this one says that there is a reclaim dependency
between the shmem inode lock and the reclaim context.

> Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(fs_reclaim);
>                                lock(&sb->s_type->i_mutex_key#15);
>                                lock(fs_reclaim);
> 
>   lock(&sb->s_type->i_mutex_key#15);

Please refrain from proposing fixes until the actual problem is
understood. I suspect that this might be just a false positive, because
lockdep cannot tell the backing shmem file, which is internal to
ashmem(?), apart from any general shmem file. But somebody really
familiar with the ashmem code should have a look, I believe.

-- 
Michal Hocko
SUSE Labs



* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14 14:18             ` Michal Hocko
@ 2020-07-14 15:46               ` Todd Kjos
  2020-07-14 16:41                 ` Suren Baghdasaryan
  0 siblings, 1 reply; 17+ messages in thread
From: Todd Kjos @ 2020-07-14 15:46 UTC (permalink / raw)
  To: Michal Hocko, Suren Baghdasaryan, Hridya Valsaraju
  Cc: Hillf Danton, Eric Biggers, syzbot, akpm,
	Arve Hjønnevåg, Christian Brauner,
	open list:ANDROID DRIVERS, Greg Kroah-Hartman, Hugh Dickins,
	Joel Fernandes (Google),
	LKML, Linux-MM, Martijn Coenen, syzkaller-bugs, Todd Kjos,
	Markus Elfring

+Suren Baghdasaryan +Hridya Valsaraju who support the ashmem driver.


On Tue, Jul 14, 2020 at 7:18 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 14-07-20 22:08:59, Hillf Danton wrote:
> >
> > On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> > > On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> > > >
> > > > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > > > >
> > > > > > Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> > > > > > when the new flag is set. The overall upside is to keep the current
> > > > > > gfp intact, whether in the khugepaged context or not.
> > > > > >
> > > > > > --- a/include/uapi/linux/falloc.h
> > > > > > +++ b/include/uapi/linux/falloc.h
> > > > > > @@ -77,4 +77,6 @@
> > > > > >   */
> > > > > >  #define FALLOC_FL_UNSHARE_RANGE              0x40
> > > > > >
> > > > > > +#define FALLOC_FL_NOBLOCK            0x80
> > > > > > +
> > > > >
> > > > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> > > >
> > > > Sounds fair, see below.
> > > >
> > > > What the report indicates is a missing PF_MEMALLOC_NOFS; it is
> > > > checked on the ashmem side and added as an exception before going
> > > > into the filesystem. On the shmem side, no more than a best effort
> > > > is paid on the intended exception.
> > > >
> > > > --- a/drivers/staging/android/ashmem.c
> > > > +++ b/drivers/staging/android/ashmem.c
> > > > @@ -437,6 +437,7 @@ static unsigned long
> > > >  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > >  {
> > > >   unsigned long freed = 0;
> > > > + bool nofs;
> > > >
> > > >   /* We might recurse into filesystem code, so bail out if necessary */
> > > >   if (!(sc->gfp_mask & __GFP_FS))
> > > > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> > > >   if (!mutex_trylock(&ashmem_mutex))
> > > >           return -1;
> > > >
> > > > + /* enter filesystem with caution: nonblock on locking */
> > > > + nofs = current->flags & PF_MEMALLOC_NOFS;
> > > > + if (!nofs)
> > > > +         current->flags |= PF_MEMALLOC_NOFS;
> > > > +
> > > >   while (!list_empty(&ashmem_lru_list)) {
> > > >           struct ashmem_range *range =
> > > >                   list_first_entry(&ashmem_lru_list, typeof(*range), lru);
> > >
> > > I do not think this is an appropriate fix. First of all, is this a real
> > > deadlock or a lockdep false positive? Is it possible that ashmem just
> >
> > The warning matters and we can do something to quiesce it.
>
> The underlying issue should be fixed rather than _something_ done to
> silence it.
>
> > > needs to properly annotate its shmem inodes? Or is it possible that
> > > the internal backing shmem file is visible to the userspace so the write
> > > path would be possible?
> > >
> > > If this is a real problem then the proper fix would be to set internal
> > > shmem mapping's gfp_mask to drop __GFP_FS.
> >
> > Thanks for the tip, see below.
> >
> > Can you expand a bit on how it helps direct reclaimers like khugepaged
> > in the syzbot report wrt deadlock?
>
> I do not understand your question.
>
> > TBH I have a difficult time following
> > up after staring at the chart below for quite a while.
>
> Yes, lockdep reports are quite hard to follow and they tend to confuse
> one hell out of me. But this one says that there is a reclaim dependency
> between the shmem inode lock and the reclaim context.
>
> > Possible unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(fs_reclaim);
> >                                lock(&sb->s_type->i_mutex_key#15);
> >                                lock(fs_reclaim);
> >
> >   lock(&sb->s_type->i_mutex_key#15);
>
> Please refrain from proposing fixes until the actual problem is
> understood. I suspect that this might be just a false positive because
> lockdep cannot tell the backing shmem, which is internal to ashmem(?),
> apart from any general shmem. But somebody really familiar with the
> ashmem code should have a look, I believe.
>
> --
> Michal Hocko
> SUSE Labs


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14 15:46               ` Todd Kjos
@ 2020-07-14 16:41                 ` Suren Baghdasaryan
  2020-07-14 17:32                   ` Suren Baghdasaryan
  0 siblings, 1 reply; 17+ messages in thread
From: Suren Baghdasaryan @ 2020-07-14 16:41 UTC (permalink / raw)
  To: Todd Kjos
  Cc: Michal Hocko, Hridya Valsaraju, Hillf Danton, Eric Biggers,
	syzbot, Andrew Morton, Arve Hjønnevåg,
	Christian Brauner, open list:ANDROID DRIVERS, Greg Kroah-Hartman,
	Hugh Dickins, Joel Fernandes (Google),
	LKML, Linux-MM, Martijn Coenen, syzkaller-bugs, Todd Kjos,
	Markus Elfring

On Tue, Jul 14, 2020 at 8:47 AM Todd Kjos <tkjos@google.com> wrote:
>
> +Suren Baghdasaryan +Hridya Valsaraju who support the ashmem driver.

Thanks for looping me in.

>
>
> On Tue, Jul 14, 2020 at 7:18 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Tue 14-07-20 22:08:59, Hillf Danton wrote:
> > >
> > > On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> > > > On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> > > > >
> > > > > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > > > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > > > > >
> > > > > > > Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> > > > > > > when the new flag is set. The overall upside is to keep the current
> > > > > > > gfp intact, whether in the khugepaged context or not.
> > > > > > >
> > > > > > > --- a/include/uapi/linux/falloc.h
> > > > > > > +++ b/include/uapi/linux/falloc.h
> > > > > > > @@ -77,4 +77,6 @@
> > > > > > >   */
> > > > > > >  #define FALLOC_FL_UNSHARE_RANGE              0x40
> > > > > > >
> > > > > > > +#define FALLOC_FL_NOBLOCK            0x80
> > > > > > > +
> > > > > >
> > > > > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> > > > >
> > > > > Sounds fair, see below.
> > > > >
> > > > > What the report indicates is a missing PF_MEMALLOC_NOFS; it is
> > > > > checked on the ashmem side and added as an exception before going
> > > > > into the filesystem. On the shmem side, no more than a best effort
> > > > > is paid on the intended exception.
> > > > >
> > > > > --- a/drivers/staging/android/ashmem.c
> > > > > +++ b/drivers/staging/android/ashmem.c
> > > > > @@ -437,6 +437,7 @@ static unsigned long
> > > > >  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > > >  {
> > > > >   unsigned long freed = 0;
> > > > > + bool nofs;
> > > > >
> > > > >   /* We might recurse into filesystem code, so bail out if necessary */
> > > > >   if (!(sc->gfp_mask & __GFP_FS))
> > > > > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> > > > >   if (!mutex_trylock(&ashmem_mutex))
> > > > >           return -1;
> > > > >
> > > > > + /* enter filesystem with caution: nonblock on locking */
> > > > > + nofs = current->flags & PF_MEMALLOC_NOFS;
> > > > > + if (!nofs)
> > > > > +         current->flags |= PF_MEMALLOC_NOFS;
> > > > > +
> > > > >   while (!list_empty(&ashmem_lru_list)) {
> > > > >           struct ashmem_range *range =
> > > > >                   list_first_entry(&ashmem_lru_list, typeof(*range), lru);
> > > >
> > > > I do not think this is an appropriate fix. First of all, is this a real
> > > > deadlock or a lockdep false positive? Is it possible that ashmem just
> > >
> > > The warning matters and we can do something to quiesce it.
> >
> > The underlying issue should be fixed rather than _something_ done to
> > silence it.
> >
> > > > needs to properly annotate its shmem inodes? Or is it possible that
> > > > the internal backing shmem file is visible to the userspace so the write
> > > > path would be possible?
> > > >
> > > > If this is a real problem then the proper fix would be to set internal
> > > > shmem mapping's gfp_mask to drop __GFP_FS.
> > >
> > > Thanks for the tip, see below.
> > >
> > > Can you expand a bit on how it helps direct reclaimers like khugepaged
> > > in the syzbot report wrt deadlock?
> >
> > I do not understand your question.
> >
> > > TBH I have a difficult time following
> > > up after staring at the chart below for quite a while.
> >
> > Yes, lockdep reports are quite hard to follow and they tend to confuse
> > one hell out of me. But this one says that there is a reclaim dependency
> > between the shmem inode lock and the reclaim context.
> >
> > > Possible unsafe locking scenario:
> > >
> > >        CPU0                    CPU1
> > >        ----                    ----
> > >   lock(fs_reclaim);
> > >                                lock(&sb->s_type->i_mutex_key#15);
> > >                                lock(fs_reclaim);
> > >
> > >   lock(&sb->s_type->i_mutex_key#15);
> >
> > Please refrain from proposing fixes until the actual problem is
> > understood. I suspect that this might be just a false positive because
> > lockdep cannot tell the backing shmem, which is internal to ashmem(?),
> > apart from any general shmem. But somebody really familiar with the
> > ashmem code should have a look, I believe.

I believe the deadlock is possible if a write to an ashmem fd coincides
with shrinking of ashmem caches. I just developed a possible fix here
https://android-review.googlesource.com/c/kernel/common/+/1361205 but
wanted to test it before posting upstream. The idea is to detect such
a race between write and cache shrinking operations and let
ashmem_shrink_scan bail out if the race is detected instead of taking
inode_lock. AFAIK writing ashmem files is not a usual usage for ashmem
(standard usage is to mmap it and use as shared memory), therefore
this bailing out early should not affect ashmem cache maintenance
much. Besides ashmem_shrink_scan already bails out early if a
contention on ashmem_mutex is detected, which is a much more probable
case (see: https://elixir.bootlin.com/linux/v5.8-rc4/source/drivers/staging/android/ashmem.c#L497).

I'll test and post the patch here in a day or so if there are no early
objections to it.
Thanks!

> >
> > --
> > Michal Hocko
> > SUSE Labs


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14 16:41                 ` Suren Baghdasaryan
@ 2020-07-14 17:32                   ` Suren Baghdasaryan
  2020-07-15  3:52                     ` Hillf Danton
  2020-07-15  6:36                     ` Michal Hocko
  0 siblings, 2 replies; 17+ messages in thread
From: Suren Baghdasaryan @ 2020-07-14 17:32 UTC (permalink / raw)
  To: Todd Kjos
  Cc: Michal Hocko, Hridya Valsaraju, Hillf Danton, Eric Biggers,
	syzbot, Andrew Morton, Arve Hjønnevåg,
	Christian Brauner, open list:ANDROID DRIVERS, Greg Kroah-Hartman,
	Hugh Dickins, Joel Fernandes (Google),
	LKML, Linux-MM, Martijn Coenen, syzkaller-bugs, Todd Kjos,
	Markus Elfring

On Tue, Jul 14, 2020 at 9:41 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Tue, Jul 14, 2020 at 8:47 AM Todd Kjos <tkjos@google.com> wrote:
> >
> > +Suren Baghdasaryan +Hridya Valsaraju who support the ashmem driver.
>
> Thanks for looping me in.
>
> >
> >
> > On Tue, Jul 14, 2020 at 7:18 AM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > On Tue 14-07-20 22:08:59, Hillf Danton wrote:
> > > >
> > > > On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> > > > > On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> > > > > >
> > > > > > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > > > > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > > > > > >
> > > > > > > > Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> > > > > > > > when the new flag is set. The overall upside is to keep the current
> > > > > > > > gfp intact, whether in the khugepaged context or not.
> > > > > > > >
> > > > > > > > --- a/include/uapi/linux/falloc.h
> > > > > > > > +++ b/include/uapi/linux/falloc.h
> > > > > > > > @@ -77,4 +77,6 @@
> > > > > > > >   */
> > > > > > > >  #define FALLOC_FL_UNSHARE_RANGE              0x40
> > > > > > > >
> > > > > > > > +#define FALLOC_FL_NOBLOCK            0x80
> > > > > > > > +
> > > > > > >
> > > > > > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> > > > > >
> > > > > > Sounds fair, see below.
> > > > > >
> > > > > > What the report indicates is a missing PF_MEMALLOC_NOFS; it is
> > > > > > checked on the ashmem side and added as an exception before going
> > > > > > into the filesystem. On the shmem side, no more than a best effort
> > > > > > is paid on the intended exception.
> > > > > >
> > > > > > --- a/drivers/staging/android/ashmem.c
> > > > > > +++ b/drivers/staging/android/ashmem.c
> > > > > > @@ -437,6 +437,7 @@ static unsigned long
> > > > > >  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > > > >  {
> > > > > >   unsigned long freed = 0;
> > > > > > + bool nofs;
> > > > > >
> > > > > >   /* We might recurse into filesystem code, so bail out if necessary */
> > > > > >   if (!(sc->gfp_mask & __GFP_FS))
> > > > > > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> > > > > >   if (!mutex_trylock(&ashmem_mutex))
> > > > > >           return -1;
> > > > > >
> > > > > > + /* enter filesystem with caution: nonblock on locking */
> > > > > > + nofs = current->flags & PF_MEMALLOC_NOFS;
> > > > > > + if (!nofs)
> > > > > > +         current->flags |= PF_MEMALLOC_NOFS;
> > > > > > +
> > > > > >   while (!list_empty(&ashmem_lru_list)) {
> > > > > >           struct ashmem_range *range =
> > > > > >                   list_first_entry(&ashmem_lru_list, typeof(*range), lru);
> > > > >
> > > > > I do not think this is an appropriate fix. First of all, is this a real
> > > > > deadlock or a lockdep false positive? Is it possible that ashmem just
> > > >
> > > > The warning matters and we can do something to quiesce it.
> > >
> > > The underlying issue should be fixed rather than _something_ done to
> > > silence it.
> > >
> > > > > needs to properly annotate its shmem inodes? Or is it possible that
> > > > > the internal backing shmem file is visible to the userspace so the write
> > > > > path would be possible?
> > > > >
> > > > > If this is a real problem then the proper fix would be to set internal
> > > > > shmem mapping's gfp_mask to drop __GFP_FS.
> > > >
> > > > Thanks for the tip, see below.
> > > >
> > > > Can you expand a bit on how it helps direct reclaimers like khugepaged
> > > > in the syzbot report wrt deadlock?
> > >
> > > I do not understand your question.
> > >
> > > > TBH I have a difficult time following
> > > > up after staring at the chart below for quite a while.
> > >
> > > Yes, lockdep reports are quite hard to follow and they tend to confuse
> > > one hell out of me. But this one says that there is a reclaim dependency
> > > between the shmem inode lock and the reclaim context.
> > >
> > > > Possible unsafe locking scenario:
> > > >
> > > >        CPU0                    CPU1
> > > >        ----                    ----
> > > >   lock(fs_reclaim);
> > > >                                lock(&sb->s_type->i_mutex_key#15);
> > > >                                lock(fs_reclaim);
> > > >
> > > >   lock(&sb->s_type->i_mutex_key#15);
> > >
> > > Please refrain from proposing fixes until the actual problem is
> > > understood. I suspect that this might be just a false positive because
> > > lockdep cannot tell the backing shmem, which is internal to ashmem(?),
> > > apart from any general shmem.

Actually looking some more into this, I think you are right. Ashmem
currently does not redirect writes into the backing shmem, and the
fallocate call from ashmem_shrink_scan is always performed against
asma->file, which is the backing shmem. IOW writes into the backing
shmem are not supported, therefore this concurrent locking can't
happen.

I'm not sure how we can annotate the fact that the inode_lock taken in
generic_file_write_iter and the one taken in shmem_fallocate always
operate on different inodes. Ideas?

> > >  But somebody really familiar with ashmem code
> > > should have a look I believe.
>
> I believe the deadlock is possible if a write to an ashmem fd coincides
> with shrinking of ashmem caches. I just developed a possible fix here
> https://android-review.googlesource.com/c/kernel/common/+/1361205 but
> wanted to test it before posting upstream. The idea is to detect such
> a race between write and cache shrinking operations and let
> ashmem_shrink_scan bail out if the race is detected instead of taking
> inode_lock. AFAIK writing ashmem files is not a usual usage for ashmem
> (standard usage is to mmap it and use as shared memory), therefore
> this bailing out early should not affect ashmem cache maintenance
> much. Besides ashmem_shrink_scan already bails out early if a
> contention on ashmem_mutex is detected, which is a much more probable
> case (see: https://elixir.bootlin.com/linux/v5.8-rc4/source/drivers/staging/android/ashmem.c#L497).
>
> I'll test and post the patch here in a day or so if there are no early
> objections to it.
> Thanks!
>
> > >
> > > --
> > > Michal Hocko
> > > SUSE Labs


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14 17:32                   ` Suren Baghdasaryan
@ 2020-07-15  3:52                     ` Hillf Danton
  2020-07-15  6:36                     ` Michal Hocko
  1 sibling, 0 replies; 17+ messages in thread
From: Hillf Danton @ 2020-07-15  3:52 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Todd Kjos, Michal Hocko, Hridya Valsaraju, Hillf Danton,
	Eric Biggers, syzbot, Andrew Morton, Arve Hjønnevåg,
	Christian Brauner, open list:ANDROID DRIVERS, Greg Kroah-Hartman,
	Hugh Dickins, Joel Fernandes (Google),
	LKML, Linux-MM, Martijn Coenen, syzkaller-bugs, Todd Kjos,
	Markus Elfring


On Tue, 14 Jul 2020 10:32:20 -0700 Suren Baghdasaryan wrote:
> On Tue, Jul 14, 2020 at 9:41 AM Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > On Tue, Jul 14, 2020 at 8:47 AM Todd Kjos <tkjos@google.com> wrote:
> > >
> > > +Suren Baghdasaryan +Hridya Valsaraju who support the ashmem driver.
> >
> > Thanks for looping me in.
> >
> > > On Tue, Jul 14, 2020 at 7:18 AM Michal Hocko <mhocko@kernel.org> wrote:
> > > >
> > > > On Tue 14-07-20 22:08:59, Hillf Danton wrote:
> > > > >
> > > > > On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> > > > > > On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> > > > > > >
> > > > > > > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > > > > > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > > > > > > >
> > > > > > > > > Add FALLOC_FL_NOBLOCK and, on the shmem side, try to lock the inode
> > > > > > > > > when the new flag is set. The overall upside is to keep the current
> > > > > > > > > gfp intact, whether in the khugepaged context or not.
> > > > > > > > >
> > > > > > > > > --- a/include/uapi/linux/falloc.h
> > > > > > > > > +++ b/include/uapi/linux/falloc.h
> > > > > > > > > @@ -77,4 +77,6 @@
> > > > > > > > >   */
> > > > > > > > >  #define FALLOC_FL_UNSHARE_RANGE              0x40
> > > > > > > > >
> > > > > > > > > +#define FALLOC_FL_NOBLOCK            0x80
> > > > > > > > > +
> > > > > > > >
> > > > > > > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> > > > > > >
> > > > > > > Sounds fair, see below.
> > > > > > >
> > > > > > > What the report indicates is a missing PF_MEMALLOC_NOFS; it is
> > > > > > > checked on the ashmem side and added as an exception before going
> > > > > > > into the filesystem. On the shmem side, no more than a best effort
> > > > > > > is paid on the intended exception.
> > > > > > >
> > > > > > > --- a/drivers/staging/android/ashmem.c
> > > > > > > +++ b/drivers/staging/android/ashmem.c
> > > > > > > @@ -437,6 +437,7 @@ static unsigned long
> > > > > > >  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > > > > >  {
> > > > > > >   unsigned long freed = 0;
> > > > > > > + bool nofs;
> > > > > > >
> > > > > > >   /* We might recurse into filesystem code, so bail out if necessary */
> > > > > > >   if (!(sc->gfp_mask & __GFP_FS))
> > > > > > > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> > > > > > >   if (!mutex_trylock(&ashmem_mutex))
> > > > > > >           return -1;
> > > > > > >
> > > > > > > + /* enter filesystem with caution: nonblock on locking */
> > > > > > > + nofs = current->flags & PF_MEMALLOC_NOFS;
> > > > > > > + if (!nofs)
> > > > > > > +         current->flags |= PF_MEMALLOC_NOFS;
> > > > > > > +
> > > > > > >   while (!list_empty(&ashmem_lru_list)) {
> > > > > > >           struct ashmem_range *range =
> > > > > > >                   list_first_entry(&ashmem_lru_list, typeof(*range), lru);
> > > > > >
> > > > > > I do not think this is an appropriate fix. First of all, is this a real
> > > > > > deadlock or a lockdep false positive? Is it possible that ashmem just
> > > > >
> > > > > The warning matters and we can do something to quiesce it.
> > > >
> > > > The underlying issue should be fixed rather than _something_ done to
> > > > silence it.
> > > >
> > > > > > needs to properly annotate its shmem inodes? Or is it possible that
> > > > > > the internal backing shmem file is visible to the userspace so the write
> > > > > > path would be possible?
> > > > > >
> > > > > > If this is a real problem then the proper fix would be to set internal
> > > > > > shmem mapping's gfp_mask to drop __GFP_FS.
> > > > >
> > > > > Thanks for the tip, see below.
> > > > >
> > > > > Can you expand a bit on how it helps direct reclaimers like khugepaged
> > > > > in the syzbot report wrt deadlock?
> > > >
> > > > I do not understand your question.
> > > >
> > > > > TBH I have a difficult time following
> > > > > up after staring at the chart below for quite a while.
> > > >
> > > > Yes, lockdep reports are quite hard to follow and they tend to confuse
> > > > one hell out of me. But this one says that there is a reclaim dependency
> > > > between the shmem inode lock and the reclaim context.
> > > >
> > > > > Possible unsafe locking scenario:
> > > > >
> > > > >        CPU0                    CPU1
> > > > >        ----                    ----
> > > > >   lock(fs_reclaim);
> > > > >                                lock(&sb->s_type->i_mutex_key#15);
> > > > >                                lock(fs_reclaim);
> > > > >
> > > > >   lock(&sb->s_type->i_mutex_key#15);
> > > >
> > > > Please refrain from proposing fixes until the actual problem is
> > > > understood. I suspect that this might be just a false positive because
> > > > lockdep cannot tell the backing shmem, which is internal to ashmem(?),
> > > > apart from any general shmem.
> 
> Actually looking some more into this, I think you are right. Ashmem
> currently does not redirect writes into the backing shmem, and the
> fallocate call from ashmem_shrink_scan is always performed against
> asma->file, which is the backing shmem. IOW writes into the backing
> shmem are not supported, therefore this concurrent locking can't
> happen.

The print of generic_file_write_iter in the syzbot report backs that
concurrency: one path goes through f_op->fallocate while another goes
through the write path, per
 Reported-by: syzbot+7a0d9d0b26efefe61780@syzkaller.appspotmail.com

> 
> I'm not sure how we can annotate the fact that the inode_lock taken in
> generic_file_write_iter and the one taken in shmem_fallocate always
> operate on different inodes. Ideas?
> 
> > > >  But somebody really familiar with ashmem code
> > > > should have a look I believe.
> >
> > I believe the deadlock is possible if a write to an ashmem fd coincides
> > with shrinking of ashmem caches. I just developed a possible fix here
> > https://android-review.googlesource.com/c/kernel/common/+/1361205 but
> > wanted to test it before posting upstream. The idea is to detect such
> > a race between write and cache shrinking operations and let
> > ashmem_shrink_scan bail out if the race is detected instead of taking
> > inode_lock. AFAIK writing ashmem files is not a usual usage for ashmem
> > (standard usage is to mmap it and use as shared memory), therefore
> > this bailing out early should not affect ashmem cache maintenance
> > much. Besides ashmem_shrink_scan already bails out early if a
> > contention on ashmem_mutex is detected, which is a much more probable
> > case (see: https://elixir.bootlin.com/linux/v5.8-rc4/source/drivers/staging/android/ashmem.c#L497).
> >
> > I'll test and post the patch here in a day or so if there are no early
> > objections to it.
> > Thanks!
> >
> > > >
> > > > --
> > > > Michal Hocko
> > > > SUSE Labs



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: possible deadlock in shmem_fallocate (4)
  2020-07-14 17:32                   ` Suren Baghdasaryan
  2020-07-15  3:52                     ` Hillf Danton
@ 2020-07-15  6:36                     ` Michal Hocko
  2020-07-16  2:49                       ` Suren Baghdasaryan
  1 sibling, 1 reply; 17+ messages in thread
From: Michal Hocko @ 2020-07-15  6:36 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Todd Kjos, Hridya Valsaraju, Hillf Danton, Eric Biggers, syzbot,
	Andrew Morton, Arve Hjønnevåg, Christian Brauner,
	open list:ANDROID DRIVERS, Greg Kroah-Hartman, Hugh Dickins,
	Joel Fernandes (Google),
	LKML, Linux-MM, Martijn Coenen, syzkaller-bugs, Todd Kjos,
	Markus Elfring

On Tue 14-07-20 10:32:20, Suren Baghdasaryan wrote:
[...]
> I'm not sure how we can annotate the fact that the inode_lock taken in
> generic_file_write_iter and the one taken in shmem_fallocate always
> operate on different inodes. Ideas?

I believe that the standard way is to use lockdep_set_class on your
backing inode. But asking lockdep experts would be better than relying
on my vague recollection.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: possible deadlock in shmem_fallocate (4)
  2020-07-15  6:36                     ` Michal Hocko
@ 2020-07-16  2:49                       ` Suren Baghdasaryan
  0 siblings, 0 replies; 17+ messages in thread
From: Suren Baghdasaryan @ 2020-07-16  2:49 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Todd Kjos, Hridya Valsaraju, Hillf Danton, Eric Biggers, syzbot,
	Andrew Morton, Arve Hjønnevåg, Christian Brauner,
	open list:ANDROID DRIVERS, Greg Kroah-Hartman, Hugh Dickins,
	Joel Fernandes (Google),
	LKML, Linux-MM, Martijn Coenen, syzkaller-bugs, Todd Kjos,
	Markus Elfring

On Tue, Jul 14, 2020 at 11:36 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 14-07-20 10:32:20, Suren Baghdasaryan wrote:
> [...]
> > I'm not sure how we can annotate the fact that the inode_lock taken in
> > generic_file_write_iter and the one taken in shmem_fallocate always
> > operate on different inodes. Ideas?
>
> I believe that the standard way is to use lockdep_set_class on your
> backing inode. But asking lockdep experts would be better than relying
> on my vague recollection.

Thanks Michal! I think https://lkml.org/lkml/2020/7/15/1390 should fix
it; however, I was unable to reproduce the lockdep warning to confirm
that it is indeed fixed by the patch.

>
> --
> Michal Hocko
> SUSE Labs


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-07-16  2:50 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-26 21:25 possible deadlock in shmem_fallocate (4) syzbot
2020-03-07 21:43 ` [ashmem] " Eric Biggers
2020-03-08 15:05 ` Hillf Danton
2020-07-14  0:32 ` syzbot
2020-07-14  3:32   ` Hillf Danton
2020-07-14  3:41     ` Eric Biggers
2020-07-14  5:32       ` Hillf Danton
2020-07-14  8:26         ` Michal Hocko
2020-07-14 14:08           ` Hillf Danton
2020-07-14 14:18             ` Michal Hocko
2020-07-14 15:46               ` Todd Kjos
2020-07-14 16:41                 ` Suren Baghdasaryan
2020-07-14 17:32                   ` Suren Baghdasaryan
2020-07-15  3:52                     ` Hillf Danton
2020-07-15  6:36                     ` Michal Hocko
2020-07-16  2:49                       ` Suren Baghdasaryan
2020-07-14  3:07 ` syzbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).