Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
* INFO: task syz-executor can't die for more than 143 seconds. (2)
@ 2019-10-23  7:56 syzbot
  2019-10-24 10:08 ` Tetsuo Handa
  0 siblings, 1 reply; 14+ messages in thread
From: syzbot @ 2019-10-23  7:56 UTC (permalink / raw)
  To: axboe, linux-block, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    c4b9850b Add linux-next specific files for 20191018
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=177b3ab0e00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c940ef12efcd1ec
dashboard link: https://syzkaller.appspot.com/bug?extid=b48daca8639150bc5e73
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1356b8ff600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14f48687600000

Bisection is inconclusive: the first bad commit could be any of:

9211bfbf netfilter: add missing IS_ENABLED(CONFIG_BRIDGE_NETFILTER) checks  
to header-file.
47e640af netfilter: add missing IS_ENABLED(CONFIG_NF_TABLES) check to  
header-file.
a1b2f04e netfilter: add missing includes to a number of header-files.
0abc8bf4 netfilter: add missing IS_ENABLED(CONFIG_NF_CONNTRACK) checks to  
some header-files.
bd96b4c7 netfilter: inline four headers files into another one.
43dd16ef netfilter: nf_tables: store data in offload context registers
78458e3e netfilter: add missing IS_ENABLED(CONFIG_NETFILTER) checks to some  
header-files.
20a9379d netfilter: remove "#ifdef __KERNEL__" guards from some headers.
bd8699e9 netfilter: nft_bitwise: add offload support
2a475c40 kbuild: remove all netfilter headers from header-test blacklist.
7e59b3fe netfilter: remove unnecessary spaces
1b90af29 ipvs: Improve robustness to the ipvs sysctl
5785cf15 netfilter: nf_tables: add missing prototypes.
0a30ba50 netfilter: nf_nat_proto: make tables static
e84fb4b3 netfilter: conntrack: use shared sysctl constants
10533343 netfilter: connlabels: prefer static lock initialiser
8c0bb787 netfilter: synproxy: rename mss synproxy_options field
c162610c Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=143869e8e00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b48daca8639150bc5e73@syzkaller.appspotmail.com

INFO: task syz-executor350:8656 can't die for more than 143 seconds.
syz-executor350 R  running task    27256  8656   8654 0x00004006
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x94a/0x1e70 kernel/sched/core.c:4069
  schedule+0xd9/0x260 kernel/sched/core.c:4136
  io_schedule+0x1c/0x70 kernel/sched/core.c:5780
  rq_qos_wait+0x301/0x3f0 block/blk-rq-qos.c:288
  __wbt_wait block/blk-wbt.c:526 [inline]
  wbt_wait+0x20b/0x370 block/blk-wbt.c:591
  __rq_qos_throttle+0x56/0xa0 block/blk-rq-qos.c:72
  rq_qos_throttle block/blk-rq-qos.h:182 [inline]
  blk_mq_make_request+0x3d0/0x2280 block/blk-mq.c:1943
  generic_make_request block/blk-core.c:1093 [inline]
  generic_make_request+0x23c/0xb50 block/blk-core.c:1035
  submit_bio+0x113/0x600 block/blk-core.c:1219
  blk_next_bio+0x4a/0x60 block/blk-lib.c:19
  __blkdev_issue_zero_pages+0x151/0x430 block/blk-lib.c:284
  blkdev_issue_zeroout+0x434/0x4c0 block/blk-lib.c:378
  blkdev_fallocate+0x2fc/0x410 fs/block_dev.c:2089
  vfs_fallocate+0x4aa/0xa50 fs/open.c:309
  ksys_fallocate+0x58/0xa0 fs/open.c:332
  __do_sys_fallocate fs/open.c:340 [inline]
  __se_sys_fallocate fs/open.c:338 [inline]
  __x64_sys_fallocate+0x97/0xf0 fs/open.c:338
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441269
Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc6d0e55e8 EFLAGS: 00000246 ORIG_RAX: 000000000000011d
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441269
RDX: 0000000000000000 RSI: 0000000000000011 RDI: 0000000000000003
RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 0000400000000200 R11: 0000000000000246 R12: 0000000000401fe0
R13: 0000000000402070 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor350:8661 can't die for more than 144 seconds.
syz-executor350 R  running task    27256  8661   8660 0x00004006
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x94a/0x1e70 kernel/sched/core.c:4069
  schedule+0xd9/0x260 kernel/sched/core.c:4136
  io_schedule+0x1c/0x70 kernel/sched/core.c:5780
  rq_qos_wait+0x301/0x3f0 block/blk-rq-qos.c:288
  __wbt_wait block/blk-wbt.c:526 [inline]
  wbt_wait+0x20b/0x370 block/blk-wbt.c:591
  __rq_qos_throttle+0x56/0xa0 block/blk-rq-qos.c:72
  rq_qos_throttle block/blk-rq-qos.h:182 [inline]
  blk_mq_make_request+0x3d0/0x2280 block/blk-mq.c:1943
  generic_make_request block/blk-core.c:1093 [inline]
  generic_make_request+0x23c/0xb50 block/blk-core.c:1035
  submit_bio+0x113/0x600 block/blk-core.c:1219
  blk_next_bio+0x4a/0x60 block/blk-lib.c:19
  __blkdev_issue_zero_pages+0x151/0x430 block/blk-lib.c:284
  blkdev_issue_zeroout+0x434/0x4c0 block/blk-lib.c:378
  blkdev_fallocate+0x2fc/0x410 fs/block_dev.c:2089
  vfs_fallocate+0x4aa/0xa50 fs/open.c:309
  ksys_fallocate+0x58/0xa0 fs/open.c:332
  __do_sys_fallocate fs/open.c:340 [inline]
  __se_sys_fallocate fs/open.c:338 [inline]
  __x64_sys_fallocate+0x97/0xf0 fs/open.c:338
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441269
Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc6d0e55e8 EFLAGS: 00000246 ORIG_RAX: 000000000000011d
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441269
RDX: 0000000000000000 RSI: 0000000000000011 RDI: 0000000000000003
RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 0000400000000200 R11: 0000000000000246 R12: 0000000000401fe0
R13: 0000000000402070 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor350:8662 can't die for more than 144 seconds.
syz-executor350 R  running task    27256  8662   8658 0x00004006
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x94a/0x1e70 kernel/sched/core.c:4069
INFO: task syz-executor350:8663 can't die for more than 145 seconds.
syz-executor350 D27256  8663   8655 0x00004006
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x94a/0x1e70 kernel/sched/core.c:4069
  schedule+0xd9/0x260 kernel/sched/core.c:4136
  io_schedule+0x1c/0x70 kernel/sched/core.c:5780
  rq_qos_wait+0x301/0x3f0 block/blk-rq-qos.c:288
  __wbt_wait block/blk-wbt.c:526 [inline]
  wbt_wait+0x20b/0x370 block/blk-wbt.c:591
  __rq_qos_throttle+0x56/0xa0 block/blk-rq-qos.c:72
  rq_qos_throttle block/blk-rq-qos.h:182 [inline]
  blk_mq_make_request+0x3d0/0x2280 block/blk-mq.c:1943
  generic_make_request block/blk-core.c:1093 [inline]
  generic_make_request+0x23c/0xb50 block/blk-core.c:1035
  submit_bio+0x113/0x600 block/blk-core.c:1219
  blk_next_bio+0x4a/0x60 block/blk-lib.c:19
  __blkdev_issue_zero_pages+0x151/0x430 block/blk-lib.c:284
  blkdev_issue_zeroout+0x434/0x4c0 block/blk-lib.c:378
  blkdev_fallocate+0x2fc/0x410 fs/block_dev.c:2089
  vfs_fallocate+0x4aa/0xa50 fs/open.c:309
  ksys_fallocate+0x58/0xa0 fs/open.c:332
  __do_sys_fallocate fs/open.c:340 [inline]
  __se_sys_fallocate fs/open.c:338 [inline]
  __x64_sys_fallocate+0x97/0xf0 fs/open.c:338
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441269
Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc6d0e55e8 EFLAGS: 00000246 ORIG_RAX: 000000000000011d
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441269
RDX: 0000000000000000 RSI: 0000000000000011 RDI: 0000000000000003
RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 0000400000000200 R11: 0000000000000246 R12: 0000000000401fe0
R13: 0000000000402070 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor350:8664 can't die for more than 146 seconds.
syz-executor350 R  running task    27336  8664   8659 0x00004006
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x94a/0x1e70 kernel/sched/core.c:4069
  schedule+0xd9/0x260 kernel/sched/core.c:4136
  io_schedule+0xc/0x70 kernel/sched/core.c:5779
  rq_qos_wait+0x301/0x3f0 block/blk-rq-qos.c:288
  __wbt_wait block/blk-wbt.c:526 [inline]
  wbt_wait+0x20b/0x370 block/blk-wbt.c:591
  __rq_qos_throttle+0x56/0xa0 block/blk-rq-qos.c:72
  rq_qos_throttle block/blk-rq-qos.h:182 [inline]
  blk_mq_make_request+0x3d0/0x2280 block/blk-mq.c:1943
  generic_make_request block/blk-core.c:1093 [inline]
  generic_make_request+0x23c/0xb50 block/blk-core.c:1035
  submit_bio+0x113/0x600 block/blk-core.c:1219
  blk_next_bio+0x4a/0x60 block/blk-lib.c:19
  __blkdev_issue_zero_pages+0x151/0x430 block/blk-lib.c:284
  blkdev_issue_zeroout+0x434/0x4c0 block/blk-lib.c:378
  blkdev_fallocate+0x2fc/0x410 fs/block_dev.c:2089
  vfs_fallocate+0x4aa/0xa50 fs/open.c:309
  ksys_fallocate+0x58/0xa0 fs/open.c:332
  __do_sys_fallocate fs/open.c:340 [inline]
  __se_sys_fallocate fs/open.c:338 [inline]
  __x64_sys_fallocate+0x97/0xf0 fs/open.c:338
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441269
Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc6d0e55e8 EFLAGS: 00000246 ORIG_RAX: 000000000000011d
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441269
RDX: 0000000000000000 RSI: 0000000000000011 RDI: 0000000000000003
RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 0000400000000200 R11: 0000000000000246 R12: 0000000000401fe0
R13: 0000000000402070 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor350:8665 can't die for more than 147 seconds.
syz-executor350 D27256  8665   8657 0x00004006
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x94a/0x1e70 kernel/sched/core.c:4069

Showing all locks held in the system:
1 lock held by khungtaskd/1073:
  #0: ffffffff88fab740 (rcu_read_lock){....}, at:  
debug_show_all_locks+0x5f/0x279 kernel/locking/lockdep.c:5336
1 lock held by rsyslogd/8538:
  #0: ffff888098390620 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xee/0x110  
fs/file.c:801
2 locks held by getty/8628:
  #0: ffff8880a1d05090 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f252e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
2 locks held by getty/8629:
  #0: ffff8880a9790090 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f4b2e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
2 locks held by getty/8630:
  #0: ffff8880a47ae090 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f3f2e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
2 locks held by getty/8631:
  #0: ffff88809fc6a090 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f472e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
2 locks held by getty/8632:
  #0: ffff8880946cf090 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f2d2e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
2 locks held by getty/8633:
  #0: ffff888092b6e090 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f432e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
2 locks held by getty/8634:
  #0: ffff88809a971090 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f192e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 1073 Comm: khungtaskd Not tainted 5.4.0-rc3-next-20191018 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
  nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
  check_hung_uninterruptible_tasks kernel/hung_task.c:269 [inline]
  watchdog+0xc8f/0x1350 kernel/hung_task.c:353
  kthread+0x361/0x430 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.4.0-rc3-next-20191018 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50 kernel/kcov.c:95
Code: 89 25 c4 94 3d 09 41 bc f4 ff ff ff e8 1d 8c e9 ff 48 c7 05 ae 94 3d  
09 00 00 00 00 e9 77 e9 ff ff 90 90 90 90 90 90 90 90 90 <55> 48 89 e5 65  
48 8b 04 25 80 fe 01 00 65 8b 15 74 b0 8e 7e 81 e2
RSP: 0018:ffff8880a989f9f0 EFLAGS: 00000092
RAX: 0000000000000286 RBX: ffff88809962f0c0 RCX: ffffffff81915655
RDX: 0000000000000100 RSI: ffff88809962f0c0 RDI: ffff88821acc9540
RBP: ffff8880a989fa28 R08: ffff8880a9886240 R09: ffff8880a9886ad8
R10: fffffbfff138fe50 R11: ffffffff89c7f287 R12: ffff88821acc9540
R13: ffff88809962f0c0 R14: 0000000000000286 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff600400 CR3: 000000008fe26000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  kmem_cache_free+0x5b/0x320 mm/slab.c:3690
  mempool_free_slab+0x1e/0x30 mm/mempool.c:520
  mempool_free+0xeb/0x370 mm/mempool.c:502
  bio_free+0x268/0x420 block/bio.c:255
  bio_put+0xda/0x110 block/bio.c:549
  __bio_chain_endio block/bio.c:307 [inline]
  bio_endio+0x2c2/0xaf0 block/bio.c:1804
  req_bio_endio block/blk-core.c:271 [inline]
  blk_update_request+0x49e/0x10d0 block/blk-core.c:1491
  blk_mq_end_request+0x5b/0x560 block/blk-mq.c:552
  end_cmd drivers/block/null_blk_main.c:648 [inline]
  end_cmd+0x111/0x310 drivers/block/null_blk_main.c:642
  null_complete_rq+0x19/0x20 drivers/block/null_blk_main.c:675
  blk_done_softirq+0x2fe/0x4d0 block/blk-softirq.c:37
  __do_softirq+0x262/0x98c kernel/softirq.c:292
  run_ksoftirqd kernel/softirq.c:603 [inline]
  run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595
  smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165
  kthread+0x361/0x430 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: INFO: task syz-executor can't die for more than 143 seconds. (2)
  2019-10-23  7:56 INFO: task syz-executor can't die for more than 143 seconds. (2) syzbot
@ 2019-10-24 10:08 ` Tetsuo Handa
  2019-10-28  8:51   ` Bob Liu
  0 siblings, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2019-10-24 10:08 UTC (permalink / raw)
  To: axboe; +Cc: syzbot, linux-block, linux-kernel, syzkaller-bugs

On 2019/10/23 16:56, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    c4b9850b Add linux-next specific files for 20191018
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=177b3ab0e00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c940ef12efcd1ec
> dashboard link: https://syzkaller.appspot.com/bug?extid=b48daca8639150bc5e73
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1356b8ff600000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14f48687600000

The reproducer is trying to allocate 64TB of disk space on /dev/nullb0 using fallocate()
but __blkdev_issue_zero_pages() cannot bail out upon SIGKILL (and therefore cannot
terminate for minutes). Can we make it killable? I don't know what action is needed
for undoing this loop...

        while (nr_sects != 0) {
                bio = blk_next_bio(bio, __blkdev_sectors_to_bio_pages(nr_sects),
                                   gfp_mask);
                bio->bi_iter.bi_sector = sector;
                bio_set_dev(bio, bdev);
                bio_set_op_attrs(bio, REQ_OP_WRITE, 0);

                while (nr_sects != 0) {
                        sz = min((sector_t) PAGE_SIZE, nr_sects << 9);
                        bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0);
                        nr_sects -= bi_size >> 9;
                        sector += bi_size >> 9;
                        if (bi_size < sz)
                                break;
                }
                cond_resched();
        }


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: INFO: task syz-executor can't die for more than 143 seconds. (2)
  2019-10-24 10:08 ` Tetsuo Handa
@ 2019-10-28  8:51   ` Bob Liu
  2019-11-08 11:41     ` [PATCH] block: Bail out iteration functions upon SIGKILL Tetsuo Handa
  0 siblings, 1 reply; 14+ messages in thread
From: Bob Liu @ 2019-10-28  8:51 UTC (permalink / raw)
  To: Tetsuo Handa, axboe; +Cc: syzbot, linux-block, linux-kernel, syzkaller-bugs

On 10/24/19 6:08 PM, Tetsuo Handa wrote:
> On 2019/10/23 16:56, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    c4b9850b Add linux-next specific files for 20191018
>> git tree:       linux-next
>> console output: https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_log.txt-3Fx-3D177b3ab0e00000&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1ktT0U2YS_I8Zz2o-MS1YcCAzWZ6hFGtyTgvVMGM7gI&m=wOlNeKk9puri9Fvxn8bGDrlWHd-4GPMeJ9rb2CVqXaE&s=PZfBliKlYjm16VnyPzu-3i0SgqlbByIB0iI8jVhcGuk&e= 
>> kernel config:  https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_.config-3Fx-3Dc940ef12efcd1ec&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1ktT0U2YS_I8Zz2o-MS1YcCAzWZ6hFGtyTgvVMGM7gI&m=wOlNeKk9puri9Fvxn8bGDrlWHd-4GPMeJ9rb2CVqXaE&s=z8tV220wKFTQIJH1tSYLUl8ecAnll94C_mFVcHkuTlc&e= 
>> dashboard link: https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_bug-3Fextid-3Db48daca8639150bc5e73&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1ktT0U2YS_I8Zz2o-MS1YcCAzWZ6hFGtyTgvVMGM7gI&m=wOlNeKk9puri9Fvxn8bGDrlWHd-4GPMeJ9rb2CVqXaE&s=VZ2ZdnAqEd_AkXJujidN3EgwscGpUAdsZjuObKjXN-U&e= 
>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro:      https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_repro.syz-3Fx-3D1356b8ff600000&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1ktT0U2YS_I8Zz2o-MS1YcCAzWZ6hFGtyTgvVMGM7gI&m=wOlNeKk9puri9Fvxn8bGDrlWHd-4GPMeJ9rb2CVqXaE&s=Q_svUYj2OBYmIJXnResNzOWVUCyjRpxnpun2Cu15S9M&e= 
>> C reproducer:   https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_repro.c-3Fx-3D14f48687600000&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1ktT0U2YS_I8Zz2o-MS1YcCAzWZ6hFGtyTgvVMGM7gI&m=wOlNeKk9puri9Fvxn8bGDrlWHd-4GPMeJ9rb2CVqXaE&s=FGZNxR7w-rU29MhJxJtno-c_wUXCJHgPC5V1YNp7h58&e= 
> 
> The reproducer is trying to allocate 64TB of disk space on /dev/nullb0 using fallocate()
> but __blkdev_issue_zero_pages() cannot bail out upon SIGKILL (and therefore cannot
> terminate for minutes). Can we make it killable? 

Yeah, I think we can consider add fatal_signal_pending(current) checking in the while() loop..

> I don't know what action is needed
> for undoing this loop...
> 
>         while (nr_sects != 0) {
>                 bio = blk_next_bio(bio, __blkdev_sectors_to_bio_pages(nr_sects),
>                                    gfp_mask);
>                 bio->bi_iter.bi_sector = sector;
>                 bio_set_dev(bio, bdev);
>                 bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
> 
>                 while (nr_sects != 0) {
>                         sz = min((sector_t) PAGE_SIZE, nr_sects << 9);
>                         bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0);
>                         nr_sects -= bi_size >> 9;
>                         sector += bi_size >> 9;
>                         if (bi_size < sz)
>                                 break;
>                 }
>                 cond_resched();
>         }
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-10-28  8:51   ` Bob Liu
@ 2019-11-08 11:41     ` Tetsuo Handa
  2019-11-08 18:13       ` Chaitanya Kulkarni
  2019-11-12  4:05       ` Damien Le Moal
  0 siblings, 2 replies; 14+ messages in thread
From: Tetsuo Handa @ 2019-11-08 11:41 UTC (permalink / raw)
  To: Bob Liu, axboe; +Cc: linux-block

syzbot found that a thread can stall for minutes inside fallocate()
after that thread was killed by SIGKILL [1]. While trying to allocate
64TB of disk space using fallocate() is legal, delaying termination of
killed thread for minutes is bad. Thus, allow iteration functions in
block/blk-lib.c to be killable.

[1] https://syzkaller.appspot.com/bug?id=9386d051e11e09973d5a4cf79af5e8cedf79386d

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+b48daca8639150bc5e73@syzkaller.appspotmail.com>
---
 block/blk-lib.c | 44 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 5f2c429..6ca7cae 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -7,9 +7,22 @@
 #include <linux/bio.h>
 #include <linux/blkdev.h>
 #include <linux/scatterlist.h>
+#include <linux/sched/signal.h>
 
 #include "blk.h"
 
+static int blk_should_abort(struct bio *bio)
+{
+	int ret;
+
+	cond_resched();
+	if (!fatal_signal_pending(current))
+		return 0;
+	ret = submit_bio_wait(bio);
+	bio_put(bio);
+	return ret ? ret : -EINTR;
+}
+
 struct bio *blk_next_bio(struct bio *bio, unsigned int nr_pages, gfp_t gfp)
 {
 	struct bio *new = bio_alloc(gfp, nr_pages);
@@ -55,6 +68,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		return -EINVAL;
 
 	while (nr_sects) {
+		int ret;
 		sector_t req_sects = min_t(sector_t, nr_sects,
 				bio_allowed_max_sectors(q));
 
@@ -75,7 +89,11 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		 * us to schedule out to avoid softlocking if preempt
 		 * is disabled.
 		 */
-		cond_resched();
+		ret = blk_should_abort(bio);
+		if (ret) {
+			*biop = NULL;
+			return ret;
+		}
 	}
 
 	*biop = bio;
@@ -154,6 +172,8 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 	max_write_same_sectors = bio_allowed_max_sectors(q);
 
 	while (nr_sects) {
+		int ret;
+
 		bio = blk_next_bio(bio, 1, gfp_mask);
 		bio->bi_iter.bi_sector = sector;
 		bio_set_dev(bio, bdev);
@@ -171,7 +191,11 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 			bio->bi_iter.bi_size = nr_sects << 9;
 			nr_sects = 0;
 		}
-		cond_resched();
+		ret = blk_should_abort(bio);
+		if (ret) {
+			*biop = NULL;
+			return ret;
+		}
 	}
 
 	*biop = bio;
@@ -230,6 +254,8 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 		return -EOPNOTSUPP;
 
 	while (nr_sects) {
+		int ret;
+
 		bio = blk_next_bio(bio, 0, gfp_mask);
 		bio->bi_iter.bi_sector = sector;
 		bio_set_dev(bio, bdev);
@@ -245,7 +271,11 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 			bio->bi_iter.bi_size = nr_sects << 9;
 			nr_sects = 0;
 		}
-		cond_resched();
+		ret = blk_should_abort(bio);
+		if (ret) {
+			*biop = NULL;
+			return ret;
+		}
 	}
 
 	*biop = bio;
@@ -281,6 +311,8 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
 		return -EPERM;
 
 	while (nr_sects != 0) {
+		int ret;
+
 		bio = blk_next_bio(bio, __blkdev_sectors_to_bio_pages(nr_sects),
 				   gfp_mask);
 		bio->bi_iter.bi_sector = sector;
@@ -295,7 +327,11 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
 			if (bi_size < sz)
 				break;
 		}
-		cond_resched();
+		ret = blk_should_abort(bio);
+		if (ret) {
+			*biop = NULL;
+			return ret;
+		}
 	}
 
 	*biop = bio;
-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-08 11:41     ` [PATCH] block: Bail out iteration functions upon SIGKILL Tetsuo Handa
@ 2019-11-08 18:13       ` Chaitanya Kulkarni
  2019-11-08 22:18         ` Chaitanya Kulkarni
  2019-11-12  4:05       ` Damien Le Moal
  1 sibling, 1 reply; 14+ messages in thread
From: Chaitanya Kulkarni @ 2019-11-08 18:13 UTC (permalink / raw)
  To: Tetsuo Handa, Bob Liu, axboe; +Cc: linux-block

Thanks for the patch, this looks good to me, let me test this patch
will send a review then.

On 11/08/2019 03:54 AM, Tetsuo Handa wrote:
> syzbot found that a thread can stall for minutes inside fallocate()
> after that thread was killed by SIGKILL [1]. While trying to allocate
> 64TB of disk space using fallocate() is legal, delaying termination of
> killed thread for minutes is bad. Thus, allow iteration functions in
> block/blk-lib.c to be killable.
>
> [1]https://syzkaller.appspot.com/bug?id=9386d051e11e09973d5a4cf79af5e8cedf79386d
>
> Signed-off-by: Tetsuo Handa<penguin-kernel@I-love.SAKURA.ne.jp>
> Reported-by: syzbot<syzbot+b48daca8639150bc5e73@syzkaller.appspotmail.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-08 18:13       ` Chaitanya Kulkarni
@ 2019-11-08 22:18         ` Chaitanya Kulkarni
  0 siblings, 0 replies; 14+ messages in thread
From: Chaitanya Kulkarni @ 2019-11-08 22:18 UTC (permalink / raw)
  To: Tetsuo Handa, Bob Liu, axboe; +Cc: linux-block

Looks good.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
On 11/08/2019 10:13 AM, Chaitanya Kulkarni wrote:
> Thanks for the patch, this looks good to me, let me test this patch
> will send a review then.
>
> On 11/08/2019 03:54 AM, Tetsuo Handa wrote:
>> syzbot found that a thread can stall for minutes inside fallocate()
>> after that thread was killed by SIGKILL [1]. While trying to allocate
>> 64TB of disk space using fallocate() is legal, delaying termination of
>> killed thread for minutes is bad. Thus, allow iteration functions in
>> block/blk-lib.c to be killable.
>>
>> [1]https://syzkaller.appspot.com/bug?id=9386d051e11e09973d5a4cf79af5e8cedf79386d
>>
>> Signed-off-by: Tetsuo Handa<penguin-kernel@I-love.SAKURA.ne.jp>
>> Reported-by: syzbot<syzbot+b48daca8639150bc5e73@syzkaller.appspotmail.com>
>
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-08 11:41     ` [PATCH] block: Bail out iteration functions upon SIGKILL Tetsuo Handa
  2019-11-08 18:13       ` Chaitanya Kulkarni
@ 2019-11-12  4:05       ` Damien Le Moal
  2019-11-12 14:47         ` Tetsuo Handa
  1 sibling, 1 reply; 14+ messages in thread
From: Damien Le Moal @ 2019-11-12  4:05 UTC (permalink / raw)
  To: Tetsuo Handa, Bob Liu, axboe; +Cc: linux-block

On 2019/11/08 20:54, Tetsuo Handa wrote:
> syzbot found that a thread can stall for minutes inside fallocate()
> after that thread was killed by SIGKILL [1]. While trying to allocate
> 64TB of disk space using fallocate() is legal, delaying termination of
> killed thread for minutes is bad. Thus, allow iteration functions in
> block/blk-lib.c to be killable.
> 
> [1] https://syzkaller.appspot.com/bug?id=9386d051e11e09973d5a4cf79af5e8cedf79386d
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Reported-by: syzbot <syzbot+b48daca8639150bc5e73@syzkaller.appspotmail.com>
> ---
>  block/blk-lib.c | 44 ++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 5f2c429..6ca7cae 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -7,9 +7,22 @@
>  #include <linux/bio.h>
>  #include <linux/blkdev.h>
>  #include <linux/scatterlist.h>
> +#include <linux/sched/signal.h>
>  
>  #include "blk.h"
>  
> +static int blk_should_abort(struct bio *bio)
> +{
> +	int ret;
> +
> +	cond_resched();
> +	if (!fatal_signal_pending(current))
> +		return 0;
> +	ret = submit_bio_wait(bio);

This will change the behavior of __blkdev_issue_discard() to a sync IO
execution instead of the current async execution since submit_bio_wait()
call is the responsibility of the caller (e.g. blkdev_issue_discard()).
Have you checked if users of __blkdev_issue_discard() are OK with that ?
f2fs, ext4, xfs, dm and nvme use this function.

Looking at f2fs, this does not look like it is going to work as expected
since the bio setup, including end_io callback, is done after this
function is called and a regular submit_bio() execution is being used.

> +	bio_put(bio);
> +	return ret ? ret : -EINTR;
> +}
> +
>  struct bio *blk_next_bio(struct bio *bio, unsigned int nr_pages, gfp_t gfp)
>  {
>  	struct bio *new = bio_alloc(gfp, nr_pages);
> @@ -55,6 +68,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  		return -EINVAL;
>  
>  	while (nr_sects) {
> +		int ret;

Please add a white line after the declaration similarly to your change
in __blkdev_issue_write_same() and __blkdev_issue_zero_pages().

>  		sector_t req_sects = min_t(sector_t, nr_sects,
>  				bio_allowed_max_sectors(q));
>  
> @@ -75,7 +89,11 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  		 * us to schedule out to avoid softlocking if preempt
>  		 * is disabled.
>  		 */
> -		cond_resched();
> +		ret = blk_should_abort(bio);
> +		if (ret) {
> +			*biop = NULL;
> +			return ret;
> +		}
>  	}
>  
>  	*biop = bio;
> @@ -154,6 +172,8 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
>  	max_write_same_sectors = bio_allowed_max_sectors(q);
>  
>  	while (nr_sects) {
> +		int ret;
> +
>  		bio = blk_next_bio(bio, 1, gfp_mask);
>  		bio->bi_iter.bi_sector = sector;
>  		bio_set_dev(bio, bdev);
> @@ -171,7 +191,11 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
>  			bio->bi_iter.bi_size = nr_sects << 9;
>  			nr_sects = 0;
>  		}
> -		cond_resched();
> +		ret = blk_should_abort(bio);
> +		if (ret) {
> +			*biop = NULL;
> +			return ret;
> +		}
>  	}
>  
>  	*biop = bio;
> @@ -230,6 +254,8 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
>  		return -EOPNOTSUPP;
>  
>  	while (nr_sects) {
> +		int ret;
> +
>  		bio = blk_next_bio(bio, 0, gfp_mask);
>  		bio->bi_iter.bi_sector = sector;
>  		bio_set_dev(bio, bdev);
> @@ -245,7 +271,11 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
>  			bio->bi_iter.bi_size = nr_sects << 9;
>  			nr_sects = 0;
>  		}
> -		cond_resched();
> +		ret = blk_should_abort(bio);
> +		if (ret) {
> +			*biop = NULL;
> +			return ret;
> +		}
>  	}
>  
>  	*biop = bio;
> @@ -281,6 +311,8 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
>  		return -EPERM;
>  
>  	while (nr_sects != 0) {
> +		int ret;
> +
>  		bio = blk_next_bio(bio, __blkdev_sectors_to_bio_pages(nr_sects),
>  				   gfp_mask);
>  		bio->bi_iter.bi_sector = sector;
> @@ -295,7 +327,11 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
>  			if (bi_size < sz)
>  				break;
>  		}
> -		cond_resched();
> +		ret = blk_should_abort(bio);
> +		if (ret) {
> +			*biop = NULL;
> +			return ret;
> +		}
>  	}
>  
>  	*biop = bio;
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-12  4:05       ` Damien Le Moal
@ 2019-11-12 14:47         ` Tetsuo Handa
  2019-11-13  1:54           ` Damien Le Moal
  0 siblings, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2019-11-12 14:47 UTC (permalink / raw)
  To: Damien Le Moal, Bob Liu, axboe; +Cc: linux-block

On 2019/11/12 13:05, Damien Le Moal wrote:
> On 2019/11/08 20:54, Tetsuo Handa wrote:
>> syzbot found that a thread can stall for minutes inside fallocate()
>> after that thread was killed by SIGKILL [1]. While trying to allocate
>> 64TB of disk space using fallocate() is legal, delaying termination of
>> killed thread for minutes is bad. Thus, allow iteration functions in
>> block/blk-lib.c to be killable.
>>
>> [1] https://syzkaller.appspot.com/bug?id=9386d051e11e09973d5a4cf79af5e8cedf79386d
>>
>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> Reported-by: syzbot <syzbot+b48daca8639150bc5e73@syzkaller.appspotmail.com>
>> ---
>>  block/blk-lib.c | 44 ++++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 40 insertions(+), 4 deletions(-)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index 5f2c429..6ca7cae 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -7,9 +7,22 @@
>>  #include <linux/bio.h>
>>  #include <linux/blkdev.h>
>>  #include <linux/scatterlist.h>
>> +#include <linux/sched/signal.h>
>>  
>>  #include "blk.h"
>>  
>> +static int blk_should_abort(struct bio *bio)
>> +{
>> +	int ret;
>> +
>> +	cond_resched();
>> +	if (!fatal_signal_pending(current))
>> +		return 0;
>> +	ret = submit_bio_wait(bio);
> 
> This will change the behavior of __blkdev_issue_discard() to a sync IO
> execution instead of the current async execution since submit_bio_wait()
> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
> Have you checked if users of __blkdev_issue_discard() are OK with that ?
> f2fs, ext4, xfs, dm and nvme use this function.

I'm not sure...

> 
> Looking at f2fs, this does not look like it is going to work as expected
> since the bio setup, including end_io callback, is done after this
> function is called and a regular submit_bio() execution is being used.

Then, just breaking the iteration like below?
nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?

--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -7,6 +7,7 @@
 #include <linux/bio.h>
 #include <linux/blkdev.h>
 #include <linux/scatterlist.h>
+#include <linux/sched/signal.h>
 
 #include "blk.h"
 
@@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 	struct bio *bio = *biop;
 	unsigned int op;
 	sector_t bs_mask;
+	int ret = 0;
 
 	if (!q)
 		return -ENXIO;
@@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		 * is disabled.
 		 */
 		cond_resched();
+		if (fatal_signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
 	}
 
 	*biop = bio;
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(__blkdev_issue_discard);
 
@@ -136,6 +142,7 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 	unsigned int max_write_same_sectors;
 	struct bio *bio = *biop;
 	sector_t bs_mask;
+	int ret = 0;
 
 	if (!q)
 		return -ENXIO;
@@ -172,10 +179,14 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 			nr_sects = 0;
 		}
 		cond_resched();
+		if (fatal_signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
 	}
 
 	*biop = bio;
-	return 0;
+	return ret;
 }
 
 /**
@@ -216,6 +227,7 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 	struct bio *bio = *biop;
 	unsigned int max_write_zeroes_sectors;
 	struct request_queue *q = bdev_get_queue(bdev);
+	int ret = 0;
 
 	if (!q)
 		return -ENXIO;
@@ -246,10 +258,14 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 			nr_sects = 0;
 		}
 		cond_resched();
+		if (fatal_signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
 	}
 
 	*biop = bio;
-	return 0;
+	return ret;
 }
 
 /*
@@ -273,6 +289,7 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
 	struct bio *bio = *biop;
 	int bi_size = 0;
 	unsigned int sz;
+	int ret = 0;
 
 	if (!q)
 		return -ENXIO;
@@ -296,10 +313,14 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
 				break;
 		}
 		cond_resched();
+		if (fatal_signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
 	}
 
 	*biop = bio;
-	return 0;
+	return ret;
 }
 
 /**

> 
>> +	bio_put(bio);
>> +	return ret ? ret : -EINTR;
>> +}
>> +
>>  struct bio *blk_next_bio(struct bio *bio, unsigned int nr_pages, gfp_t gfp)
>>  {
>>  	struct bio *new = bio_alloc(gfp, nr_pages);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-12 14:47         ` Tetsuo Handa
@ 2019-11-13  1:54           ` Damien Le Moal
  2019-11-13  6:55             ` Ming Lei
  2019-11-15 10:05             ` Tetsuo Handa
  0 siblings, 2 replies; 14+ messages in thread
From: Damien Le Moal @ 2019-11-13  1:54 UTC (permalink / raw)
  To: Tetsuo Handa, Bob Liu, axboe; +Cc: linux-block

On 2019/11/12 23:48, Tetsuo Handa wrote:
[...]
>>> +static int blk_should_abort(struct bio *bio)
>>> +{
>>> +	int ret;
>>> +
>>> +	cond_resched();
>>> +	if (!fatal_signal_pending(current))
>>> +		return 0;
>>> +	ret = submit_bio_wait(bio);
>>
>> This will change the behavior of __blkdev_issue_discard() to a sync IO
>> execution instead of the current async execution since submit_bio_wait()
>> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
>> Have you checked if users of __blkdev_issue_discard() are OK with that ?
>> f2fs, ext4, xfs, dm and nvme use this function.
> 
> I'm not sure...
> 
>>
>> Looking at f2fs, this does not look like it is going to work as expected
>> since the bio setup, including end_io callback, is done after this
>> function is called and a regular submit_bio() execution is being used.
> 
> Then, just breaking the iteration like below?
> nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?
> 
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -7,6 +7,7 @@
>  #include <linux/bio.h>
>  #include <linux/blkdev.h>
>  #include <linux/scatterlist.h>
> +#include <linux/sched/signal.h>
>  
>  #include "blk.h"
>  
> @@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  	struct bio *bio = *biop;
>  	unsigned int op;
>  	sector_t bs_mask;
> +	int ret = 0;
>  
>  	if (!q)
>  		return -ENXIO;
> @@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  		 * is disabled.
>  		 */
>  		cond_resched();
> +		if (fatal_signal_pending(current)) {
> +			ret = -EINTR;
> +			break;
> +		}
>  	}
>  
>  	*biop = bio;
> -	return 0;
> +	return ret;

This will leak a bio as blkdev_issue_discard() executes the bio only in
the case "if (!ret && bio)". So that does not work as is, unless all
callers of __blkdev_issue_discard() are also changed. Same problem for
the other __blkdev_issue_xxx() functions.

Looking more into this, if an error is returned here, no bio should be
returned and we need to make sure that all started bios are also
completed. So your helper blk_should_abort() did the right thing calling
submit_bio_wait(). However, I Think it would be better to fail
immediately the current loop bio instead of executing it and then
reporting the -EINTR error, unconditionally, regardless of what the
started bios completion status is.

This could be done with the help of a function like this, very similar
to submit_bio_wait().

void bio_chain_end_wait(struct bio *bio)
{
	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);

	bio->bi_private = &done;
	bio->bi_end_io = submit_bio_wait_endio;
	bio->bi_opf |= REQ_SYNC;
	bio_endio(bio);
	wait_for_completion_io(&done);
}

And then your helper function becomes something like this:

static int blk_should_abort(struct bio *bio)
{
	int ret;

	cond_resched();
	if (!fatal_signal_pending(current))
		return 0;

	if (bio_flagged(bio, BIO_CHAIN))
		bio_chain_end_wait(bio);
	bio_put(bio);

	return -EINTR;
}

Thoughts ?


>  }
>  EXPORT_SYMBOL(__blkdev_issue_discard);
>  
> @@ -136,6 +142,7 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
>  	unsigned int max_write_same_sectors;
>  	struct bio *bio = *biop;
>  	sector_t bs_mask;
> +	int ret = 0;
>  
>  	if (!q)
>  		return -ENXIO;
> @@ -172,10 +179,14 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
>  			nr_sects = 0;
>  		}
>  		cond_resched();
> +		if (fatal_signal_pending(current)) {
> +			ret = -EINTR;
> +			break;
> +		}
>  	}
>  
>  	*biop = bio;
> -	return 0;
> +	return ret;
>  }
>  
>  /**
> @@ -216,6 +227,7 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
>  	struct bio *bio = *biop;
>  	unsigned int max_write_zeroes_sectors;
>  	struct request_queue *q = bdev_get_queue(bdev);
> +	int ret = 0;
>  
>  	if (!q)
>  		return -ENXIO;
> @@ -246,10 +258,14 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
>  			nr_sects = 0;
>  		}
>  		cond_resched();
> +		if (fatal_signal_pending(current)) {
> +			ret = -EINTR;
> +			break;
> +		}
>  	}
>  
>  	*biop = bio;
> -	return 0;
> +	return ret;
>  }
>  
>  /*
> @@ -273,6 +289,7 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
>  	struct bio *bio = *biop;
>  	int bi_size = 0;
>  	unsigned int sz;
> +	int ret = 0;
>  
>  	if (!q)
>  		return -ENXIO;
> @@ -296,10 +313,14 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
>  				break;
>  		}
>  		cond_resched();
> +		if (fatal_signal_pending(current)) {
> +			ret = -EINTR;
> +			break;
> +		}
>  	}
>  
>  	*biop = bio;
> -	return 0;
> +	return ret;
>  }
>  
>  /**
> 
>>
>>> +	bio_put(bio);
>>> +	return ret ? ret : -EINTR;
>>> +}
>>> +
>>>  struct bio *blk_next_bio(struct bio *bio, unsigned int nr_pages, gfp_t gfp)
>>>  {
>>>  	struct bio *new = bio_alloc(gfp, nr_pages);
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-13  1:54           ` Damien Le Moal
@ 2019-11-13  6:55             ` Ming Lei
  2019-11-13  7:11               ` Damien Le Moal
  2019-11-15 10:05             ` Tetsuo Handa
  1 sibling, 1 reply; 14+ messages in thread
From: Ming Lei @ 2019-11-13  6:55 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Tetsuo Handa, Bob Liu, axboe, linux-block

On Wed, Nov 13, 2019 at 01:54:14AM +0000, Damien Le Moal wrote:
> On 2019/11/12 23:48, Tetsuo Handa wrote:
> [...]
> >>> +static int blk_should_abort(struct bio *bio)
> >>> +{
> >>> +	int ret;
> >>> +
> >>> +	cond_resched();
> >>> +	if (!fatal_signal_pending(current))
> >>> +		return 0;
> >>> +	ret = submit_bio_wait(bio);
> >>
> >> This will change the behavior of __blkdev_issue_discard() to a sync IO
> >> execution instead of the current async execution since submit_bio_wait()
> >> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
> >> Have you checked if users of __blkdev_issue_discard() are OK with that ?
> >> f2fs, ext4, xfs, dm and nvme use this function.
> > 
> > I'm not sure...
> > 
> >>
> >> Looking at f2fs, this does not look like it is going to work as expected
> >> since the bio setup, including end_io callback, is done after this
> >> function is called and a regular submit_bio() execution is being used.
> > 
> > Then, just breaking the iteration like below?
> > nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?
> > 
> > --- a/block/blk-lib.c
> > +++ b/block/blk-lib.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/bio.h>
> >  #include <linux/blkdev.h>
> >  #include <linux/scatterlist.h>
> > +#include <linux/sched/signal.h>
> >  
> >  #include "blk.h"
> >  
> > @@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >  	struct bio *bio = *biop;
> >  	unsigned int op;
> >  	sector_t bs_mask;
> > +	int ret = 0;
> >  
> >  	if (!q)
> >  		return -ENXIO;
> > @@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >  		 * is disabled.
> >  		 */
> >  		cond_resched();
> > +		if (fatal_signal_pending(current)) {
> > +			ret = -EINTR;
> > +			break;
> > +		}
> >  	}
> >  
> >  	*biop = bio;
> > -	return 0;
> > +	return ret;
> 
> This will leak a bio as blkdev_issue_discard() executes the bio only in
> the case "if (!ret && bio)". So that does not work as is, unless all
> callers of __blkdev_issue_discard() are also changed. Same problem for
> the other __blkdev_issue_xxx() functions.
> 
> Looking more into this, if an error is returned here, no bio should be
> returned and we need to make sure that all started bios are also
> completed. So your helper blk_should_abort() did the right thing calling
> submit_bio_wait(). However, I Think it would be better to fail
> immediately the current loop bio instead of executing it and then
> reporting the -EINTR error, unconditionally, regardless of what the
> started bios completion status is.
> 
> This could be done with the help of a function like this, very similar
> to submit_bio_wait().
> 
> void bio_chain_end_wait(struct bio *bio)
> {
> 	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
> 
> 	bio->bi_private = &done;
> 	bio->bi_end_io = submit_bio_wait_endio;
> 	bio->bi_opf |= REQ_SYNC;
> 	bio_endio(bio);
> 	wait_for_completion_io(&done);
> }
> 
> And then your helper function becomes something like this:
> 
> static int blk_should_abort(struct bio *bio)
> {
> 	int ret;
> 
> 	cond_resched();
> 	if (!fatal_signal_pending(current))
> 		return 0;
> 
> 	if (bio_flagged(bio, BIO_CHAIN))
> 		bio_chain_end_wait(bio);
> 	bio_put(bio);
> 
> 	return -EINTR;
> }
> 
> Thoughts ?

DISCARD request can be quite big, and any sync bio submission may cause
serious performance regression.

Not mention blkdev_issue_discard() may be called in non-block context.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-13  6:55             ` Ming Lei
@ 2019-11-13  7:11               ` Damien Le Moal
  2019-11-13  7:49                 ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Damien Le Moal @ 2019-11-13  7:11 UTC (permalink / raw)
  To: Ming Lei; +Cc: Tetsuo Handa, Bob Liu, axboe, linux-block

On 2019/11/13 15:55, Ming Lei wrote:
> On Wed, Nov 13, 2019 at 01:54:14AM +0000, Damien Le Moal wrote:
>> On 2019/11/12 23:48, Tetsuo Handa wrote:
>> [...]
>>>>> +static int blk_should_abort(struct bio *bio)
>>>>> +{
>>>>> +	int ret;
>>>>> +
>>>>> +	cond_resched();
>>>>> +	if (!fatal_signal_pending(current))
>>>>> +		return 0;
>>>>> +	ret = submit_bio_wait(bio);
>>>>
>>>> This will change the behavior of __blkdev_issue_discard() to a sync IO
>>>> execution instead of the current async execution since submit_bio_wait()
>>>> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
>>>> Have you checked if users of __blkdev_issue_discard() are OK with that ?
>>>> f2fs, ext4, xfs, dm and nvme use this function.
>>>
>>> I'm not sure...
>>>
>>>>
>>>> Looking at f2fs, this does not look like it is going to work as expected
>>>> since the bio setup, including end_io callback, is done after this
>>>> function is called and a regular submit_bio() execution is being used.
>>>
>>> Then, just breaking the iteration like below?
>>> nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?
>>>
>>> --- a/block/blk-lib.c
>>> +++ b/block/blk-lib.c
>>> @@ -7,6 +7,7 @@
>>>  #include <linux/bio.h>
>>>  #include <linux/blkdev.h>
>>>  #include <linux/scatterlist.h>
>>> +#include <linux/sched/signal.h>
>>>  
>>>  #include "blk.h"
>>>  
>>> @@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>>  	struct bio *bio = *biop;
>>>  	unsigned int op;
>>>  	sector_t bs_mask;
>>> +	int ret = 0;
>>>  
>>>  	if (!q)
>>>  		return -ENXIO;
>>> @@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>>  		 * is disabled.
>>>  		 */
>>>  		cond_resched();
>>> +		if (fatal_signal_pending(current)) {
>>> +			ret = -EINTR;
>>> +			break;
>>> +		}
>>>  	}
>>>  
>>>  	*biop = bio;
>>> -	return 0;
>>> +	return ret;
>>
>> This will leak a bio as blkdev_issue_discard() executes the bio only in
>> the case "if (!ret && bio)". So that does not work as is, unless all
>> callers of __blkdev_issue_discard() are also changed. Same problem for
>> the other __blkdev_issue_xxx() functions.
>>
>> Looking more into this, if an error is returned here, no bio should be
>> returned and we need to make sure that all started bios are also
>> completed. So your helper blk_should_abort() did the right thing calling
>> submit_bio_wait(). However, I Think it would be better to fail
>> immediately the current loop bio instead of executing it and then
>> reporting the -EINTR error, unconditionally, regardless of what the
>> started bios completion status is.
>>
>> This could be done with the help of a function like this, very similar
>> to submit_bio_wait().
>>
>> void bio_chain_end_wait(struct bio *bio)
>> {
>> 	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
>>
>> 	bio->bi_private = &done;
>> 	bio->bi_end_io = submit_bio_wait_endio;
>> 	bio->bi_opf |= REQ_SYNC;
>> 	bio_endio(bio);
>> 	wait_for_completion_io(&done);
>> }
>>
>> And then your helper function becomes something like this:
>>
>> static int blk_should_abort(struct bio *bio)
>> {
>> 	int ret;
>>
>> 	cond_resched();
>> 	if (!fatal_signal_pending(current))
>> 		return 0;
>>
>> 	if (bio_flagged(bio, BIO_CHAIN))
>> 		bio_chain_end_wait(bio);
>> 	bio_put(bio);
>>
>> 	return -EINTR;
>> }
>>
>> Thoughts ?
> 
> DISCARD request can be quite big, and any sync bio submission may cause
> serious performance regression.

Yes indeed. But if the bio issuing loop is interrupted with discard BIOs
already issued, I do not think there is any other choice but to wait for
their completion before returning.

> Not mention blkdev_issue_discard() may be called in non-block context.

This loop is calling cond_resched(), which checks might_sleep(). So
certainly this function can block, no ?


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-13  7:11               ` Damien Le Moal
@ 2019-11-13  7:49                 ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2019-11-13  7:49 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Tetsuo Handa, Bob Liu, axboe, linux-block

On Wed, Nov 13, 2019 at 07:11:36AM +0000, Damien Le Moal wrote:
> On 2019/11/13 15:55, Ming Lei wrote:
> > On Wed, Nov 13, 2019 at 01:54:14AM +0000, Damien Le Moal wrote:
> >> On 2019/11/12 23:48, Tetsuo Handa wrote:
> >> [...]
> >>>>> +static int blk_should_abort(struct bio *bio)
> >>>>> +{
> >>>>> +	int ret;
> >>>>> +
> >>>>> +	cond_resched();
> >>>>> +	if (!fatal_signal_pending(current))
> >>>>> +		return 0;
> >>>>> +	ret = submit_bio_wait(bio);
> >>>>
> >>>> This will change the behavior of __blkdev_issue_discard() to a sync IO
> >>>> execution instead of the current async execution since submit_bio_wait()
> >>>> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
> >>>> Have you checked if users of __blkdev_issue_discard() are OK with that ?
> >>>> f2fs, ext4, xfs, dm and nvme use this function.
> >>>
> >>> I'm not sure...
> >>>
> >>>>
> >>>> Looking at f2fs, this does not look like it is going to work as expected
> >>>> since the bio setup, including end_io callback, is done after this
> >>>> function is called and a regular submit_bio() execution is being used.
> >>>
> >>> Then, just breaking the iteration like below?
> >>> nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?
> >>>
> >>> --- a/block/blk-lib.c
> >>> +++ b/block/blk-lib.c
> >>> @@ -7,6 +7,7 @@
> >>>  #include <linux/bio.h>
> >>>  #include <linux/blkdev.h>
> >>>  #include <linux/scatterlist.h>
> >>> +#include <linux/sched/signal.h>
> >>>  
> >>>  #include "blk.h"
> >>>  
> >>> @@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >>>  	struct bio *bio = *biop;
> >>>  	unsigned int op;
> >>>  	sector_t bs_mask;
> >>> +	int ret = 0;
> >>>  
> >>>  	if (!q)
> >>>  		return -ENXIO;
> >>> @@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >>>  		 * is disabled.
> >>>  		 */
> >>>  		cond_resched();
> >>> +		if (fatal_signal_pending(current)) {
> >>> +			ret = -EINTR;
> >>> +			break;
> >>> +		}
> >>>  	}
> >>>  
> >>>  	*biop = bio;
> >>> -	return 0;
> >>> +	return ret;
> >>
> >> This will leak a bio as blkdev_issue_discard() executes the bio only in
> >> the case "if (!ret && bio)". So that does not work as is, unless all
> >> callers of __blkdev_issue_discard() are also changed. Same problem for
> >> the other __blkdev_issue_xxx() functions.
> >>
> >> Looking more into this, if an error is returned here, no bio should be
> >> returned and we need to make sure that all started bios are also
> >> completed. So your helper blk_should_abort() did the right thing calling
> >> submit_bio_wait(). However, I Think it would be better to fail
> >> immediately the current loop bio instead of executing it and then
> >> reporting the -EINTR error, unconditionally, regardless of what the
> >> started bios completion status is.
> >>
> >> This could be done with the help of a function like this, very similar
> >> to submit_bio_wait().
> >>
> >> void bio_chain_end_wait(struct bio *bio)
> >> {
> >> 	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
> >>
> >> 	bio->bi_private = &done;
> >> 	bio->bi_end_io = submit_bio_wait_endio;
> >> 	bio->bi_opf |= REQ_SYNC;
> >> 	bio_endio(bio);
> >> 	wait_for_completion_io(&done);
> >> }
> >>
> >> And then your helper function becomes something like this:
> >>
> >> static int blk_should_abort(struct bio *bio)
> >> {
> >> 	int ret;
> >>
> >> 	cond_resched();
> >> 	if (!fatal_signal_pending(current))
> >> 		return 0;
> >>
> >> 	if (bio_flagged(bio, BIO_CHAIN))
> >> 		bio_chain_end_wait(bio);
> >> 	bio_put(bio);
> >>
> >> 	return -EINTR;
> >> }
> >>
> >> Thoughts ?
> > 
> > DISCARD request can be quite big, and any sync bio submission may cause
> > serious performance regression.
> 
> Yes indeed. But if the bio issuing loop is interrupted with discard BIOs
> already issued, I do not think there is any other choice but to wait for
> their completion before returning.

Looks I miss the check on fatal_signal_pending(), then this approach
seems fine.

> 
> > Not mention blkdev_issue_discard() may be called in non-block context.
> 
> This loop is calling cond_resched(), which checks might_sleep(). So
> certainly this function can block, no ?

Indeed, looks I misunderstood it.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-13  1:54           ` Damien Le Moal
  2019-11-13  6:55             ` Ming Lei
@ 2019-11-15 10:05             ` Tetsuo Handa
  2019-11-18  0:02               ` Damien Le Moal
  1 sibling, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2019-11-15 10:05 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Bob Liu, axboe, linux-block

On 2019/11/13 10:54, Damien Le Moal wrote:
> On 2019/11/12 23:48, Tetsuo Handa wrote:
> [...]
>>>> +static int blk_should_abort(struct bio *bio)
>>>> +{
>>>> +	int ret;
>>>> +
>>>> +	cond_resched();
>>>> +	if (!fatal_signal_pending(current))
>>>> +		return 0;
>>>> +	ret = submit_bio_wait(bio);
>>>
>>> This will change the behavior of __blkdev_issue_discard() to a sync IO
>>> execution instead of the current async execution since submit_bio_wait()
>>> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
>>> Have you checked if users of __blkdev_issue_discard() are OK with that ?
>>> f2fs, ext4, xfs, dm and nvme use this function.
>>
>> I'm not sure...
>>
>>>
>>> Looking at f2fs, this does not look like it is going to work as expected
>>> since the bio setup, including end_io callback, is done after this
>>> function is called and a regular submit_bio() execution is being used.
>>
>> Then, just breaking the iteration like below?
>> nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?
>>
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -7,6 +7,7 @@
>>  #include <linux/bio.h>
>>  #include <linux/blkdev.h>
>>  #include <linux/scatterlist.h>
>> +#include <linux/sched/signal.h>
>>  
>>  #include "blk.h"
>>  
>> @@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>  	struct bio *bio = *biop;
>>  	unsigned int op;
>>  	sector_t bs_mask;
>> +	int ret = 0;
>>  
>>  	if (!q)
>>  		return -ENXIO;
>> @@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>  		 * is disabled.
>>  		 */
>>  		cond_resched();
>> +		if (fatal_signal_pending(current)) {
>> +			ret = -EINTR;
>> +			break;
>> +		}
>>  	}
>>  
>>  	*biop = bio;
>> -	return 0;
>> +	return ret;
> 
> This will leak a bio as blkdev_issue_discard() executes the bio only in
> the case "if (!ret && bio)". So that does not work as is, unless all
> callers of __blkdev_issue_discard() are also changed. Same problem for
> the other __blkdev_issue_xxx() functions.
> 
> Looking more into this, if an error is returned here, no bio should be
> returned and we need to make sure that all started bios are also
> completed. So your helper blk_should_abort() did the right thing calling
> submit_bio_wait(). However, I Think it would be better to fail
> immediately the current loop bio instead of executing it and then
> reporting the -EINTR error, unconditionally, regardless of what the
> started bios completion status is.
> 
> This could be done with the help of a function like this, very similar
> to submit_bio_wait().
> 
> void bio_chain_end_wait(struct bio *bio)
> {
> 	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
> 
> 	bio->bi_private = &done;
> 	bio->bi_end_io = submit_bio_wait_endio;
> 	bio->bi_opf |= REQ_SYNC;
> 	bio_endio(bio);
> 	wait_for_completion_io(&done);
> }
> 
> And then your helper function becomes something like this:
> 
> static int blk_should_abort(struct bio *bio)
> {
> 	int ret;
> 
> 	cond_resched();
> 	if (!fatal_signal_pending(current))
> 		return 0;
> 
> 	if (bio_flagged(bio, BIO_CHAIN))
> 		bio_chain_end_wait(bio);

I don't know about block layer, but I feel this is bad because bio_put()
will be called without submit_bio_wait() when bio_flagged() == false.
Who calls submit_bio_wait() if bio_flagged() == false ?

> 	bio_put(bio);
> 
> 	return -EINTR;
> }
> 
> Thoughts ?
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
  2019-11-15 10:05             ` Tetsuo Handa
@ 2019-11-18  0:02               ` Damien Le Moal
  0 siblings, 0 replies; 14+ messages in thread
From: Damien Le Moal @ 2019-11-18  0:02 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Bob Liu, axboe, linux-block

On 2019/11/15 19:05, Tetsuo Handa wrote:
> On 2019/11/13 10:54, Damien Le Moal wrote:
>> On 2019/11/12 23:48, Tetsuo Handa wrote:
>> [...]
>>>>> +static int blk_should_abort(struct bio *bio)
>>>>> +{
>>>>> +	int ret;
>>>>> +
>>>>> +	cond_resched();
>>>>> +	if (!fatal_signal_pending(current))
>>>>> +		return 0;
>>>>> +	ret = submit_bio_wait(bio);
>>>>
>>>> This will change the behavior of __blkdev_issue_discard() to a sync IO
>>>> execution instead of the current async execution since submit_bio_wait()
>>>> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
>>>> Have you checked if users of __blkdev_issue_discard() are OK with that ?
>>>> f2fs, ext4, xfs, dm and nvme use this function.
>>>
>>> I'm not sure...
>>>
>>>>
>>>> Looking at f2fs, this does not look like it is going to work as expected
>>>> since the bio setup, including end_io callback, is done after this
>>>> function is called and a regular submit_bio() execution is being used.
>>>
>>> Then, just breaking the iteration like below?
>>> nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?
>>>
>>> --- a/block/blk-lib.c
>>> +++ b/block/blk-lib.c
>>> @@ -7,6 +7,7 @@
>>>  #include <linux/bio.h>
>>>  #include <linux/blkdev.h>
>>>  #include <linux/scatterlist.h>
>>> +#include <linux/sched/signal.h>
>>>  
>>>  #include "blk.h"
>>>  
>>> @@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>>  	struct bio *bio = *biop;
>>>  	unsigned int op;
>>>  	sector_t bs_mask;
>>> +	int ret = 0;
>>>  
>>>  	if (!q)
>>>  		return -ENXIO;
>>> @@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>>  		 * is disabled.
>>>  		 */
>>>  		cond_resched();
>>> +		if (fatal_signal_pending(current)) {
>>> +			ret = -EINTR;
>>> +			break;
>>> +		}
>>>  	}
>>>  
>>>  	*biop = bio;
>>> -	return 0;
>>> +	return ret;
>>
>> This will leak a bio as blkdev_issue_discard() executes the bio only in
>> the case "if (!ret && bio)". So that does not work as is, unless all
>> callers of __blkdev_issue_discard() are also changed. Same problem for
>> the other __blkdev_issue_xxx() functions.
>>
>> Looking more into this, if an error is returned here, no bio should be
>> returned and we need to make sure that all started bios are also
>> completed. So your helper blk_should_abort() did the right thing calling
>> submit_bio_wait(). However, I Think it would be better to fail
>> immediately the current loop bio instead of executing it and then
>> reporting the -EINTR error, unconditionally, regardless of what the
>> started bios completion status is.
>>
>> This could be done with the help of a function like this, very similar
>> to submit_bio_wait().
>>
>> void bio_chain_end_wait(struct bio *bio)
>> {
>> 	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
>>
>> 	bio->bi_private = &done;
>> 	bio->bi_end_io = submit_bio_wait_endio;
>> 	bio->bi_opf |= REQ_SYNC;
>> 	bio_endio(bio);
>> 	wait_for_completion_io(&done);
>> }
>>
>> And then your helper function becomes something like this:
>>
>> static int blk_should_abort(struct bio *bio)
>> {
>> 	int ret;
>>
>> 	cond_resched();
>> 	if (!fatal_signal_pending(current))
>> 		return 0;
>>
>> 	if (bio_flagged(bio, BIO_CHAIN))
>> 		bio_chain_end_wait(bio);
> 
> I don't know about block layer, but I feel this is bad because bio_put()
> will be called without submit_bio_wait() when bio_flagged() == false.
> Who calls submit_bio_wait() if bio_flagged() == false ?

If the BIO is not flagged, then it is not chained and so does not need
to be executed at all and can be dropped (freed) right away with
bio_put(). No need (and in fact we do not want) to execute it at all.

For other cases where bio is flagged, it means that it is chained and so
that previous BIOs where already started by the submit_bio() call in
bio_next(). In this case, the current BIO is still *not* executed at all
and bio_endio() is called for it instead of submit_bio_wait(). But since
bio_endio() is called after setting:

bio->bi_end_io = submit_bio_wait_endio;

the bio_endio() call has the same effect as the completion of the bio if
it were executed: the previous chained BIO completion is waited for.

> 
>> 	bio_put(bio);
>>
>> 	return -EINTR;
>> }
>>
>> Thoughts ?
>>
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, back to index

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-23  7:56 INFO: task syz-executor can't die for more than 143 seconds. (2) syzbot
2019-10-24 10:08 ` Tetsuo Handa
2019-10-28  8:51   ` Bob Liu
2019-11-08 11:41     ` [PATCH] block: Bail out iteration functions upon SIGKILL Tetsuo Handa
2019-11-08 18:13       ` Chaitanya Kulkarni
2019-11-08 22:18         ` Chaitanya Kulkarni
2019-11-12  4:05       ` Damien Le Moal
2019-11-12 14:47         ` Tetsuo Handa
2019-11-13  1:54           ` Damien Le Moal
2019-11-13  6:55             ` Ming Lei
2019-11-13  7:11               ` Damien Le Moal
2019-11-13  7:49                 ` Ming Lei
2019-11-15 10:05             ` Tetsuo Handa
2019-11-18  0:02               ` Damien Le Moal

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org
	public-inbox-index linux-block

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git