Greetings, 0day kernel testing robot got the below dmesg and the first bad commit is https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-4.12/block commit 142bbdfccc8b3e9f7342f2ce8422e76a3b45beae Author: Jan Kara AuthorDate: Tue Apr 4 14:31:30 2017 +0200 Commit: Jens Axboe CommitDate: Wed Apr 5 08:15:08 2017 -0600 cfq: Disable writeback throttling by default Writeback throttling does not play well with CFQ since that also tries to throttle async writes. As a result async writeback can get starved in presence of readers. As an example take a benchmark simulating postgreSQL database running over a standard rotating SATA drive. There are 16 processes doing random reads from a huge file (2*machine memory), 1 process doing random writes to the huge file and calling fsync once per 50000 writes and 1 process doing sequential 8k writes to a relatively small file wrapping around at the end of the file and calling fsync every 5 writes. Under this load read latency easily exceeds the target latency of 75 ms (just because there are so many reads happening against a relatively slow disk) and thus writeback is throttled to a point where only 1 write request is allowed at a time. Blktrace data then looks like: 8,0 1 0 8.347751764 0 m N cfq workload slice:40000000 8,0 1 0 8.347755256 0 m N cfq293A / set_active wl_class: 0 wl_type:0 8,0 1 0 8.347784100 0 m N cfq293A / Not idling. st->count:1 8,0 1 3814 8.347763916 5839 UT N [kworker/u9:2] 1 8,0 0 0 8.347777605 0 m N cfq293A / Not idling. st->count:1 8,0 1 0 8.347784100 0 m N cfq293A / Not idling. st->count:1 8,0 3 1596 8.354364057 0 C R 156109528 + 8 (6906954) [0] 8,0 3 0 8.354383193 0 m N cfq6196SN / complete rqnoidle 0 8,0 3 0 8.354386476 0 m N cfq schedule dispatch 8,0 3 0 8.354399397 0 m N cfq293A / Not idling. st->count:1 8,0 3 0 8.354404705 0 m N cfq293A / dispatch_insert 8,0 3 0 8.354409454 0 m N cfq293A / dispatched a request 8,0 3 0 8.354412527 0 m N cfq293A / activate rq, drv=1 8,0 3 1597 8.354414692 0 D W 145961400 + 24 (6718452) [swapper/0] 8,0 3 0 8.354484184 0 m N cfq293A / Not idling. st->count:1 8,0 3 0 8.354487536 0 m N cfq293A / slice expired t=0 8,0 3 0 8.354498013 0 m N / served: vt=5888102466265088 min_vt=5888074869387264 8,0 3 0 8.354502692 0 m N cfq293A / sl_used=6737519 disp=1 charge=6737519 iops=0 sect=24 8,0 3 0 8.354505695 0 m N cfq293A / del_from_rr ... 8,0 0 1810 8.354728768 0 C W 145961400 + 24 (314076) [0] 8,0 0 0 8.354746927 0 m N cfq293A / complete rqnoidle 0 ... 8,0 1 3829 8.389886102 5839 G W 145962968 + 24 [kworker/u9:2] 8,0 1 3830 8.389888127 5839 P N [kworker/u9:2] 8,0 1 3831 8.389908102 5839 A W 145978336 + 24 <- (8,4) 44000 8,0 1 3832 8.389910477 5839 Q W 145978336 + 24 [kworker/u9:2] 8,0 1 3833 8.389914248 5839 I W 145962968 + 24 (28146) [kworker/u9:2] 8,0 1 0 8.389919137 0 m N cfq293A / insert_request 8,0 1 0 8.389924305 0 m N cfq293A / add_to_rr 8,0 1 3834 8.389933175 5839 UT N [kworker/u9:2] 1 ... 8,0 0 0 9.455290997 0 m N cfq workload slice:40000000 8,0 0 0 9.455294769 0 m N cfq293A / set_active wl_class:0 wl_type:0 8,0 0 0 9.455303499 0 m N cfq293A / fifo=ffff880003166090 8,0 0 0 9.455306851 0 m N cfq293A / dispatch_insert 8,0 0 0 9.455311251 0 m N cfq293A / dispatched a request 8,0 0 0 9.455314324 0 m N cfq293A / activate rq, drv=1 8,0 0 2043 9.455316210 6204 D W 145962968 + 24 (1065401962) [pgioperf] 8,0 0 0 9.455392407 0 m N cfq293A / Not idling. st->count:1 8,0 0 0 9.455395969 0 m N cfq293A / slice expired t=0 8,0 0 0 9.455404210 0 m N / served: vt=5888958194597888 min_vt=5888941810597888 8,0 0 0 9.455410077 0 m N cfq293A / sl_used=4000000 disp=1 charge=4000000 iops=0 sect=24 8,0 0 0 9.455416851 0 m N cfq293A / del_from_rr ... 8,0 0 2045 9.455648515 0 C W 145962968 + 24 (332305) [0] 8,0 0 0 9.455668350 0 m N cfq293A / complete rqnoidle 0 ... 8,0 1 4371 9.455710115 5839 G W 145978336 + 24 [kworker/u9:2] 8,0 1 4372 9.455712350 5839 P N [kworker/u9:2] 8,0 1 4373 9.455730159 5839 A W 145986616 + 24 <- (8,4) 52280 8,0 1 4374 9.455732674 5839 Q W 145986616 + 24 [kworker/u9:2] 8,0 1 4375 9.455737563 5839 I W 145978336 + 24 (27448) [kworker/u9:2] 8,0 1 0 9.455742871 0 m N cfq293A / insert_request 8,0 1 0 9.455747550 0 m N cfq293A / add_to_rr 8,0 1 4376 9.455756629 5839 UT N [kworker/u9:2] 1 So we can see a Q event for a write request, then IO is blocked by writeback throttling and G and I events for the request happen only once other writeback IO is completed. Thus CFQ always sees only one write request. When it sees it, it queues the async queue behind all the read queues and the async queue gets scheduled after about one second. When it is scheduled, that one request gets dispatched and async queue is expired as it has no more requests to submit. Overall we submit about one write request per second. Although this scheduling is beneficial for read latency, writes are heavily starved and this causes large delays all over the system (due to processes blocking on page lock, transaction starts, etc.). When writeback throttling is disabled, write throughput is about one fifth of a read throughput which roughly matches readers/writers ratio and overall the system stalls are much shorter. Mixing writeback throttling logic with CFQ throttling logic is always a recipe for surprises as CFQ assumes it sees the big part of the picture which is not necessarily true when writeback throttling is blocking requests. So disable writeback throttling logic by default when CFQ is used as an IO scheduler. Signed-off-by: Jan Kara Signed-off-by: Jens Axboe 85003a446e block: fix inheriting request priority from bio 142bbdfccc cfq: Disable writeback throttling by default 591c59d18f block: sed-opal: Tone down all the pr_* to debugs 5efce14289 Add linux-next specific files for 20170407 +-----------------------------------------------------------------------+------------+------------+------------+---------------+ | | 85003a446e | 142bbdfccc | 591c59d18f | next-20170407 | +-----------------------------------------------------------------------+------------+------------+------------+---------------+ | boot_successes | 0 | 0 | 0 | 0 | | boot_failures | 72 | 13 | 16 | 17 | | WARNING:at_drivers/gpu/drm/drm_mode_config.c:#drm_mode_config_cleanup | 72 | 13 | 16 | | | BUG:Bad_page_state_in_process | 72 | 13 | 16 | | | kernel_BUG_at_lib/list_debug.c | 0 | 7 | 12 | 17 | | invalid_opcode:#[##] | 0 | 7 | 12 | 17 | | Kernel_panic-not_syncing:Fatal_exception | 0 | 7 | 12 | 17 | +-----------------------------------------------------------------------+------------+------------+------------+---------------+ [ 53.879031] cdrom: Uniform CD-ROM driver Revision: 3.20 [ 53.882634] list_del corruption, ffff88001766c890->prev is LIST_POISON2 (dead000000000200) [ 53.885158] ------------[ cut here ]------------ [ 53.886348] kernel BUG at lib/list_debug.c:50! [ 53.887820] invalid opcode: 0000 [#1] [ 53.888750] CPU: 0 PID: 1 Comm: swapper Tainted: G B W 4.11.0-rc3-00108-g142bbdf #1 [ 53.893266] task: ffff88001e42e000 task.stack: ffff88001e430000 [ 53.897615] RIP: 0010:__list_del_entry_valid+0x49/0x7c [ 53.899003] RSP: 0000:ffff88001e433ca8 EFLAGS: 00010296 [ 53.901688] RAX: 000000000000004e RBX: ffff88001766c890 RCX: ffff88001e42e608 [ 53.905706] RDX: 0000000000000086 RSI: 0000000000000001 RDI: 0000000000000286 [ 53.907504] RBP: ffff88001e433ca8 R08: 0000000000000001 R09: 0000000000000000 [ 53.909377] R10: ffff88001e433c68 R11: ffffffff83b9f5c7 R12: ffff880017646000 [ 53.911264] R13: ffffffff82edc470 R14: 0000000000000000 R15: 00000000ecd70201 [ 53.913178] FS: 0000000000000000(0000) GS:ffffffff82c41000(0000) knlGS:0000000000000000 [ 53.915286] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 53.916831] CR2: 0000000000000000 CR3: 0000000002c17000 CR4: 00000000000006f0 [ 53.918792] Call Trace: [ 53.920003] blk_stat_remove_callback+0x22/0x75 [ 53.921270] wbt_exit+0x1d/0x3c [ 53.922162] blk_unregister_queue+0x97/0xa8 [ 53.924686] del_gendisk+0x12f/0x1c8 [ 53.925653] ide_cd_remove+0x29/0x50 [ 53.926627] generic_ide_remove+0x20/0x27 [ 53.927602] driver_probe_device+0x178/0x322 [ 53.928660] ? driver_probe_device+0x322/0x322 [ 53.930030] __driver_attach+0x61/0x82 [ 53.931041] bus_for_each_dev+0x5a/0x82 [ 53.931905] driver_attach+0x19/0x1b [ 53.933323] bus_add_driver+0xea/0x1ce [ 53.934341] ? ide_gd_init+0x48/0x48 [ 53.935262] driver_register+0x83/0xba [ 53.936105] ide_cdrom_init+0x27/0x48 [ 53.937348] do_one_initcall+0xc3/0x22d [ 53.938388] ? parse_args+0x116/0x2a4 [ 53.940037] kernel_init_freeable+0x185/0x2ab [ 53.941207] ? rest_init+0xba/0xba [ 53.942183] kernel_init+0x9/0xe1 [ 53.943093] ret_from_fork+0x2c/0x40 # HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD git bisect start d203084e692ac2f656cf8c50449f8f90ca7875de a71c9a1c779f2499fb2afc0553e543f18aff6edf -- git bisect good 0b14e94a0003326bad39639ebafb4edd60003cc2 # 04:03 G 11 0 11 11 Merge remote-tracking branch 'ipsec-next/master' git bisect bad 27e723a4c39f9c94889452cb8911d66a06dfe69e # 05:01 B 0 7 17 0 Merge remote-tracking branch 'mailbox/mailbox-for-next' git bisect good 2cb3c9c8012d8e99d103f2fd62bd4c0a6c018785 # 18:11 G 11 0 0 2 Merge remote-tracking branch 'sound/for-next' git bisect bad 2dc410a12a7cc90d7cf712816474ee13199f305e # 18:37 B 0 1 11 0 Merge remote-tracking branch 'mfd/for-mfd-next' git bisect bad 612ae9f4920a9f4945629d12c213faa5548a9311 # 19:04 B 0 3 13 0 Merge remote-tracking branch 'lightnvm/for-next' git bisect good d59c05080e9d7e0b940ea02ac89ad30fee4dba25 # 19:45 G 10 0 0 0 Merge remote-tracking branch 'modules/modules-next' git bisect good 2f2761a9a104609a84d8306c3a5abec4fc6c3639 # 20:31 G 10 0 0 0 Merge remote-tracking branch 'input/next' git bisect bad 639a55aed5e49e2677e1d7c2cbf4890999543775 # 21:08 B 0 1 11 0 Merge remote-tracking branch 'block/for-next' git bisect good 508f4be842c912c8936ad88909f89d60b8997f8b # 21:52 G 10 0 10 10 Merge branch 'for-4.12/block' into for-next git bisect good 4083aa986f0447d562c494e75d525c1b1df71a2a # 22:15 G 10 0 10 10 nvmet_fc: Sync NVME LS reject reasons with spec git bisect bad 64c7f1d1572cacadfc0a4ca5a937aeffa486de58 # 22:59 B 0 1 11 0 block, scsi: move the retries field to struct scsi_request git bisect good 77f02a7acd7654cd5944f2120831d1eace94b343 # 11:10 G 11 0 11 15 nvme: factor request completion code into a common helper git bisect bad 8e14be53f470b47d044d9d552bdc3bb4c6723276 # 11:25 B 0 11 21 0 remove the obsolete hd driver git bisect bad 142bbdfccc8b3e9f7342f2ce8422e76a3b45beae # 11:38 B 5 1 5 5 cfq: Disable writeback throttling by default git bisect good 85003a446e9d058c9c3e6cfa5d991ace07330a68 # 12:08 G 17 0 17 17 block: fix inheriting request priority from bio # first bad commit: [142bbdfccc8b3e9f7342f2ce8422e76a3b45beae] cfq: Disable writeback throttling by default git bisect good 85003a446e9d058c9c3e6cfa5d991ace07330a68 # 00:09 G 49 0 49 70 block: fix inheriting request priority from bio # extra tests with CONFIG_DEBUG_INFO_REDUCED git bisect bad 142bbdfccc8b3e9f7342f2ce8422e76a3b45beae # 00:27 B 2 5 2 2 cfq: Disable writeback throttling by default # extra tests on HEAD of linux-next/master git bisect bad d203084e692ac2f656cf8c50449f8f90ca7875de # 00:27 B 0 17 30 0 Add linux-next specific files for 20170406 # extra tests on tree/branch block/for-4.12/block git bisect bad 591c59d18f72c98ad7a992809c3a9a1789687845 # 00:42 B 0 2 12 0 block: sed-opal: Tone down all the pr_* to debugs # extra tests with first bad commit reverted git bisect good 154fba34f5994831c761349e10feb59863a3b46f # 01:10 G 17 0 17 17 Revert "cfq: Disable writeback throttling by default" # extra tests on tree/branch linux-next/master git bisect bad 5efce1428912454a377d8336ea47c51700641b38 # 01:24 B 0 2 12 0 Add linux-next specific files for 20170407 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/lkp Intel Corporation