* [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
@ 2019-02-08 17:00 Sacha
  2019-02-08 17:13 ` [admin] " Samuel Thibault
  0 siblings, 1 reply; 16+ messages in thread
From: Sacha @ 2019-02-08 17:00 UTC (permalink / raw)
  To: xen-devel; +Cc: admin

Hi,

On Debian GNU/Linux 9.7 (stretch) amd64, we hit a bug with the latest Xen
hypervisor version:

    xen-hypervisor-4.8-amd64 4.8.5+shim4.10.2+xsa282

Rolling back to the previous package version corrected the problem:

    xen-hypervisor-4.8-amd64 4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10

On the domU, the symptom is a frozen file system, followed by a kernel panic.


------------------------------------------------------------------------

The logs on the dom0 are the following:

dionysos login: [22942.436116] device-mapper: uevent: version 1.0.3
[22942.436402] device-mapper: ioctl: 4.35.0-ioctl (2016-06-23)
initialised: dm-devel@redhat.com
[115516.380129] INFO: task jbd2/xvda6-8:289 blocked for more than 120
seconds.
[115516.380149] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115516.380160] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115516.380170] jbd2/xvda6-8 D 0 289 2 0x00000000
[115516.380183] ffff8800051e4000 0000000000000000 ffff8800d5bf6f00
ffff8800d17d3140
[115516.380201] ffff8800d6398ec0 ffffc9004099fb20 ffffffff8160e973
ffffc9004099fda0
[115516.380218] ffff880003642d00 000000004099fbe8 ffffffff81303fcf
ffff8800d17d3140
[115516.380235] Call Trace:
[115516.380250] [] ? __schedule+0x243/0x6f0
[115516.380260] [] ? blk_attempt_plug_merge+0xcf/0xe0
[115516.380269] [] ? schedule+0x32/0x80
[115516.380279] [] ? schedule_timeout+0x1df/0x380
[115516.380293] [] ? xen_clocksource_get_cycles+0x11/0x20
[115516.380302] [] ? bit_wait_timeout+0x90/0x90
[115516.380314] [] ? io_schedule_timeout+0xb4/0x130
[115516.380328] [] ? prepare_to_wait+0x57/0x80
[115516.380340] [] ? bit_wait_io+0x17/0x60
[115516.380355] [] ? __wait_on_bit+0x5e/0x90
[115516.380372] [] ? bit_wait_timeout+0x90/0x90
[115516.380389] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115516.380402] [] ? autoremove_wake_function+0x40/0x40
[115516.380424] [] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[115516.380441] [] ? __switch_to+0x2c9/0x730
[115516.380457] [] ? try_to_del_timer_sync+0x4d/0x80
[115516.380472] [] ? kjournald2+0xdd/0x280 [jbd2]
[115516.380493] [] ? wake_up_atomic_t+0x30/0x30
[115516.380505] [] ? commit_timeout+0x10/0x10 [jbd2]
[115516.380525] [] ? kthread+0xf2/0x110
[115516.380533] [] ? __switch_to+0x2c9/0x730
[115516.380549] [] ? kthread_park+0x60/0x60
[115516.380563] [] ? ret_from_fork+0x57/0x70
[115516.380581] INFO: task jbd2/xvda5-8:300 blocked for more than 120
seconds.
[115516.380588] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115516.380603] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115516.380623] jbd2/xvda5-8 D 0 300 2 0x00000000
[115516.380637] ffff8800051e4000 0000000000000000 ffff8800d5bf6f00
ffff8800d2820d80
[115516.380658] ffff8800d6398ec0 ffffc90040a1bb20 ffffffff8160e973
ffffc90040a1bda0
[115516.380680] ffff880062dd6600 0000000040a1bbe8 ffff88005ee997d8
ffff8800d2820d80
[115516.380711] Call Trace:
[115516.380717] [] ? __schedule+0x243/0x6f0
[115516.380729] [] ? schedule+0x32/0x80
[115516.380744] [] ? schedule_timeout+0x1df/0x380
[115516.380752] [] ? xen_clocksource_get_cycles+0x11/0x20
[115516.380763] [] ? bit_wait_timeout+0x90/0x90
[115516.380782] [] ? io_schedule_timeout+0xb4/0x130
[115516.380787] [] ? prepare_to_wait+0x57/0x80
[115516.380800] [] ? bit_wait_io+0x17/0x60
[115516.380810] [] ? __wait_on_bit+0x5e/0x90
[115516.380821] [] ? bit_wait_timeout+0x90/0x90
[115516.380839] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115516.380849] [] ? autoremove_wake_function+0x40/0x40
[115516.380878] [] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[115516.380888] [] ? __switch_to+0x2c9/0x730
[115516.380899] [] ? try_to_del_timer_sync+0x4d/0x80
[115516.380918] [] ? kjournald2+0xdd/0x280 [jbd2]
[115516.380934] [] ? wake_up_atomic_t+0x30/0x30
[115516.380956] [] ? commit_timeout+0x10/0x10 [jbd2]
[115516.380962] [] ? do_syscall_64+0x91/0x1a0
[115516.380973] [] ? SyS_exit_group+0x10/0x10
[115516.380992] [] ? kthread+0xf2/0x110
[115516.380995] [] ? __switch_to+0x2c9/0x730
[115516.381007] [] ? kthread_park+0x60/0x60
[115516.381024] [] ? ret_from_fork+0x57/0x70
[115516.381033] INFO: task rs:main Q:Reg:529 blocked for more than 120
seconds.
[115516.381049] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115516.381068] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115516.381086] rs:main Q:Reg D 0 529 1 0x00000000
[115516.381106] ffff8800d406bc00 ffff8800d406b800 ffff880004cc0140
ffff880004cc0e80
[115516.381125] ffff8800d6218ec0 ffffc90040ac7a00 ffffffff8160e973
ffff8800d5036080
[115516.381149] 0000000000000050 0000000000000498 0000000000000045
ffff880004cc0e80
[115516.381177] Call Trace:
[115516.381186] [] ? __schedule+0x243/0x6f0
[115516.381201] [] ? schedule+0x32/0x80
[115516.381208] [] ? schedule_timeout+0x1df/0x380
[115516.381221] [] ? xen_clocksource_get_cycles+0x11/0x20
[115516.381240] [] ? bit_wait_timeout+0x90/0x90
[115516.381253] [] ? io_schedule_timeout+0xb4/0x130
[115516.381273] [] ? prepare_to_wait+0x57/0x80
[115516.381282] [] ? bit_wait_io+0x17/0x60
[115516.381295] [] ? __wait_on_bit+0x5e/0x90
[115516.381310] [] ? __switch_to_asm+0x34/0x70
[115516.381318] [] ? bit_wait_timeout+0x90/0x90
[115516.381329] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115516.381347] [] ? autoremove_wake_function+0x40/0x40
[115516.381365] [] ? do_get_write_access+0x208/0x420 [jbd2]
[115516.381404] [] ? ext4_dirty_inode+0x43/0x60 [ext4]
[115516.381421] [] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
[115516.381451] [] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
[115516.381487] [] ? ext4_reserve_inode_write+0x5d/0x80 [ext4]
[115516.381514] [] ? ext4_mark_inode_dirty+0x4f/0x210 [ext4]
[115516.381544] [] ? ext4_dirty_inode+0x43/0x60 [ext4]
[115516.381563] [] ? __mark_inode_dirty+0x17e/0x380
[115516.381578] [] ? generic_update_time+0x79/0xd0
[115516.381592] [] ? current_time+0x36/0x70
[115516.381607] [] ? file_update_time+0xbf/0x110
[115516.381622] [] ? __generic_file_write_iter+0x99/0x1e0
[115516.381641] [] ? ext4_file_write_iter+0xfb/0x3b0 [ext4]
[115516.381654] [] ? error_exit+0x9/0x20
[115516.381665] [] ? new_sync_write+0xe4/0x140
[115516.381676] [] ? vfs_write+0xb3/0x1a0
[115516.381686] [] ? SyS_write+0x52/0xc0
[115516.381696] [] ? do_syscall_64+0x91/0x1a0
[115516.381708] [] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[115516.381730] INFO: task kworker/u8:1:2695 blocked for more than 120
seconds.
[115516.381743] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115516.381757] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115516.381770] kworker/u8:1 D 0 2695 2 0x00000000
[115516.381785] Workqueue: writeback wb_workfn (flush-202:5)
[115516.381797] ffff8800051e4000 0000000000000000 ffffffff81c11540
ffff8800d37ae000
[115516.381818] ffff8800d6218ec0 ffffc90040c5b5d0 ffffffff8160e973
0000000000000000
[115516.381839] ffff8800051e4000 0000000081c11540 ffffffff810257c9
ffff8800d37ae000
[115516.381866] Call Trace:
[115516.381878] [] ? __schedule+0x243/0x6f0
[115516.381893] [] ? __switch_to+0x2c9/0x730
[115516.381909] [] ? schedule+0x32/0x80
[115516.381919] [] ? schedule_timeout+0x1df/0x380
[115516.381933] [] ? __radix_tree_lookup+0x76/0xe0
[115516.381944] [] ? bit_wait_timeout+0x90/0x90
[115516.381955] [] ? io_schedule_timeout+0xb4/0x130
[115516.581030] [] ? prepare_to_wait+0x57/0x80
[115516.581054] [] ? bit_wait_io+0x17/0x60
[115516.581071] [] ? __wait_on_bit+0x5e/0x90
[115516.581086] [] ? bit_wait_timeout+0x90/0x90
[115516.581102] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115516.581119] [] ? autoremove_wake_function+0x40/0x40
[115516.581148] [] ? do_get_write_access+0x208/0x420 [jbd2]
[115516.581167] [] ? kmem_cache_alloc+0xb9/0x200
[115516.581328] [] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
[115516.581390] [] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
[115516.581446] [] ? ext4_mb_mark_diskspace_used+0xeb/0x4a0 [ext4]
[115516.581495] [] ? ext4_mb_use_preallocated.constprop.31+0x22d/0x240 [ext4]
[115516.581547] [] ? ext4_mb_new_blocks+0x337/0xaf0 [ext4]
[115516.581590] [] ? ext4_find_extent+0x136/0x2f0 [ext4]
[115516.581634] [] ? ext4_ext_map_blocks+0x55f/0xdc0 [ext4]
[115516.581678] [] ? ext4_map_blocks+0x117/0x670 [ext4]
[115516.581718] [] ? ext4_writepages+0x742/0xd70 [ext4]
[115516.581747] [] ? notify_remote_via_irq+0x4a/0x70
[115516.581774] [] ? _raw_spin_unlock_irqrestore+0x16/0x20
[115516.581799] [] ? blkif_queue_rq+0x55f/0x6b0 [xen_blkfront]
[115516.581835] [] ? __writeback_single_inode+0x3d/0x340
[115516.581860] [] ? fprop_reflect_period_percpu.isra.5+0x77/0xb0
[115516.581891] [] ? writeback_sb_inodes+0x23d/0x470
[115516.581916] [] ? __writeback_inodes_wb+0x87/0xb0
[115516.581940] [] ? wb_writeback+0x288/0x320
[115516.581964] [] ? get_nr_inodes+0x3c/0x60
[115516.581987] [] ? wb_workfn+0x2c6/0x3a0
[115516.582010] [] ? process_one_work+0x151/0x410
[115516.582033] [] ? worker_thread+0x65/0x4a0
[115516.582056] [] ? rescuer_thread+0x340/0x340
[115516.582081] [] ? do_syscall_64+0x91/0x1a0
[115516.582106] [] ? SyS_exit_group+0x10/0x10
[115516.582128] [] ? kthread+0xf2/0x110
[115516.582150] [] ? __switch_to+0x2c9/0x730
[115516.582174] [] ? kthread_park+0x60/0x60
[115516.582197] [] ? ret_from_fork+0x57/0x70
[115516.582224] INFO: task sort:4008 blocked for more than 120 seconds.
[115516.582249] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115516.582275] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115516.582305] sort D 0 4008 3999 0x00000000
[115516.582331] ffff8800d1a19c00 0000000000000000 ffff8800d3afa3c0
ffff8800d321c280
[115516.582377] ffff8800d6318ec0 ffffc900416cb9b0 ffffffff8160e973
ffff8800d51191c0
[115516.582422] 0000000000000000 00000000816134a6 a86eec8ef88d3ca6
ffff8800d321c280
[115516.582506] Call Trace:
[115516.582525] [] ? __schedule+0x243/0x6f0
[115516.582543] [] ? schedule+0x32/0x80
[115516.582561] [] ? schedule_timeout+0x1df/0x380
[115516.582581] [] ? bh_lru_install+0x160/0x1d0
[115516.582600] [] ? bit_wait_timeout+0x90/0x90
[115516.582619] [] ? io_schedule_timeout+0xb4/0x130
[115516.582639] [] ? prepare_to_wait+0x57/0x80
[115516.582657] [] ? bit_wait_io+0x17/0x60
[115516.582674] [] ? __wait_on_bit+0x5e/0x90
[115516.582692] [] ? bit_wait_timeout+0x90/0x90
[115516.582710] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115516.582731] [] ? autoremove_wake_function+0x40/0x40
[115516.582755] [] ? do_get_write_access+0x208/0x420 [jbd2]
[115516.582774] [] ? inode_init_always+0x136/0x1f0
[115516.582796] [] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
[115516.582840] [] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
[115516.582877] [] ? __ext4_new_inode+0x570/0x1450 [ext4]
[115516.582911] [] ? ext4_create+0x115/0x1b0 [ext4]
[115516.582931] [] ? path_openat+0x14b8/0x15b0
[115516.582950] [] ? do_filp_open+0x91/0x100
[115516.582972] [] ? __check_object_size+0x10b/0x1dc
[115516.582997] [] ? do_sys_open+0x127/0x210
[115516.583021] [] ? do_syscall_64+0x91/0x1a0
[115516.583045] [] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[115637.212130] INFO: task jbd2/xvda2-8:157 blocked for more than 120
seconds.
[115637.212159] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115637.212177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115637.212196] jbd2/xvda2-8 D 0 157 2 0x00000000
[115637.212216] ffff8800051e4000 0000000000000000 ffff8800cb21b0c0
ffff8800d16a8140
[115637.212245] ffff8800d6298ec0 ffffc9004097fb20 ffffffff8160e973
ffffc9004097fda0
[115637.212271] ffff88002f6dd100 000000004097fbe8 ffffffff81303fcf
ffff8800d16a8140
[115637.212297] Call Trace:
[115637.212321] [] ? __schedule+0x243/0x6f0
[115637.212337] [] ? blk_attempt_plug_merge+0xcf/0xe0
[115637.212351] [] ? schedule+0x32/0x80
[115637.212365] [] ? schedule_timeout+0x1df/0x380
[115637.212383] [] ? xen_clocksource_get_cycles+0x11/0x20
[115637.212398] [] ? bit_wait_timeout+0x90/0x90
[115637.212431] [] ? io_schedule_timeout+0xb4/0x130
[115637.212451] [] ? prepare_to_wait+0x57/0x80
[115637.212466] [] ? bit_wait_io+0x17/0x60
[115637.212479] [] ? __wait_on_bit+0x5e/0x90
[115637.212493] [] ? bit_wait_timeout+0x90/0x90
[115637.212507] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115637.212530] [] ? autoremove_wake_function+0x40/0x40
[115637.212552] [] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[115637.212578] [] ? __switch_to+0x2c9/0x730
[115637.212591] [] ? try_to_del_timer_sync+0x4d/0x80
[115637.212611] [] ? kjournald2+0xdd/0x280 [jbd2]
[115637.212628] [] ? wake_up_atomic_t+0x30/0x30
[115637.212647] [] ? commit_timeout+0x10/0x10 [jbd2]
[115637.212663] [] ? do_syscall_64+0x91/0x1a0
[115637.212685] [] ? SyS_exit_group+0x10/0x10
[115637.212698] [] ? kthread+0xf2/0x110
[115637.212714] [] ? __switch_to+0x2c9/0x730
[115637.212727] [] ? kthread_park+0x60/0x60
[115637.212749] [] ? ret_from_fork+0x57/0x70
[115637.212760] INFO: task jbd2/xvda4-8:174 blocked for more than 120
seconds.
[115637.212773] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115637.212792] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115637.212808] jbd2/xvda4-8 D 0 174 2 0x00000000
[115637.212828] ffff8800d16b8800 0000000000000000 ffff8800d5bf61c0
ffff880004eb1000
[115637.212860] ffff8800d6318ec0 ffffc90040a23b20 ffffffff8160e973
0000000000000000
[115637.212882] 0000000000000000 0000000000000001 0000000000000000
ffff880004eb1000
[115637.212910] Call Trace:
[115637.212922] [] ? __schedule+0x243/0x6f0
[115637.212940] [] ? schedule+0x32/0x80
[115637.212949] [] ? schedule_timeout+0x1df/0x380
[115637.212973] [] ? __blk_mq_run_hw_queue+0x32d/0x3f0
[115637.212984] [] ? xen_clocksource_get_cycles+0x11/0x20
[115637.212998] [] ? bit_wait_timeout+0x90/0x90
[115637.213018] [] ? io_schedule_timeout+0xb4/0x130
[115637.213028] [] ? prepare_to_wait+0x57/0x80
[115637.213042] [] ? bit_wait_io+0x17/0x60
[115637.213055] [] ? __wait_on_bit+0x5e/0x90
[115637.213070] [] ? bit_wait_timeout+0x90/0x90
[115637.213087] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115637.213102] [] ? autoremove_wake_function+0x40/0x40
[115637.213126] [] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[115637.213146] [] ? __switch_to+0x2c9/0x730
[115637.213166] [] ? try_to_del_timer_sync+0x4d/0x80
[115637.213180] [] ? kjournald2+0xdd/0x280 [jbd2]
[115637.213200] [] ? wake_up_atomic_t+0x30/0x30
[115637.213215] [] ? commit_timeout+0x10/0x10 [jbd2]
[115637.213237] [] ? do_syscall_64+0x91/0x1a0
[115637.213247] [] ? SyS_exit_group+0x10/0x10
[115637.213262] [] ? kthread+0xf2/0x110
[115637.213279] [] ? __switch_to+0x2c9/0x730
[115637.213290] [] ? kthread_park+0x60/0x60
[115637.213303] [] ? ret_from_fork+0x57/0x70
[115637.213321] INFO: task jbd2/xvda6-8:289 blocked for more than 120
seconds.
[115637.213335] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115637.213356] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115637.213368] jbd2/xvda6-8 D 0 289 2 0x00000000
[115637.213389] ffff8800051e4000 0000000000000000 ffff8800d5bf6f00
ffff8800d17d3140
[115637.213413] ffff8800d6398ec0 ffffc9004099fb20 ffffffff8160e973
ffffc9004099fda0
[115637.213442] ffff880003642d00 000000004099fbe8 ffffffff81303fcf
ffff8800d17d3140
[115637.213471] Call Trace:
[115637.213482] [] ? __schedule+0x243/0x6f0
[115637.213503] [] ? blk_attempt_plug_merge+0xcf/0xe0
[115637.213511] [] ? schedule+0x32/0x80
[115637.213525] [] ? schedule_timeout+0x1df/0x380
[115637.213539] [] ? xen_clocksource_get_cycles+0x11/0x20
[115637.213554] [] ? bit_wait_timeout+0x90/0x90
[115637.213571] [] ? io_schedule_timeout+0xb4/0x130
[115637.213588] [] ? prepare_to_wait+0x57/0x80
[115637.213602] [] ? bit_wait_io+0x17/0x60
[115637.213615] [] ? __wait_on_bit+0x5e/0x90
[115637.213629] [] ? bit_wait_timeout+0x90/0x90
[115637.213643] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115637.213658] [] ? autoremove_wake_function+0x40/0x40
[115637.213677] [] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[115637.213698] [] ? __switch_to+0x2c9/0x730
[115637.213713] [] ? try_to_del_timer_sync+0x4d/0x80
[115637.213732] [] ? kjournald2+0xdd/0x280 [jbd2]
[115637.213748] [] ? wake_up_atomic_t+0x30/0x30
[115637.213765] [] ? commit_timeout+0x10/0x10 [jbd2]
[115637.213781] [] ? kthread+0xf2/0x110
[115637.213794] [] ? __switch_to+0x2c9/0x730
[115637.213810] [] ? kthread_park+0x60/0x60
[115637.213824] [] ? ret_from_fork+0x57/0x70
[115637.213839] INFO: task jbd2/xvda5-8:300 blocked for more than 120
seconds.
[115637.213853] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115637.213868] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115637.213887] jbd2/xvda5-8 D 0 300 2 0x00000000
[115637.213903] ffff8800051e4000 0000000000000000 ffff8800d5bf6f00
ffff8800d2820d80
[115637.213930] ffff8800d6398ec0 ffffc90040a1bb20 ffffffff8160e973
ffffc90040a1bda0
[115637.213958] ffff880062dd6600 0000000040a1bbe8 ffff88005ee997d8
ffff8800d2820d80
[115637.413265] Call Trace:
[115637.413301] [] ? __schedule+0x243/0x6f0
[115637.413329] [] ? schedule+0x32/0x80
[115637.413359] [] ? schedule_timeout+0x1df/0x380
[115637.413396] [] ? xen_clocksource_get_cycles+0x11/0x20
[115637.413426] [] ? bit_wait_timeout+0x90/0x90
[115637.413453] [] ? io_schedule_timeout+0xb4/0x130
[115637.413484] [] ? prepare_to_wait+0x57/0x80
[115637.413511] [] ? bit_wait_io+0x17/0x60
[115637.413536] [] ? __wait_on_bit+0x5e/0x90
[115637.413562] [] ? bit_wait_timeout+0x90/0x90
[115637.413588] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115637.413617] [] ? autoremove_wake_function+0x40/0x40
[115637.413657] [] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[115637.413698] [] ? __switch_to+0x2c9/0x730
[115637.413728] [] ? try_to_del_timer_sync+0x4d/0x80
[115637.413763] [] ? kjournald2+0xdd/0x280 [jbd2]
[115637.413793] [] ? wake_up_atomic_t+0x30/0x30
[115637.413829] [] ? commit_timeout+0x10/0x10 [jbd2]
[115637.413858] [] ? do_syscall_64+0x91/0x1a0
[115637.413887] [] ? SyS_exit_group+0x10/0x10
[115637.413915] [] ? kthread+0xf2/0x110
[115637.413939] [] ? __switch_to+0x2c9/0x730
[115637.413967] [] ? kthread_park+0x60/0x60
[115637.413998] [] ? ret_from_fork+0x57/0x70
[115637.414032] INFO: task rs:main Q:Reg:529 blocked for more than 120
seconds.
[115637.414060] Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian
4.9.88-1+deb9u1~bpo8+1
[115637.414092] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[115637.414125] rs:main Q:Reg D 0 529 1 0x00000000
[115637.414153] ffff8800d406bc00 ffff8800d406b800 ffff880004cc0140
ffff880004cc0e80
[115637.414204] ffff8800d6218ec0 ffffc90040ac7a00 ffffffff8160e973
ffff8800d5036080
[115637.414258] 0000000000000050 0000000000000498 0000000000000045
ffff880004cc0e80
[115637.414308] Call Trace:
[115637.414328] [] ? __schedule+0x243/0x6f0
[115637.414352] [] ? schedule+0x32/0x80
[115637.414376] [] ? schedule_timeout+0x1df/0x380
[115637.414404] [] ? xen_clocksource_get_cycles+0x11/0x20
[115637.414431] [] ? bit_wait_timeout+0x90/0x90
[115637.414460] [] ? io_schedule_timeout+0xb4/0x130
[115637.414488] [] ? prepare_to_wait+0x57/0x80
[115637.414514] [] ? bit_wait_io+0x17/0x60
[115637.414538] [] ? __wait_on_bit+0x5e/0x90
[115637.414566] [] ? __switch_to_asm+0x34/0x70
[115637.414593] [] ? bit_wait_timeout+0x90/0x90
[115637.414620] [] ? out_of_line_wait_on_bit+0x7e/0xa0
[115637.414647] [] ? autoremove_wake_function+0x40/0x40
[115637.414680] [] ? do_get_write_access+0x208/0x420 [jbd2]
[115637.414737] [] ? ext4_dirty_inode+0x43/0x60 [ext4]
[115637.414772] [] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
[115637.414831] [] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
[115637.414891] [] ? ext4_reserve_inode_write+0x5d/0x80 [ext4]
[115637.414949] [] ? ext4_mark_inode_dirty+0x4f/0x210 [ext4]
[115637.415001] [] ? ext4_dirty_inode+0x43/0x60 [ext4]
[115637.415035] [] ? __mark_inode_dirty+0x17e/0x380
[115637.415070] [] ? generic_update_time+0x79/0xd0
[115637.415101] [] ? current_time+0x36/0x70
[115637.415129] [] ? file_update_time+0xbf/0x110
[115637.415159] [] ? __generic_file_write_iter+0x99/0x1e0
[115637.415207] [] ? ext4_file_write_iter+0xfb/0x3b0 [ext4]
[115637.415239] [] ? error_exit+0x9/0x20
[115637.415263] [] ? new_sync_write+0xe4/0x140
[115637.612733] [] ? vfs_write+0xb3/0x1a0
[115637.612749] [] ? SyS_write+0x52/0xc0
[115637.612766] [] ? do_syscall_64+0x91/0x1a0
[115637.612784] [] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6



dionysos login: [69437.780459] device-mapper: uevent: version 1.0.3
[69437.780663] device-mapper: ioctl: 4.35.0-ioctl (2016-06-23) initialised: dm-devel@redhat.com
[75641.820105] INFO: task jbd2/xvda4-8:174 blocked for more than 120 seconds.
[75641.820137]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75641.820148] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75641.820163] jbd2/xvda4-8    D    0   174      2 0x00000000
[75641.820186]  ffff8800cba3d000 0000000000000000 ffffffff81c11540 ffff880005043140
[75641.820207]  ffff8800d6218ec0 ffffc90040a57b20 ffffffff8160e973 ffffc90040a57da0
[75641.820233]  ffff8800cb4fa000 0000000040a57be8 ffffffff81303fcf ffff880005043140
[75641.820260] Call Trace:
[75641.820276]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75641.820291]  [<ffffffff81303fcf>] ? blk_attempt_plug_merge+0xcf/0xe0
[75641.820316]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75641.820330]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75641.820338]  [<ffffffff8101bce1>] ? xen_clocksource_get_cycles+0x11/0x20
[75641.820354]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75641.820373]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75641.820402]  [<ffffffff810bd0d7>] ? prepare_to_wait+0x57/0x80
[75641.820417]  [<ffffffff8160f6f7>] ? bit_wait_io+0x17/0x60
[75641.820431]  [<ffffffff8160f1de>] ? __wait_on_bit+0x5e/0x90
[75641.820450]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75641.820474]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75641.820486]  [<ffffffff810bd400>] ? autoremove_wake_function+0x40/0x40
[75641.820516]  [<ffffffffc007dd5e>] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[75641.820544]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75641.820564]  [<ffffffff810e858d>] ? try_to_del_timer_sync+0x4d/0x80
[75641.820599]  [<ffffffffc00829fd>] ? kjournald2+0xdd/0x280 [jbd2]
[75641.820609]  [<ffffffff810bd3c0>] ? wake_up_atomic_t+0x30/0x30
[75641.820631]  [<ffffffffc0082920>] ? commit_timeout+0x10/0x10 [jbd2]
[75641.820652]  [<ffffffff810994d2>] ? kthread+0xf2/0x110
[75641.820680]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75641.820688]  [<ffffffff810993e0>] ? kthread_park+0x60/0x60
[75641.820706]  [<ffffffff81613977>] ? ret_from_fork+0x57/0x70
[75641.820725] INFO: task jbd2/xvda5-8:299 blocked for more than 120 seconds.
[75641.820753]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75641.820766] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75641.820797] jbd2/xvda5-8    D    0   299      2 0x00000000
[75641.820807]  ffff8800cba3d000 0000000000000000 ffffffff81c11540 ffff880004c28e80
[75641.820843]  ffff8800d6218ec0 ffffc900409b7b20 ffffffff8160e973 ffff880002374000
[75641.820877]  0000000602395200 00000000cc3a7200 ffff8800cc3a7200 ffff880004c28e80
[75641.820923] Call Trace:
[75641.820925]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75641.820954]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75641.820959]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75641.820989]  [<ffffffff816138e4>] ? __switch_to_asm+0x34/0x70
[75641.820997]  [<ffffffff810161e4>] ? xen_mc_flush+0x184/0x1c0
[75641.821021]  [<ffffffff8101bce1>] ? xen_clocksource_get_cycles+0x11/0x20
[75641.821036]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75641.821055]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75641.821084]  [<ffffffff810bd0d7>] ? prepare_to_wait+0x57/0x80
[75641.821092]  [<ffffffff8160f6f7>] ? bit_wait_io+0x17/0x60
[75641.821115]  [<ffffffff8160f1de>] ? __wait_on_bit+0x5e/0x90
[75641.821126]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75641.821146]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75641.821164]  [<ffffffff810bd400>] ? autoremove_wake_function+0x40/0x40
[75641.821188]  [<ffffffffc007ded8>] ? jbd2_journal_commit_transaction+0xec8/0x1800 [jbd2]
[75641.821216]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75641.821244]  [<ffffffff810e858d>] ? try_to_del_timer_sync+0x4d/0x80
[75641.821257]  [<ffffffffc00829fd>] ? kjournald2+0xdd/0x280 [jbd2]
[75641.821280]  [<ffffffff810bd3c0>] ? wake_up_atomic_t+0x30/0x30
[75641.821312]  [<ffffffffc0082920>] ? commit_timeout+0x10/0x10 [jbd2]
[75641.821320]  [<ffffffff810994d2>] ? kthread+0xf2/0x110
[75641.821336]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75641.821354]  [<ffffffff810993e0>] ? kthread_park+0x60/0x60
[75641.821371]  [<ffffffff81613977>] ? ret_from_fork+0x57/0x70
[75641.821400] INFO: task kworker/u8:2:24364 blocked for more than 120 seconds.
[75641.821419]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75641.821441] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75641.821463] kworker/u8:2    D    0 24364      2 0x00000000
[75641.821489] Workqueue: writeback wb_workfn (flush-202:5)
[75641.821509]  ffff8800cba3d000 0000000000000000 ffffffff81c11540 ffff880004ea8200
[75641.821543]  ffff8800d6218ec0 ffffc90040de7660 ffffffff8160e973 ffffc90040de7648
[75641.821576]  0000000000000001 0000000002395200 ffff880002386e80 ffff880004ea8200
[75641.821609] Call Trace:
[75641.821622]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75641.821639]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75641.821656]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75641.821674]  [<ffffffff813047e0>] ? blk_flush_plug_list+0xc0/0x230
[75641.821693]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75641.821713]  [<ffffffff8135ff14>] ? __sbitmap_queue_get+0x24/0x90
[75641.821732]  [<ffffffff8130fc89>] ? bt_get.isra.6+0x129/0x1c0
[75641.821751]  [<ffffffff810bd3c0>] ? wake_up_atomic_t+0x30/0x30
[75641.821769]  [<ffffffff8130ffd3>] ? blk_mq_get_tag+0x23/0x90
[75641.821788]  [<ffffffff8130b90a>] ? __blk_mq_alloc_request+0x1a/0x220
[75641.821807]  [<ffffffff8130c76d>] ? blk_mq_map_request+0xcd/0x170
[75641.821826]  [<ffffffff8130f029>] ? blk_mq_make_request+0xc9/0x560
[75641.821847]  [<ffffffff811e5cb9>] ? kmem_cache_alloc+0x99/0x200
[75641.821867]  [<ffffffff81302e66>] ? generic_make_request+0x126/0x2d0
[75641.821886]  [<ffffffff81303086>] ? submit_bio+0x76/0x150
[75641.821904]  [<ffffffff812401a7>] ? submit_bh_wbc+0x157/0x1d0
[75642.020761]  [<ffffffff8123ef00>] ? bh_uptodate_or_lock+0x70/0x70
[75642.020782]  [<ffffffff81240341>] ? __block_write_full_page+0x121/0x3f0
[75642.020798]  [<ffffffff81242bc0>] ? I_BDEV+0x10/0x10
[75642.020824]  [<ffffffff8118db05>] ? __writepage+0x15/0x30
[75642.020841]  [<ffffffff8119006a>] ? write_cache_pages+0x20a/0x480
[75642.020860]  [<ffffffff8118daf0>] ? wb_position_ratio+0x1e0/0x1e0
[75642.020880]  [<ffffffff81190331>] ? generic_writepages+0x51/0x80
[75642.020897]  [<ffffffff8118dc8a>] ? __wb_calc_thresh+0x3a/0x150
[75642.020920]  [<ffffffff81236acd>] ? __writeback_single_inode+0x3d/0x340
[75642.020941]  [<ffffffff812372ad>] ? writeback_sb_inodes+0x23d/0x470
[75642.020960]  [<ffffffff81237567>] ? __writeback_inodes_wb+0x87/0xb0
[75642.020974]  [<ffffffff812378e8>] ? wb_writeback+0x288/0x320
[75642.020988]  [<ffffffff8122324c>] ? get_nr_inodes+0x3c/0x60
[75642.021002]  [<ffffffff81238286>] ? wb_workfn+0x2c6/0x3a0
[75642.021020]  [<ffffffff810930f1>] ? process_one_work+0x151/0x410
[75642.021034]  [<ffffffff810941a5>] ? worker_thread+0x65/0x4a0
[75642.021060]  [<ffffffff81094140>] ? rescuer_thread+0x340/0x340
[75642.021073]  [<ffffffff81003bd1>] ? do_syscall_64+0x91/0x1a0
[75642.021090]  [<ffffffff8107e490>] ? SyS_exit_group+0x10/0x10
[75642.021103]  [<ffffffff810994d2>] ? kthread+0xf2/0x110
[75642.021122]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75642.021137]  [<ffffffff810993e0>] ? kthread_park+0x60/0x60
[75642.021150]  [<ffffffff81613977>] ? ret_from_fork+0x57/0x70
[75642.021167] INFO: task find:24583 blocked for more than 120 seconds.
[75642.021188]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75642.021197] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75642.021227] find            D    0 24583  24582 0x00000000
[75642.021233]  ffff8800ccfa4c00 0000000000000000 ffffffff81c11540 ffff8800cd37e4c0
[75642.021261]  ffff8800d6218ec0 ffffc9004098fb20 ffffffff8160e973 ffffffff816134a6
[75642.021293]  ffffffff8160e6f4 00000000ffffffff 0000000000000002 ffff8800cd37e4c0
[75642.021324] Call Trace:
[75642.021336]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75642.021351]  [<ffffffff816134a6>] ? _raw_spin_unlock_irqrestore+0x16/0x20
[75642.021358]  [<ffffffff8160e6f4>] ? io_schedule_timeout+0xf4/0x130
[75642.021383]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75642.021391]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75642.021410]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75642.021421]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75642.021436]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75642.021459]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75642.021471]  [<ffffffff810bd0d7>] ? prepare_to_wait+0x57/0x80
[75642.021491]  [<ffffffff8160f6f7>] ? bit_wait_io+0x17/0x60
[75642.021514]  [<ffffffff8160f1de>] ? __wait_on_bit+0x5e/0x90
[75642.021523]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75642.021535]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75642.021557]  [<ffffffff810bd400>] ? autoremove_wake_function+0x40/0x40
[75642.021577]  [<ffffffffc007bbf8>] ? do_get_write_access+0x208/0x420 [jbd2]
[75642.021637]  [<ffffffffc00b5fe3>] ? ext4_dirty_inode+0x43/0x60 [ext4]
[75642.021653]  [<ffffffffc007be3e>] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
[75642.021687]  [<ffffffffc00e2d36>] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
[75642.021718]  [<ffffffffc00b1c8d>] ? ext4_reserve_inode_write+0x5d/0x80 [ext4]
[75642.021750]  [<ffffffffc00b1cff>] ? ext4_mark_inode_dirty+0x4f/0x210 [ext4]
[75642.021773]  [<ffffffffc00b5fe3>] ? ext4_dirty_inode+0x43/0x60 [ext4]
[75642.021791]  [<ffffffff8123688e>] ? __mark_inode_dirty+0x17e/0x380
[75642.021811]  [<ffffffff812241e9>] ? generic_update_time+0x79/0xd0
[75642.021825]  [<ffffffff81223a16>] ? current_time+0x36/0x70
[75642.021835]  [<ffffffff81225dcc>] ? touch_atime+0xac/0xd0
[75642.021853]  [<ffffffff8121c60b>] ? iterate_dir+0x15b/0x190
[75642.021867]  [<ffffffff8121cb09>] ? SyS_getdents+0x99/0x120
[75642.021880]  [<ffffffff8121c640>] ? iterate_dir+0x190/0x190
[75642.021894]  [<ffffffff81003bd1>] ? do_syscall_64+0x91/0x1a0
[75642.021900]  [<ffffffff816137ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[75762.652164] INFO: task jbd2/xvda4-8:174 blocked for more than 120 seconds.
[75762.652203]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75762.652232] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75762.652261] jbd2/xvda4-8    D    0   174      2 0x00000000
[75762.652286]  ffff8800cba3d000 0000000000000000 ffffffff81c11540 ffff880005043140
[75762.652333]  ffff8800d6218ec0 ffffc90040a57b20 ffffffff8160e973 ffffc90040a57da0
[75762.652382]  ffff8800cb4fa000 0000000040a57be8 ffffffff81303fcf ffff880005043140
[75762.652417] Call Trace:
[75762.652440]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75762.652459]  [<ffffffff81303fcf>] ? blk_attempt_plug_merge+0xcf/0xe0
[75762.652477]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75762.652495]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75762.652516]  [<ffffffff8101bce1>] ? xen_clocksource_get_cycles+0x11/0x20
[75762.652535]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75762.652553]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75762.652573]  [<ffffffff810bd0d7>] ? prepare_to_wait+0x57/0x80
[75762.652591]  [<ffffffff8160f6f7>] ? bit_wait_io+0x17/0x60
[75762.652608]  [<ffffffff8160f1de>] ? __wait_on_bit+0x5e/0x90
[75762.652625]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75762.652643]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75762.652662]  [<ffffffff810bd400>] ? autoremove_wake_function+0x40/0x40
[75762.652690]  [<ffffffffc007dd5e>] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[75762.652718]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75762.652737]  [<ffffffff810e858d>] ? try_to_del_timer_sync+0x4d/0x80
[75762.652760]  [<ffffffffc00829fd>] ? kjournald2+0xdd/0x280 [jbd2]
[75762.652779]  [<ffffffff810bd3c0>] ? wake_up_atomic_t+0x30/0x30
[75762.652802]  [<ffffffffc0082920>] ? commit_timeout+0x10/0x10 [jbd2]
[75762.652824]  [<ffffffff810994d2>] ? kthread+0xf2/0x110
[75762.652841]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75762.652858]  [<ffffffff810993e0>] ? kthread_park+0x60/0x60
[75762.652876]  [<ffffffff81613977>] ? ret_from_fork+0x57/0x70
[75762.652895] INFO: task jbd2/xvda5-8:299 blocked for more than 120 seconds.
[75762.652913]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75762.652936] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75762.652958] jbd2/xvda5-8    D    0   299      2 0x00000000
[75762.652976]  ffff8800cba3d000 0000000000000000 ffffffff81c11540 ffff880004c28e80
[75762.653011]  ffff8800d6218ec0 ffffc900409b7b20 ffffffff8160e973 ffff880002374000
[75762.653044]  0000000602395200 00000000cc3a7200 ffff8800cc3a7200 ffff880004c28e80
[75762.653078] Call Trace:
[75762.653091]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75762.653108]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75762.653125]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75762.653143]  [<ffffffff816138e4>] ? __switch_to_asm+0x34/0x70
[75762.653163]  [<ffffffff810161e4>] ? xen_mc_flush+0x184/0x1c0
[75762.653182]  [<ffffffff8101bce1>] ? xen_clocksource_get_cycles+0x11/0x20
[75762.653201]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75762.653220]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75762.653239]  [<ffffffff810bd0d7>] ? prepare_to_wait+0x57/0x80
[75762.653257]  [<ffffffff8160f6f7>] ? bit_wait_io+0x17/0x60
[75762.653274]  [<ffffffff8160f1de>] ? __wait_on_bit+0x5e/0x90
[75762.653291]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75762.653309]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75762.653328]  [<ffffffff810bd400>] ? autoremove_wake_function+0x40/0x40
[75762.653351]  [<ffffffffc007ded8>] ? jbd2_journal_commit_transaction+0xec8/0x1800 [jbd2]
[75762.653376]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75762.653394]  [<ffffffff810e858d>] ? try_to_del_timer_sync+0x4d/0x80
[75762.653416]  [<ffffffffc00829fd>] ? kjournald2+0xdd/0x280 [jbd2]
[75762.653436]  [<ffffffff810bd3c0>] ? wake_up_atomic_t+0x30/0x30
[75762.653458]  [<ffffffffc0082920>] ? commit_timeout+0x10/0x10 [jbd2]
[75762.653477]  [<ffffffff810994d2>] ? kthread+0xf2/0x110
[75762.653495]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75762.653512]  [<ffffffff810993e0>] ? kthread_park+0x60/0x60
[75762.653529]  [<ffffffff81613977>] ? ret_from_fork+0x57/0x70
[75762.653558] INFO: task kworker/u8:2:24364 blocked for more than 120 seconds.
[75762.653578]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75762.653600] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75762.653622] kworker/u8:2    D    0 24364      2 0x00000000
[75762.653643] Workqueue: writeback wb_workfn (flush-202:5)
[75762.653663]  ffff8800cba3d000 0000000000000000 ffffffff81c11540 ffff880004ea8200
[75762.653696]  ffff8800d6218ec0 ffffc90040de7660 ffffffff8160e973 ffffc90040de7648
[75762.653730]  0000000000000001 0000000002395200 ffff880002386e80 ffff880004ea8200
[75762.653763] Call Trace:
[75762.653778]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75762.653794]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75762.653811]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75762.653830]  [<ffffffff813047e0>] ? blk_flush_plug_list+0xc0/0x230
[75762.653848]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75762.653868]  [<ffffffff8135ff14>] ? __sbitmap_queue_get+0x24/0x90
[75762.653888]  [<ffffffff8130fc89>] ? bt_get.isra.6+0x129/0x1c0
[75762.653906]  [<ffffffff810bd3c0>] ? wake_up_atomic_t+0x30/0x30
[75762.653924]  [<ffffffff8130ffd3>] ? blk_mq_get_tag+0x23/0x90
[75762.653944]  [<ffffffff8130b90a>] ? __blk_mq_alloc_request+0x1a/0x220
[75762.653963]  [<ffffffff8130c76d>] ? blk_mq_map_request+0xcd/0x170
[75762.653983]  [<ffffffff8130f029>] ? blk_mq_make_request+0xc9/0x560
[75762.654004]  [<ffffffff811e5cb9>] ? kmem_cache_alloc+0x99/0x200
[75762.654024]  [<ffffffff81302e66>] ? generic_make_request+0x126/0x2d0
[75762.654043]  [<ffffffff81303086>] ? submit_bio+0x76/0x150
[75762.654062]  [<ffffffff812401a7>] ? submit_bh_wbc+0x157/0x1d0
[75762.654081]  [<ffffffff8123ef00>] ? bh_uptodate_or_lock+0x70/0x70
[75762.654099]  [<ffffffff81240341>] ? __block_write_full_page+0x121/0x3f0
[75762.654118]  [<ffffffff81242bc0>] ? I_BDEV+0x10/0x10
[75762.654136]  [<ffffffff8118db05>] ? __writepage+0x15/0x30
[75762.654153]  [<ffffffff8119006a>] ? write_cache_pages+0x20a/0x480
[75762.654172]  [<ffffffff8118daf0>] ? wb_position_ratio+0x1e0/0x1e0
[75762.654192]  [<ffffffff81190331>] ? generic_writepages+0x51/0x80
[75762.654211]  [<ffffffff8118dc8a>] ? __wb_calc_thresh+0x3a/0x150
[75762.654230]  [<ffffffff81236acd>] ? __writeback_single_inode+0x3d/0x340
[75762.654249]  [<ffffffff812372ad>] ? writeback_sb_inodes+0x23d/0x470
[75762.654269]  [<ffffffff81237567>] ? __writeback_inodes_wb+0x87/0xb0
[75762.654288]  [<ffffffff812378e8>] ? wb_writeback+0x288/0x320
[75762.654307]  [<ffffffff8122324c>] ? get_nr_inodes+0x3c/0x60
[75762.654323]  [<ffffffff81238286>] ? wb_workfn+0x2c6/0x3a0
[75762.654341]  [<ffffffff810930f1>] ? process_one_work+0x151/0x410
[75762.654359]  [<ffffffff810941a5>] ? worker_thread+0x65/0x4a0
[75762.654378]  [<ffffffff81094140>] ? rescuer_thread+0x340/0x340
[75762.654397]  [<ffffffff81003bd1>] ? do_syscall_64+0x91/0x1a0
[75762.654416]  [<ffffffff8107e490>] ? SyS_exit_group+0x10/0x10
[75762.654434]  [<ffffffff810994d2>] ? kthread+0xf2/0x110
[75762.654451]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75762.654469]  [<ffffffff810993e0>] ? kthread_park+0x60/0x60
[75762.654498]  [<ffffffff81613977>] ? ret_from_fork+0x57/0x70
[75762.654524] INFO: task find:24583 blocked for more than 120 seconds.
[75762.654549]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75762.654576] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75762.654598] find            D    0 24583  24582 0x00000000
[75762.654616]  ffff8800ccfa4c00 0000000000000000 ffffffff81c11540 ffff8800cd37e4c0
[75762.654650]  ffff8800d6218ec0 ffffc9004098fb20 ffffffff8160e973 ffffffff816134a6
[75762.654684]  ffffffff8160e6f4 00000000ffffffff 0000000000000002 ffff8800cd37e4c0
[75762.654717] Call Trace:
[75762.852552]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75762.852593]  [<ffffffff816134a6>] ? _raw_spin_unlock_irqrestore+0x16/0x20
[75762.852623]  [<ffffffff8160e6f4>] ? io_schedule_timeout+0xf4/0x130
[75762.852652]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75762.852680]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75762.852709]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75762.852739]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75762.852768]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75762.852793]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75762.852820]  [<ffffffff810bd0d7>] ? prepare_to_wait+0x57/0x80
[75762.852843]  [<ffffffff8160f6f7>] ? bit_wait_io+0x17/0x60
[75762.852869]  [<ffffffff8160f1de>] ? __wait_on_bit+0x5e/0x90
[75762.852893]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75762.852921]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75762.852949]  [<ffffffff810bd400>] ? autoremove_wake_function+0x40/0x40
[75762.852989]  [<ffffffffc007bbf8>] ? do_get_write_access+0x208/0x420 [jbd2]
[75762.853054]  [<ffffffffc00b5fe3>] ? ext4_dirty_inode+0x43/0x60 [ext4]
[75762.853089]  [<ffffffffc007be3e>] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
[75762.853151]  [<ffffffffc00e2d36>] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
[75762.853210]  [<ffffffffc00b1c8d>] ? ext4_reserve_inode_write+0x5d/0x80 [ext4]
[75762.853262]  [<ffffffffc00b1cff>] ? ext4_mark_inode_dirty+0x4f/0x210 [ext4]
[75762.853310]  [<ffffffffc00b5fe3>] ? ext4_dirty_inode+0x43/0x60 [ext4]
[75762.853340]  [<ffffffff8123688e>] ? __mark_inode_dirty+0x17e/0x380
[75762.853368]  [<ffffffff812241e9>] ? generic_update_time+0x79/0xd0
[75762.853395]  [<ffffffff81223a16>] ? current_time+0x36/0x70
[75762.853422]  [<ffffffff81225dcc>] ? touch_atime+0xac/0xd0
[75762.853448]  [<ffffffff8121c60b>] ? iterate_dir+0x15b/0x190
[75762.853473]  [<ffffffff8121cb09>] ? SyS_getdents+0x99/0x120
[75762.853498]  [<ffffffff8121c640>] ? iterate_dir+0x190/0x190
[75762.853524]  [<ffffffff81003bd1>] ? do_syscall_64+0x91/0x1a0
[75762.853555]  [<ffffffff816137ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[75883.484051] INFO: task jbd2/xvda2-8:157 blocked for more than 120 seconds.
[75883.484069]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75883.484085] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75883.484095] jbd2/xvda2-8    D    0   157      2 0x00000000
[75883.484107]  ffff8800cba3d000 0000000000000000 ffff8800d5bf4ec0 ffff8800cd3562c0
[75883.484129]  ffff8800d6298ec0 ffffc900409cfb20 ffffffff8160e973 ffff8800ce225c00
[75883.484147]  00000012ce2360c0 00000000050baa00 ffff8800050baa00 ffff8800cd3562c0
[75883.484166] Call Trace:
[75883.484179]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75883.484189]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75883.484199]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75883.484211]  [<ffffffff816138e4>] ? __switch_to_asm+0x34/0x70
[75883.484222]  [<ffffffff810161e4>] ? xen_mc_flush+0x184/0x1c0
[75883.484233]  [<ffffffff8101bce1>] ? xen_clocksource_get_cycles+0x11/0x20
[75883.484243]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75883.484253]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75883.484265]  [<ffffffff810bd0d7>] ? prepare_to_wait+0x57/0x80
[75883.484275]  [<ffffffff8160f6f7>] ? bit_wait_io+0x17/0x60
[75883.484285]  [<ffffffff8160f1de>] ? __wait_on_bit+0x5e/0x90
[75883.484293]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75883.484303]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75883.484313]  [<ffffffff810bd400>] ? autoremove_wake_function+0x40/0x40
[75883.484333]  [<ffffffffc007ded8>] ? jbd2_journal_commit_transaction+0xec8/0x1800 [jbd2]
[75883.484355]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75883.484369]  [<ffffffff810e858d>] ? try_to_del_timer_sync+0x4d/0x80
[75883.484388]  [<ffffffffc00829fd>] ? kjournald2+0xdd/0x280 [jbd2]
[75883.484400]  [<ffffffff810bd3c0>] ? wake_up_atomic_t+0x30/0x30
[75883.484413]  [<ffffffffc0082920>] ? commit_timeout+0x10/0x10 [jbd2]
[75883.484425]  [<ffffffff810994d2>] ? kthread+0xf2/0x110
[75883.484435]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75883.484444]  [<ffffffff810993e0>] ? kthread_park+0x60/0x60
[75883.484454]  [<ffffffff81613977>] ? ret_from_fork+0x57/0x70
[75883.484464] INFO: task jbd2/xvda4-8:174 blocked for more than 120 seconds.
[75883.484473]       Not tainted 4.9.0-0.bpo.6-amd64 #1 Debian 4.9.88-1+deb9u1~bpo8+1
[75883.484485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75883.484516] jbd2/xvda4-8    D    0   174      2 0x00000000
[75883.484524]  ffff8800cba3d000 0000000000000000 ffffffff81c11540 ffff880005043140
[75883.484543]  ffff8800d6218ec0 ffffc90040a57b20 ffffffff8160e973 ffffc90040a57da0
[75883.484561]  ffff8800cb4fa000 0000000040a57be8 ffffffff81303fcf ffff880005043140
[75883.484582] Call Trace:
[75883.484588]  [<ffffffff8160e973>] ? __schedule+0x243/0x6f0
[75883.484601]  [<ffffffff81303fcf>] ? blk_attempt_plug_merge+0xcf/0xe0
[75883.484617]  [<ffffffff8160ee52>] ? schedule+0x32/0x80
[75883.484630]  [<ffffffff8161233f>] ? schedule_timeout+0x1df/0x380
[75883.484645]  [<ffffffff8101bce1>] ? xen_clocksource_get_cycles+0x11/0x20
[75883.484658]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75883.484669]  [<ffffffff8160e6b4>] ? io_schedule_timeout+0xb4/0x130
[75883.484679]  [<ffffffff810bd0d7>] ? prepare_to_wait+0x57/0x80
[75883.484689]  [<ffffffff8160f6f7>] ? bit_wait_io+0x17/0x60
[75883.484698]  [<ffffffff8160f1de>] ? __wait_on_bit+0x5e/0x90
[75883.484707]  [<ffffffff8160f6e0>] ? bit_wait_timeout+0x90/0x90
[75883.484718]  [<ffffffff8160f34e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[75883.484728]  [<ffffffff810bd400>] ? autoremove_wake_function+0x40/0x40
[75883.484741]  [<ffffffffc007dd5e>] ? jbd2_journal_commit_transaction+0xd4e/0x1800 [jbd2]
[75883.484755]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75883.484767]  [<ffffffff810e858d>] ? try_to_del_timer_sync+0x4d/0x80
[75883.484778]  [<ffffffffc00829fd>] ? kjournald2+0xdd/0x280 [jbd2]
[75883.484788]  [<ffffffff810bd3c0>] ? wake_up_atomic_t+0x30/0x30
[75883.484801]  [<ffffffffc0082920>] ? commit_timeout+0x10/0x10 [jbd2]
[75883.484813]  [<ffffffff810994d2>] ? kthread+0xf2/0x110
[75883.484821]  [<ffffffff810257c9>] ? __switch_to+0x2c9/0x730
[75883.484830]  [<ffffffff810993e0>] ? kthread_park+0x60/0x60
[75883.484840]  [<ffffffff81613977>] ? ret_from_fork+0x57/0x70
[92471.290183] TCP: request_sock_TCP: Possible SYN flooding on port 6969. Sending cookies.  Check SNMP counters.

------------------------------------------------------------------------

We tried the following, without success:

  * Changing these sysctl settings, as suggested in this thread
    (https://discussions.citrix.com/topic/272708-xen-block-devices-stop-responding-in-guest/),
    applied as in the sketch below:
    vm.swappiness = 0
    vm.overcommit_memory = 1
    vm.dirty_background_ratio = 5
    vm.dirty_ratio = 10
    vm.dirty_expire_centisecs = 1000
  * Disabling irqbalance on the dom0 and the domUs
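
For reference, this is roughly how we applied them (a sketch; the file name
under /etc/sysctl.d/ below is just an example):

    # apply at runtime, as root
    sysctl -w vm.swappiness=0
    sysctl -w vm.overcommit_memory=1
    sysctl -w vm.dirty_background_ratio=5
    sysctl -w vm.dirty_ratio=10
    sysctl -w vm.dirty_expire_centisecs=1000
    # persist the same keys in e.g. /etc/sysctl.d/99-xen-io-tuning.conf,
    # then reload them all with: sysctl --system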

Only rolling back the hypervisor version corrected the problem.
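
(For anyone wanting to reproduce the rollback, something like the following
should work on the dom0; the version string is the known-good one quoted
above:)

    # downgrade the hypervisor package and keep apt from upgrading it again
    apt-get install xen-hypervisor-4.8-amd64=4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10
    apt-mark hold xen-hypervisor-4.8-amd64
    # then reboot the dom0 into the downgraded hypervisor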

Kind regards,
Sacha.








* Re: [admin] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-08 17:00 [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds Sacha
@ 2019-02-08 17:13 ` Samuel Thibault
  2019-02-08 19:18   ` [Pkg-xen-devel] " Hans van Kranenburg
  0 siblings, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2019-02-08 17:13 UTC (permalink / raw)
  To: xen-devel, xen; +Cc: admin

Hello,

Sacha, on Fri, 08 Feb 2019 18:00:22 +0100, wrote:
> On  Debian GNU/Linux 9.7 (stretch) amd64, we have a bug on the last Xen
> Hypervisor version:
> 
>     xen-hypervisor-4.8-amd64 4.8.5+shim4.10.2+xsa282

(Read: 4.8.5+shim4.10.2+xsa282-1+deb9u11)

> The rollback on the previous package version corrected the problem:
> 
>     xen-hypervisor-4.8-amd64 4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10

(Only the hypervisor needed to be downgraded to fix the issue)

> The errors are on the domU a frozen file system until a kernel panic.

(not really a kernel panic, just a warning that processes are stuck for
more than 2m waiting for fs I/O).

So it looks like the deb9u11 update brought issues with the vbd
behavior; has anybody else experienced this?

The dom0 and domU kernels are linux-image-4.9.0-8-amd64 4.9.130-2.

Samuel


* Re: [Pkg-xen-devel] [admin] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-08 17:13 ` [admin] " Samuel Thibault
@ 2019-02-08 19:18   ` Hans van Kranenburg
  2019-02-08 23:16     ` [admin] [Pkg-xen-devel] " Samuel Thibault
  0 siblings, 1 reply; 16+ messages in thread
From: Hans van Kranenburg @ 2019-02-08 19:18 UTC (permalink / raw)
  To: Samuel Thibault, xen-devel, Debian Xen Team

Hi,

Upstream mailing list is at:
  xen-devel@lists.xenproject.org

Apparently,
  xen@packages.debian.org
results in a message to
  pkg-xen-devel@lists.alioth.debian.org

On 2/8/19 6:13 PM, Samuel Thibault wrote:
> 
> Sacha, on Fri, 08 Feb 2019 18:00:22 +0100, wrote:
>> On  Debian GNU/Linux 9.7 (stretch) amd64, we have a bug on the last Xen
>> Hypervisor version:
>>
>>     xen-hypervisor-4.8-amd64 4.8.5+shim4.10.2+xsa282
> 
> (Read: 4.8.5+shim4.10.2+xsa282-1+deb9u11)
> 
>> The rollback on the previous package version corrected the problem:
>>
>>     xen-hypervisor-4.8-amd64 4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10

Since this is the first message arriving about this in my inbox, can you
explain what "the problem" is?

> (Only the hypervisor needed to be downgraded to fix the issue)
> 
>> The errors are on the domU a frozen file system until a kernel panic.

Do you have a reproducible case that shows success with the previous Xen
hypervisor package and failure with the new one, while keeping all other
things the same?

This seems like an upstream thing, because for 4.8, the Debian package
updates are almost exclusively shipping upstream stable updates.

> (not really a kernel panic, just a warning that processes are stuck for
> more than 2m waiting for fs I/O).
> 
> So it looks like the deb9u11 update brought issues with the vbd
> behavior, did anybody experience this?
> 
> The dom0 and domU kernels are linux-image-4.9.0-8-amd64 4.9.130-2.

Hans


* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-08 19:18   ` [Pkg-xen-devel] " Hans van Kranenburg
@ 2019-02-08 23:16     ` Samuel Thibault
  2019-02-09 16:01       ` Hans van Kranenburg
  0 siblings, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2019-02-08 23:16 UTC (permalink / raw)
  To: admin, Hans van Kranenburg; +Cc: xen-devel, Samuel Thibault, Debian Xen Team

Hello,

Hans van Kranenburg, on Fri, 08 Feb 2019 20:18:44 +0100, wrote:
> Upstream mailing list is at:
>   xen-devel@lists.xenproject.org

Apparently it didn't get the mails; I guess that's because posting is
subscriber-only. I have now forwarded the mails.

> Apparently,
>   xen@packages.debian.org
> results in a message to
>   pkg-xen-devel@lists.alioth.debian.org

Yes, since that's the maintainer of the package.

> On 2/8/19 6:13 PM, Samuel Thibault wrote:
> > 
> > Sacha, on Fri, 08 Feb 2019 18:00:22 +0100, wrote:
> >> On  Debian GNU/Linux 9.7 (stretch) amd64, we have a bug on the last Xen
> >> Hypervisor version:
> >>
> >>     xen-hypervisor-4.8-amd64 4.8.5+shim4.10.2+xsa282
> > 
> > (Read: 4.8.5+shim4.10.2+xsa282-1+deb9u11)
> > 
> >> The rollback on the previous package version corrected the problem:
> >>
> >>     xen-hypervisor-4.8-amd64 4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10
> 
> Since this is the first message arriving about this in my inbox, can you
> explain what "the problem" is?

I have forwarded the original mail: all VM I/O get stuck, and thus the
VM becomes unusable.

> > (Only the hypervisor needed to be downgraded to fix the issue)
> > 
> >> The errors are on the domU a frozen file system until a kernel panic.
> 
> Do you have a reproducible case that shows success with the previous Xen
> hypervisor package and failure with the new one, while keeping all other
> things the same?

We have a production system which hangs within about a day. We
don't know what exactly triggers the issue.

> This seems like an upstream thing, because for 4.8, the Debian package
> updates are almost exclusively shipping upstream stable updates.

Ok.

Samuel


* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-08 23:16     ` [admin] [Pkg-xen-devel] " Samuel Thibault
@ 2019-02-09 16:01       ` Hans van Kranenburg
  2019-02-09 16:35         ` Samuel Thibault
  0 siblings, 1 reply; 16+ messages in thread
From: Hans van Kranenburg @ 2019-02-09 16:01 UTC (permalink / raw)
  To: Samuel Thibault, admin, Samuel Thibault, xen-devel, Debian Xen Team

Hi,

On 2/9/19 12:16 AM, Samuel Thibault wrote:
> 
> Hans van Kranenburg, on Fri, 08 Feb 2019 20:18:44 +0100, wrote:
>> [...]
>>
>> On 2/8/19 6:13 PM, Samuel Thibault wrote:
>>>
>>> Sacha, on Fri, 08 Feb 2019 18:00:22 +0100, wrote:
>>>> On  Debian GNU/Linux 9.7 (stretch) amd64, we have a bug on the last Xen
>>>> Hypervisor version:
>>>>
>>>>     xen-hypervisor-4.8-amd64 4.8.5+shim4.10.2+xsa282
>>>
>>> (Read: 4.8.5+shim4.10.2+xsa282-1+deb9u11)
>>>
>>>> The rollback on the previous package version corrected the problem:
>>>>
>>>>     xen-hypervisor-4.8-amd64 4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10
>>
>> Since this is the first message arriving about this in my inbox, can you
>> explain what "the problem" is?
> 
> I have forwarded the original mail: all VM I/O get stuck, and thus the
> VM becomes unusable.

These are in many cases the symptoms of running out of "grant frames".
So let's verify first if this is the case or not.

Your xen-utils-4.8 package contains a program at
/usr/lib/xen-4.8/bin/xen-diag that you can use in the dom0 to gather
information.

e.g.

  -# ./xen-diag  gnttab_query_size 5
  domid=5: nr_frames=11, max_nr_frames=32

If this nr_frames hits the max allowed, things will start stalling at random.
This does not have to happen directly after domU boot, but it likely
happens later, when disks/cpus are actually used. There is no useful
message/hint at all in the domU kernel (yet) about this when it happens.

Can you verify if this is happening?
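
If it helps, here is a minimal sketch to check all running domains in one go
(same xen-diag path as above; it assumes the usual "xl list" output with the
domain ID in the second column):

    #!/bin/sh
    # run as root in the dom0; prints nr_frames/max_nr_frames per domain
    XENDIAG=/usr/lib/xen-4.8/bin/xen-diag
    xl list | awk 'NR > 1 { print $2 }' | while read -r domid; do
        "$XENDIAG" gnttab_query_size "$domid"
    done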

With Xen 4.8, you can add gnttab_max_frames=64 (or another number, but
higher than the default 32) to the xen hypervisor command line and reboot.
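
On a stock Debian dom0 that would look roughly like this (the variable name
assumes the standard grub-xen integration; adjust if your setup differs):

    # /etc/default/grub on the dom0 -- keep any options you already set there
    GRUB_CMDLINE_XEN_DEFAULT="gnttab_max_frames=64"

    # then regenerate the grub configuration and reboot
    update-grub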

For Xen 4.11 which will be in Buster, the default is 64 and the way to
configure higher values/limits for dom0 and domU has changed. There
will be some text about this recurring problem in the README.Debian
under known issues.
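
(If I remember correctly, the per-domain knob there is max_grant_frames in
the xl domain configuration, roughly as below -- please double-check against
the 4.11 xl.cfg man page before relying on it:)

    # /etc/xen/example-domu.cfg   (hypothetical domU config file)
    max_grant_frames = 64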

>>> (Only the hypervisor needed to be downgraded to fix the issue)
>>>
>>>> The errors are on the domU a frozen file system until a kernel panic.
>>
>> Do you have a reproducible case that shows success with the previous Xen
>> hypervisor package and failure with the new one, while keeping all other
>> things the same?
> 
> We have a production system which gets to hang within about a day. We
> don't know what exactly triggers the issue.
> 
>> This seems like an upstream thing, because for 4.8, the Debian package
>> updates are almost exclusively shipping upstream stable updates.
> 
> Ok.

Related:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=880554

Hans


* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-09 16:01       ` Hans van Kranenburg
@ 2019-02-09 16:35         ` Samuel Thibault
  2019-02-11  1:37           ` Dongli Zhang
  0 siblings, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2019-02-09 16:35 UTC (permalink / raw)
  To: admin, Hans van Kranenburg
  Cc: xen-devel, Samuel Thibault, Samuel Thibault, Debian Xen Team

Hello,

Hans van Kranenburg, on Sat, 09 Feb 2019 17:01:55 +0100, wrote:
> > I have forwarded the original mail: all VM I/O get stuck, and thus the
> > VM becomes unusable.
> 
> These are in many cases the symptoms of running out of "grant frames".

Oh!  That could be it indeed.  I'm wondering what could be monopolizing
them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
afraid increasing the gnttab max size to 64 might just defer filling it
up.

>   -# ./xen-diag  gnttab_query_size 5
>   domid=5: nr_frames=11, max_nr_frames=32

The current value is indeed 31 out of the max of 32.

> With Xen 4.8, you can add gnttab_max_frames=64 (or another number, but
> higher than the default 32) to the xen hypervisor command line and reboot.

admin@: I made the modification in the grub config. We can probably try
to reboot with the newer hypervisor, and monitor that value.
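
For the monitoring part, something as simple as an /etc/cron.d entry
should do (sketch; domid 5 and the log path are assumptions):

  # record grant table usage for domid 5 every 10 minutes
  */10 * * * * root /usr/lib/xen-4.8/bin/xen-diag gnttab_query_size 5 >> /var/log/gnttab-dom5.log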

Samuel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-09 16:35         ` Samuel Thibault
@ 2019-02-11  1:37           ` Dongli Zhang
  2019-02-11  8:18             ` Samuel Thibault
  2019-02-11 21:59             ` Hans van Kranenburg
  0 siblings, 2 replies; 16+ messages in thread
From: Dongli Zhang @ 2019-02-11  1:37 UTC (permalink / raw)
  To: Samuel Thibault, admin, Hans van Kranenburg, Samuel Thibault,
	Samuel Thibault
  Cc: xen-devel, Debian Xen Team



On 2/10/19 12:35 AM, Samuel Thibault wrote:
> Hello,
> 
> Hans van Kranenburg, le sam. 09 févr. 2019 17:01:55 +0100, a ecrit:
>>> I have forwarded the original mail: all VM I/O get stuck, and thus the
>>> VM becomes unusable.
>>
>> These are in many cases the symptoms of running out of "grant frames".
> 
> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
> afraid increasing the gnttab max size to 32 might just defer filling it
> up.
> 
>>   -# ./xen-diag  gnttab_query_size 5
>>   domid=5: nr_frames=11, max_nr_frames=32
> 
> The current value is 31 over max 32 indeed.

Assuming this is grant v1, there are still 4096/8=512 grant references
available (32-31=1 frame available). I do not think the I/O hang can be
caused by a lack of grant entries.

If increasing the max frames to 64 does take effect, it is strange that
the I/O would hang while there were still 512 entries available.
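
For reference, the arithmetic behind those numbers, assuming 4 KiB
grant frames and 8-byte grant v1 entries (shell sketch):

  echo $(( 4096 / 8 ))          # 512 grant entries per frame
  echo $(( (32 - 31) * 512 ))   # 512 entries still usable below max_nr_frames=32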

Dongli Zhang

> 
>> With Xen 4.8, you can add gnttab_max_frames=64 (or another number, but
>> higher than the default 32) to the xen hypervisor command line and reboot.
> 
> admin@: I made the modification in the grub config. We can probably try
> to reboot with the newer hypervisor, and monitor that value.
> 
> Samuel
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-11  1:37           ` Dongli Zhang
@ 2019-02-11  8:18             ` Samuel Thibault
  2019-02-11 21:59             ` Hans van Kranenburg
  1 sibling, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2019-02-11  8:18 UTC (permalink / raw)
  To: admin, Dongli Zhang
  Cc: Hans van Kranenburg, xen-devel, Samuel Thibault, Samuel Thibault,
	Debian Xen Team

Hello,

Dongli Zhang, le lun. 11 févr. 2019 09:37:43 +0800, a ecrit:
> On 2/10/19 12:35 AM, Samuel Thibault wrote:
> > Hello,
> > 
> > Hans van Kranenburg, le sam. 09 févr. 2019 17:01:55 +0100, a ecrit:
> >>> I have forwarded the original mail: all VM I/O get stuck, and thus the
> >>> VM becomes unusable.
> >>
> >> These are in many cases the symptoms of running out of "grant frames".
> > 
> > Oh!  That could be it indeed.  I'm wondering what could be monopolizing
> > them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
> > afraid increasing the gnttab max size to 32 might just defer filling it
> > up.
> > 
> >>   -# ./xen-diag  gnttab_query_size 5
> >>   domid=5: nr_frames=11, max_nr_frames=32
> > 
> > The current value is 31 over max 32 indeed.
> 
> Assuming this is grant v1, there are still 4096/8=512 grant references available
> (32-31=1 frame available). I do not think the I/O hang can be affected by the
> lack of grant entry.
> 
> If to increase the max frame to 64 takes effect, it is weird why the I/O would
> hang when there are still 512 entries available.

After reboot with gnttab_max_frames=256, I could let the VM run for some
time, nr_frames is now 32.

Samuel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-11  1:37           ` Dongli Zhang
  2019-02-11  8:18             ` Samuel Thibault
@ 2019-02-11 21:59             ` Hans van Kranenburg
  2019-02-11 22:10               ` Samuel Thibault
  1 sibling, 1 reply; 16+ messages in thread
From: Hans van Kranenburg @ 2019-02-11 21:59 UTC (permalink / raw)
  To: Dongli Zhang, Samuel Thibault, admin; +Cc: xen-devel, Debian Xen Team

On 2/11/19 2:37 AM, Dongli Zhang wrote:
> 
> On 2/10/19 12:35 AM, Samuel Thibault wrote:
>>
>> Hans van Kranenburg, le sam. 09 févr. 2019 17:01:55 +0100, a ecrit:
>>>> I have forwarded the original mail: all VM I/O get stuck, and thus the
>>>> VM becomes unusable.
>>>
>>> These are in many cases the symptoms of running out of "grant frames".
>>
>> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
>> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
>> afraid increasing the gnttab max size to 32 might just defer filling it
>> up.
>>
>>>   -# ./xen-diag  gnttab_query_size 5
>>>   domid=5: nr_frames=11, max_nr_frames=32
>>
>> The current value is 31 over max 32 indeed.
> 
> Assuming this is grant v1, there are still 4096/8=512 grant references available
> (32-31=1 frame available). I do not think the I/O hang can be affected by the
> lack of grant entry.

I suspect that the 31 measurement was taken when the domU was not yet hanging.

> If to increase the max frame to 64 takes effect, it is weird why the I/O would
> hang when there are still 512 entries available.
> 
>>> With Xen 4.8, you can add gnttab_max_frames=64 (or another number, but
>>> higher than the default 32) to the xen hypervisor command line and reboot.
>>
>> admin@: I made the modification in the grub config. We can probably try
>> to reboot with the newer hypervisor, and monitor that value.

K

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-11 21:59             ` Hans van Kranenburg
@ 2019-02-11 22:10               ` Samuel Thibault
  2019-02-12  4:11                 ` Dongli Zhang
  0 siblings, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2019-02-11 22:10 UTC (permalink / raw)
  To: admin, Hans van Kranenburg; +Cc: Debian Xen Team, Dongli Zhang, xen-devel

Hans van Kranenburg, le lun. 11 févr. 2019 22:59:11 +0100, a ecrit:
> On 2/11/19 2:37 AM, Dongli Zhang wrote:
> > 
> > On 2/10/19 12:35 AM, Samuel Thibault wrote:
> >>
> >> Hans van Kranenburg, le sam. 09 févr. 2019 17:01:55 +0100, a ecrit:
> >>>> I have forwarded the original mail: all VM I/O get stuck, and thus the
> >>>> VM becomes unusable.
> >>>
> >>> These are in many cases the symptoms of running out of "grant frames".
> >>
> >> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
> >> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
> >> afraid increasing the gnttab max size to 32 might just defer filling it
> >> up.
> >>
> >>>   -# ./xen-diag  gnttab_query_size 5
> >>>   domid=5: nr_frames=11, max_nr_frames=32
> >>
> >> The current value is 31 over max 32 indeed.
> > 
> > Assuming this is grant v1, there are still 4096/8=512 grant references available
> > (32-31=1 frame available). I do not think the I/O hang can be affected by the
> > lack of grant entry.
> 
> I suspect that 31 measurement was taken when the domU was not hanging yet.

Indeed, I didn't have the hanging VM at hand.  I have looked again; it's
now at 33. We'll have to keep monitoring to check that it doesn't just
keep increasing.

Samuel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-11 22:10               ` Samuel Thibault
@ 2019-02-12  4:11                 ` Dongli Zhang
  2019-02-17 21:29                   ` Samuel Thibault
  0 siblings, 1 reply; 16+ messages in thread
From: Dongli Zhang @ 2019-02-12  4:11 UTC (permalink / raw)
  To: Samuel Thibault, admin, Hans van Kranenburg; +Cc: xen-devel, Debian Xen Team



On 02/12/2019 06:10 AM, Samuel Thibault wrote:
> Hans van Kranenburg, le lun. 11 févr. 2019 22:59:11 +0100, a ecrit:
>> On 2/11/19 2:37 AM, Dongli Zhang wrote:
>>>
>>> On 2/10/19 12:35 AM, Samuel Thibault wrote:
>>>>
>>>> Hans van Kranenburg, le sam. 09 févr. 2019 17:01:55 +0100, a ecrit:
>>>>>> I have forwarded the original mail: all VM I/O get stuck, and thus the
>>>>>> VM becomes unusable.
>>>>>
>>>>> These are in many cases the symptoms of running out of "grant frames".
>>>>
>>>> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
>>>> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
>>>> afraid increasing the gnttab max size to 32 might just defer filling it
>>>> up.
>>>>
>>>>>   -# ./xen-diag  gnttab_query_size 5
>>>>>   domid=5: nr_frames=11, max_nr_frames=32
>>>>
>>>> The current value is 31 over max 32 indeed.
>>>
>>> Assuming this is grant v1, there are still 4096/8=512 grant references available
>>> (32-31=1 frame available). I do not think the I/O hang can be affected by the
>>> lack of grant entry.
>>
>> I suspect that 31 measurement was taken when the domU was not hanging yet.
> 
> Indeed, I didn't have the hanging VM offhand.  I have looked again, it's
> now at 33. We'll have to monitor to check that it doesn't continue just
> increasing.

If the max used to be 32 and the current value is already 33, this indicates
the grant entries might have been used up in the past, before max_nr_frames
was tuned.

Dongli Zhang

> 
> Samuel
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-12  4:11                 ` Dongli Zhang
@ 2019-02-17 21:29                   ` Samuel Thibault
  2019-02-18  0:09                     ` Dongli Zhang
  0 siblings, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2019-02-17 21:29 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: Hans van Kranenburg, xen-devel, admin, Debian Xen Team

Hello,

Dongli Zhang, le mar. 12 févr. 2019 12:11:20 +0800, a ecrit:
> On 02/12/2019 06:10 AM, Samuel Thibault wrote:
> > Hans van Kranenburg, le lun. 11 févr. 2019 22:59:11 +0100, a ecrit:
> >> On 2/11/19 2:37 AM, Dongli Zhang wrote:
> >>>
> >>> On 2/10/19 12:35 AM, Samuel Thibault wrote:
> >>>>
> >>>> Hans van Kranenburg, le sam. 09 févr. 2019 17:01:55 +0100, a ecrit:
> >>>>>> I have forwarded the original mail: all VM I/O get stuck, and thus the
> >>>>>> VM becomes unusable.
> >>>>>
> >>>>> These are in many cases the symptoms of running out of "grant frames".
> >>>>
> >>>> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
> >>>> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
> >>>> afraid increasing the gnttab max size to 32 might just defer filling it
> >>>> up.
> >>>>
> >>>>>   -# ./xen-diag  gnttab_query_size 5
> >>>>>   domid=5: nr_frames=11, max_nr_frames=32
> >>>>
> >>>> The current value is 31 over max 32 indeed.
> >>>
> >>> Assuming this is grant v1, there are still 4096/8=512 grant references available
> >>> (32-31=1 frame available). I do not think the I/O hang can be affected by the
> >>> lack of grant entry.
> >>
> >> I suspect that 31 measurement was taken when the domU was not hanging yet.
> > 
> > Indeed, I didn't have the hanging VM offhand.  I have looked again, it's
> > now at 33. We'll have to monitor to check that it doesn't continue just
> > increasing.
> 
> If the max used to be 32 and the current is already 33, this indicates the grant
> entries might be used up in the past before the max_nr_frames is tuned.

The number seems to be going up by about one every day. So probably a
grant entry leak somewhere :/

Samuel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-17 21:29                   ` Samuel Thibault
@ 2019-02-18  0:09                     ` Dongli Zhang
  2019-02-19 23:49                       ` Samuel Thibault
  2019-06-01 19:47                         ` [Xen-devel] " Samuel Thibault
  0 siblings, 2 replies; 16+ messages in thread
From: Dongli Zhang @ 2019-02-18  0:09 UTC (permalink / raw)
  To: Samuel Thibault; +Cc: Hans van Kranenburg, xen-devel, Debian Xen Team



On 2/18/19 5:29 AM, Samuel Thibault wrote:
> Hello,
> 
> Dongli Zhang, le mar. 12 févr. 2019 12:11:20 +0800, a ecrit:
>> On 02/12/2019 06:10 AM, Samuel Thibault wrote:
>>> Hans van Kranenburg, le lun. 11 févr. 2019 22:59:11 +0100, a ecrit:
>>>> On 2/11/19 2:37 AM, Dongli Zhang wrote:
>>>>>
>>>>> On 2/10/19 12:35 AM, Samuel Thibault wrote:
>>>>>>
>>>>>> Hans van Kranenburg, le sam. 09 févr. 2019 17:01:55 +0100, a ecrit:
>>>>>>>> I have forwarded the original mail: all VM I/O get stuck, and thus the
>>>>>>>> VM becomes unusable.
>>>>>>>
>>>>>>> These are in many cases the symptoms of running out of "grant frames".
>>>>>>
>>>>>> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
>>>>>> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
>>>>>> afraid increasing the gnttab max size to 32 might just defer filling it
>>>>>> up.
>>>>>>
>>>>>>>   -# ./xen-diag  gnttab_query_size 5
>>>>>>>   domid=5: nr_frames=11, max_nr_frames=32
>>>>>>
>>>>>> The current value is 31 over max 32 indeed.
>>>>>
>>>>> Assuming this is grant v1, there are still 4096/8=512 grant references available
>>>>> (32-31=1 frame available). I do not think the I/O hang can be affected by the
>>>>> lack of grant entry.
>>>>
>>>> I suspect that 31 measurement was taken when the domU was not hanging yet.
>>>
>>> Indeed, I didn't have the hanging VM offhand.  I have looked again, it's
>>> now at 33. We'll have to monitor to check that it doesn't continue just
>>> increasing.
>>
>> If the max used to be 32 and the current is already 33, this indicates the grant
>> entries might be used up in the past before the max_nr_frames is tuned.
> 
> The number seems to be going up by about one every day. So probably a
> grant entry leak somewhere :/

This might not be a grant leak. The block PV driver can hold persistent
grants for a long time.

Juergen has introduced a feature to reclaim the stale grants.


blkfront since a46b53672b2c2e3770b38a4abf90d16364d2584b

blkback since 973e5405f2f67ddbb2bf07b3ffc71908a37fea8e
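
To see which upstream release first shipped them, one can run in a
Linux git tree e.g. (sketch):

  git describe --contains a46b53672b2c2e3770b38a4abf90d16364d2584b
  git describe --contains 973e5405f2f67ddbb2bf07b3ffc71908a37fea8e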

Dongli Zhang

> 
> Samuel
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
  2019-02-18  0:09                     ` Dongli Zhang
@ 2019-02-19 23:49                       ` Samuel Thibault
  2019-06-01 19:47                         ` [Xen-devel] " Samuel Thibault
  1 sibling, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2019-02-19 23:49 UTC (permalink / raw)
  To: admin, Dongli Zhang; +Cc: Hans van Kranenburg, xen-devel, Debian Xen Team

Dongli Zhang, le lun. 18 févr. 2019 08:09:56 +0800, a ecrit:
> 
> 
> On 2/18/19 5:29 AM, Samuel Thibault wrote:
> > Hello,
> > 
> > Dongli Zhang, le mar. 12 févr. 2019 12:11:20 +0800, a ecrit:
> >> On 02/12/2019 06:10 AM, Samuel Thibault wrote:
> >>> Hans van Kranenburg, le lun. 11 févr. 2019 22:59:11 +0100, a ecrit:
> >>>> On 2/11/19 2:37 AM, Dongli Zhang wrote:
> >>>>>
> >>>>> On 2/10/19 12:35 AM, Samuel Thibault wrote:
> >>>>>>
> >>>>>> Hans van Kranenburg, le sam. 09 févr. 2019 17:01:55 +0100, a ecrit:
> >>>>>>>> I have forwarded the original mail: all VM I/O get stuck, and thus the
> >>>>>>>> VM becomes unusable.
> >>>>>>>
> >>>>>>> These are in many cases the symptoms of running out of "grant frames".
> >>>>>>
> >>>>>> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
> >>>>>> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
> >>>>>> afraid increasing the gnttab max size to 32 might just defer filling it
> >>>>>> up.
> >>>>>>
> >>>>>>>   -# ./xen-diag  gnttab_query_size 5
> >>>>>>>   domid=5: nr_frames=11, max_nr_frames=32
> >>>>>>
> >>>>>> The current value is 31 over max 32 indeed.
> >>>>>
> >>>>> Assuming this is grant v1, there are still 4096/8=512 grant references available
> >>>>> (32-31=1 frame available). I do not think the I/O hang can be affected by the
> >>>>> lack of grant entry.
> >>>>
> >>>> I suspect that 31 measurement was taken when the domU was not hanging yet.
> >>>
> >>> Indeed, I didn't have the hanging VM offhand.  I have looked again, it's
> >>> now at 33. We'll have to monitor to check that it doesn't continue just
> >>> increasing.
> >>
> >> If the max used to be 32 and the current is already 33, this indicates the grant
> >> entries might be used up in the past before the max_nr_frames is tuned.
> > 
> > The number seems to be going up by about one every day. So probably a
> > grant entry leak somewhere :/
> 
> This might not be a grant leak. The block pv driver would hold the persistent
> grant for a long time.
> 
> Juergen has introduced the feature to reclaim the stale grants.
> 
> blkfront since a46b53672b2c2e3770b38a4abf90d16364d2584b
> 
> blkback since 973e5405f2f67ddbb2bf07b3ffc71908a37fea8e

Ok, that hasn't reached Debian Stretch yet :/

Let's keep monitoring for now.

Samuel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
@ 2019-06-01 19:47                         ` Samuel Thibault
  0 siblings, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2019-06-01 19:47 UTC (permalink / raw)
  To: admin, Dongli Zhang; +Cc: Hans van Kranenburg, xen-devel, Debian Xen Team

Hello,

Dongli Zhang, le lun. 18 févr. 2019 08:09:56 +0800, a ecrit:
> On 2/18/19 5:29 AM, Samuel Thibault wrote:
> >>>>> On 2/10/19 12:35 AM, Samuel Thibault wrote:
> >>>>>> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
> >>>>>> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
> >>>>>> afraid increasing the gnttab max size to 32 might just defer filling it
> >>>>>> up.
> >>>>>>
> >>>>>>>   -# ./xen-diag  gnttab_query_size 5
> >>>>>>>   domid=5: nr_frames=11, max_nr_frames=32
> >>>>>>
...
> > The number seems to be going up by about one every day. So probably a
> > grant entry leak somewhere :/
> 
> This might not be a grant leak. The block pv driver would hold the persistent
> grant for a long time.

Just to give an update to close the thread: after a VM uptime of 111
days, nr_frames is at 41, so it looks like we don't have a leak, just a
busy VM :)

Samuel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [admin] [Pkg-xen-devel] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.
@ 2019-06-01 19:47                         ` Samuel Thibault
  0 siblings, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2019-06-01 19:47 UTC (permalink / raw)
  To: admin, Dongli Zhang; +Cc: Hans van Kranenburg, xen-devel, Debian Xen Team

Hello,

Dongli Zhang, le lun. 18 févr. 2019 08:09:56 +0800, a ecrit:
> On 2/18/19 5:29 AM, Samuel Thibault wrote:
> >>>>> On 2/10/19 12:35 AM, Samuel Thibault wrote:
> >>>>>> Oh!  That could be it indeed.  I'm wondering what could be monopolizing
> >>>>>> them, though, and why +deb9u11 is affected while +deb9u10 is not.  I'm
> >>>>>> afraid increasing the gnttab max size to 32 might just defer filling it
> >>>>>> up.
> >>>>>>
> >>>>>>>   -# ./xen-diag  gnttab_query_size 5
> >>>>>>>   domid=5: nr_frames=11, max_nr_frames=32
> >>>>>>
...
> > The number seems to be going up by about one every day. So probably a
> > grant entry leak somewhere :/
> 
> This might not be a grant leak. The block pv driver would hold the persistent
> grant for a long time.

Just to give an update to close the thread: after a VM uptime of 111
days, nr_frames is at 41, so it looks like we don't have a leak, just a
busy VM :)

Samuel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-06-01 19:47 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-08 17:00 [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds Sacha
2019-02-08 17:13 ` [admin] " Samuel Thibault
2019-02-08 19:18   ` [Pkg-xen-devel] " Hans van Kranenburg
2019-02-08 23:16     ` [admin] [Pkg-xen-devel] " Samuel Thibault
2019-02-09 16:01       ` Hans van Kranenburg
2019-02-09 16:35         ` Samuel Thibault
2019-02-11  1:37           ` Dongli Zhang
2019-02-11  8:18             ` Samuel Thibault
2019-02-11 21:59             ` Hans van Kranenburg
2019-02-11 22:10               ` Samuel Thibault
2019-02-12  4:11                 ` Dongli Zhang
2019-02-17 21:29                   ` Samuel Thibault
2019-02-18  0:09                     ` Dongli Zhang
2019-02-19 23:49                       ` Samuel Thibault
2019-06-01 19:47                       ` Samuel Thibault
2019-06-01 19:47                         ` [Xen-devel] " Samuel Thibault
