All of lore.kernel.org
 help / color / mirror / Atom feed
* blk-mq breaks suspend even with runtime PM patch
@ 2017-07-29 15:27 Oleksandr Natalenko
  2017-07-29 21:17   ` Oleksandr Natalenko
  2017-07-30  5:12   ` Mike Galbraith
  0 siblings, 2 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-07-29 15:27 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-block, linux-kernel

Hello Jens, Christoph.

Unfortunately, even with "block: disable runtime-pm for blk-mq" patch applied 
blk-mq breaks suspend to RAM for me. It is reproducible on my laptop as well 
as in a VM.

I use complex disk layout involving MD, LUKS and LVM, and managed to get these 
warnings from VM via serial console when suspend fails:

===
[  245.516573] INFO: task kworker/0:1:49 blocked for more than 120 seconds.
[  245.520025]       Not tainted 4.12.0-pf4 #1
[  245.521836] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[  245.525612] kworker/0:1     D    0    49      2 0x00000000
[  245.527515] Workqueue: events vmstat_shepherd
[  245.528685] Call Trace:
[  245.529296]  __schedule+0x459/0xe40
[  245.530115]  ? kvm_clock_read+0x25/0x40
[  245.531003]  ? ktime_get+0x40/0xa0
[  245.531819]  schedule+0x3d/0xb0
[  245.532542]  ? schedule+0x3d/0xb0
[  245.533299]  schedule_preempt_disabled+0x15/0x20
[  245.534367]  __mutex_lock.isra.5+0x295/0x530
[  245.535351]  __mutex_lock_slowpath+0x13/0x20
[  245.536362]  ? __mutex_lock_slowpath+0x13/0x20
[  245.537334]  mutex_lock+0x25/0x30
[  245.538118]  get_online_cpus.part.14+0x15/0x30
[  245.539588]  get_online_cpus+0x20/0x30
[  245.540560]  vmstat_shepherd+0x21/0xc0
[  245.541538]  process_one_work+0x1de/0x430
[  245.542364]  worker_thread+0x47/0x3f0
[  245.543042]  kthread+0x125/0x140
[  245.543649]  ? process_one_work+0x430/0x430
[  245.544417]  ? kthread_create_on_node+0x70/0x70
[  245.545737]  ret_from_fork+0x25/0x30
[  245.546490] INFO: task md0_raid10:459 blocked for more than 120 seconds.
[  245.547668]       Not tainted 4.12.0-pf4 #1
[  245.548769] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[  245.550133] md0_raid10      D    0   459      2 0x00000000
[  245.551092] Call Trace:
[  245.551539]  __schedule+0x459/0xe40
[  245.552163]  schedule+0x3d/0xb0
[  245.552728]  ? schedule+0x3d/0xb0
[  245.553344]  md_super_wait+0x6e/0xa0 [md_mod]
[  245.554118]  ? wake_bit_function+0x60/0x60
[  245.554854]  md_update_sb.part.60+0x3df/0x840 [md_mod]
[  245.555771]  md_check_recovery+0x215/0x4b0 [md_mod]
[  245.556732]  raid10d+0x62/0x13c0 [raid10]
[  245.557456]  ? schedule+0x3d/0xb0
[  245.558169]  ? schedule+0x3d/0xb0
[  245.558803]  ? schedule_timeout+0x21f/0x330
[  245.559593]  md_thread+0x120/0x160 [md_mod]
[  245.560380]  ? md_thread+0x120/0x160 [md_mod]
[  245.561202]  ? wake_bit_function+0x60/0x60
[  245.561975]  kthread+0x125/0x140
[  245.562601]  ? find_pers+0x70/0x70 [md_mod]
[  245.563394]  ? kthread_create_on_node+0x70/0x70
[  245.564516]  ret_from_fork+0x25/0x30
[  245.565669] INFO: task dmcrypt_write:487 blocked for more than 120 seconds.
[  245.567151]       Not tainted 4.12.0-pf4 #1
[  245.567946] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[  245.569390] dmcrypt_write   D    0   487      2 0x00000000
[  245.570352] Call Trace:
[  245.570801]  __schedule+0x459/0xe40
[  245.571601]  ? preempt_schedule_common+0x1f/0x30
[  245.572421]  ? ___preempt_schedule+0x16/0x18
[  245.573168]  schedule+0x3d/0xb0
[  245.573733]  ? schedule+0x3d/0xb0
[  245.574378]  md_write_start+0xe3/0x270 [md_mod]
[  245.575180]  ? wake_bit_function+0x60/0x60
[  245.575915]  raid10_make_request+0x3f/0x140 [raid10]
[  245.576827]  md_make_request+0xa9/0x2a0 [md_mod]
[  245.577659]  generic_make_request+0x11e/0x2f0
[  245.578464]  dmcrypt_write+0x22d/0x250 [dm_crypt]
[  245.579288]  ? dmcrypt_write+0x22d/0x250 [dm_crypt]
[  245.580183]  ? wake_up_process+0x20/0x20
[  245.580875]  kthread+0x125/0x140
[  245.581455]  ? kthread+0x125/0x140
[  245.582064]  ? crypt_iv_essiv_dtr+0x70/0x70 [dm_crypt]
[  245.582971]  ? kthread_create_on_node+0x70/0x70
[  245.583772]  ret_from_fork+0x25/0x30
[  245.584424] INFO: task btrfs-transacti:553 blocked for more than 120 
seconds.
[  245.585830]       Not tainted 4.12.0-pf4 #1
[  245.586596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[  245.587947] btrfs-transacti D    0   553      2 0x00000000
[  245.588922] Call Trace:
[  245.589377]  __schedule+0x459/0xe40
[  245.590062]  ? btrfs_submit_bio_hook+0x8c/0x180 [btrfs]
[  245.590976]  schedule+0x3d/0xb0
[  245.591538]  ? schedule+0x3d/0xb0
[  245.592175]  io_schedule+0x16/0x40
[  245.592794]  wait_on_page_bit_common+0xe3/0x180
[  245.593589]  ? page_cache_tree_insert+0xc0/0xc0
[  245.594387]  __filemap_fdatawait_range+0x12a/0x1a0
[  245.595229]  filemap_fdatawait_range+0x14/0x30
[  245.596033]  btrfs_wait_ordered_range+0x6b/0x110 [btrfs]
[  245.596998]  ? _raw_spin_unlock+0x10/0x30
[  245.597735]  __btrfs_wait_cache_io+0x45/0x1c0 [btrfs]
[  245.599091]  btrfs_wait_cache_io+0x2c/0x30 [btrfs]
[  245.600789]  btrfs_write_dirty_block_groups+0x206/0x390 [btrfs]
[  245.602068]  commit_cowonly_roots+0x221/0x2c0 [btrfs]
[  245.602981]  btrfs_commit_transaction+0x3c4/0x900 [btrfs]
[  245.603940]  transaction_kthread+0x190/0x1c0 [btrfs]
[  245.604806]  kthread+0x125/0x140
[  245.605398]  ? btrfs_cleanup_transaction+0x500/0x500 [btrfs]
[  245.606414]  ? kthread_create_on_node+0x70/0x70
[  245.607213]  ret_from_fork+0x25/0x30
[  245.607846] INFO: task systemd-sleep:677 blocked for more than 120 seconds.
[  245.609061]       Not tainted 4.12.0-pf4 #1
[  245.609858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[  245.611202] systemd-sleep   D    0   677      1 0x00000000
[  245.612160] Call Trace:
[  245.612604]  __schedule+0x459/0xe40
[  245.613231]  ? blk_mq_queue_reinit_work+0x100/0x100
[  245.614080]  schedule+0x3d/0xb0
[  245.614637]  ? schedule+0x3d/0xb0
[  245.615222]  blk_mq_freeze_queue_wait+0x46/0xb0
[  245.616015]  ? wake_bit_function+0x60/0x60
[  245.616778]  blk_mq_queue_reinit_work+0x6e/0x100
[  245.617592]  blk_mq_queue_reinit_dead+0x3d/0x50
[  245.618420]  ? __flow_cache_shrink+0x170/0x170
[  245.619199]  cpuhp_invoke_callback+0x84/0x430
[  245.619965]  ? cpuhp_invoke_callback+0x84/0x430
[  245.620754]  ? __flow_cache_shrink+0x170/0x170
[  245.621530]  cpuhp_down_callbacks+0x42/0x80
[  245.622267]  _cpu_down+0xc5/0x100
[  245.622856]  freeze_secondary_cpus+0x9e/0x230
[  245.623629]  suspend_devices_and_enter+0x2e6/0x800
[  245.624471]  pm_suspend+0x349/0x3c0
[  245.625090]  state_store+0x49/0xa0
[  245.625699]  kobj_attr_store+0xf/0x20
[  245.626379]  sysfs_kf_write+0x37/0x40
[  245.627038]  kernfs_fop_write+0x11c/0x1a0
[  245.627800]  __vfs_write+0x37/0x140
[  245.628448]  ? handle_mm_fault+0xde/0x220
[  245.629155]  vfs_write+0xb1/0x1a0
[  245.629745]  SyS_write+0x55/0xc0
[  245.630337]  ? trace_do_page_fault+0x37/0xf0
[  245.631088]  entry_SYSCALL_64_fastpath+0x1a/0xa5
[  245.632109] RIP: 0033:0x7f810459dbf0
[  245.632821] RSP: 002b:00007ffec9ba2e58 EFLAGS: 00000246 ORIG_RAX: 
0000000000000001
[  245.634127] RAX: ffffffffffffffda RBX: 00000000000000c6 RCX: 
00007f810459dbf0
[  245.635353] RDX: 0000000000000004 RSI: 000000b716c73390 RDI: 
0000000000000004
[  245.636778] RBP: 000000000000270f R08: 000000b716c73240 R09: 
00007f8104a718c0
[  245.638254] R10: 00007f810485ead8 R11: 0000000000000246 R12: 
00007f810485ead8
[  245.639500] R13: 0000000000001010 R14: 000000b716c73380 R15: 
00007f810485ea80
===

With blk-mq disabled everything works okay.

How can I help you in solving this issue?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-07-29 15:27 blk-mq breaks suspend even with runtime PM patch Oleksandr Natalenko
@ 2017-07-29 21:17   ` Oleksandr Natalenko
  2017-07-30  5:12   ` Mike Galbraith
  1 sibling, 0 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-07-29 21:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-block, linux-kernel

Recompiled kernel with lockdep enabled gives me this:

=3D=3D=3D
[  368.655051] Showing all locks held in the system:
[  368.656387] 1 lock held by khungtaskd/37:
[  368.657171]  #0:  (tasklist_lock){.+.+..}, at: [<ffffffffa90c801d>]=20
debug_show_all_locks+0x3d/0x1a0
[  368.658725] 1 lock held by md0_raid10/458:
[  368.659455]  #0:  (&mddev->reconfig_mutex){+.+.+.}, at:=20
[<ffffffffc036790f>] md_check_recovery+0xaf/0x4d0 [md_mod]
[  368.661403] 3 locks held by btrfs-transacti/550:
[  368.662754]  #0:  (&fs_info->transaction_kthread_mutex){+.+...}, at:=20
[<ffffffffc03eeb79>] transaction_kthread+0x69/0x1c0 [btrfs]
[  368.664797]  #1:  (&fs_info->reloc_mutex){+.+...}, at: [<ffffffffc03f457=
1>]=20
btrfs_commit_transaction+0x2e1/0x9b0 [btrfs]
[  368.666669]  #2:  (&fs_info->tree_log_mutex){+.+...}, at:=20
[<ffffffffc03f45e1>] btrfs_commit_transaction+0x351/0x9b0 [btrfs]
[  368.668644] 4 locks held by kworker/0:2/888:
[  368.669384]  #0:  ("events"){.+.+.+}, at: [<ffffffffa90abd0b>]=20
process_one_work+0x1fb/0x6e0
[  368.670916]  #1:  ((shepherd).work){+.+...}, at: [<ffffffffa90abd0b>]=20
process_one_work+0x1fb/0x6e0
[  368.672592]  #2:  (cpu_hotplug.dep_map){++++++}, at: [<ffffffffa908c245>=
]=20
get_online_cpus.part.14+0x5/0x50
[  368.674742]  #3:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffa908c27a>]=20
get_online_cpus.part.14+0x3a/0x50
[  368.677494] 10 locks held by systemd-sleep/889:
[  368.678650]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffffa924779b>]=20
vfs_write+0x17b/0x1a0
[  368.680483]  #1:  (&of->mutex){+.+.+.}, at: [<ffffffffa92db703>]=20
kernfs_fop_write+0x123/0x1e0
[  368.682412]  #2:  (s_active#257){.+.+.+}, at: [<ffffffffa92db70c>]=20
kernfs_fop_write+0x12c/0x1e0
[  368.684440]  #3:  (autosleep_lock){+.+.+.}, at: [<ffffffffa90da3b7>]=20
pm_autosleep_lock+0x17/0x20
[  368.686707]  #4:  (pm_mutex){+.+.+.}, at: [<ffffffffa90d28f8>] pm_suspend
+0x88/0x490
[  368.688086]  #5:  (acpi_scan_lock){+.+.+.}, at: [<ffffffffa941c907>]=20
acpi_scan_lock_acquire+0x17/0x20
[  368.690213]  #6:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffffa908e160>=
]=20
freeze_secondary_cpus+0x30/0x3c0
[  368.692016]  #7:  (cpu_hotplug.dep_map){++++++}, at: [<ffffffffa908da75>=
]=20
cpu_hotplug_begin+0x5/0xe0
[  368.694347]  #8:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffa908daf3>]=20
cpu_hotplug_begin+0x83/0xe0
[  368.696010]  #9:  (all_q_mutex){+.+...}, at: [<ffffffffa933e05a>]=20
blk_mq_queue_reinit_work+0x1a/0x110
[  368.698624]=20
[  368.698990] =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
[  368.698990]
=3D=3D=3D

Deadlock with CPU hotplug?

On sobota 29. =C4=8Dervence 2017 17:27:41 CEST Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
>=20
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch
> applied blk-mq breaks suspend to RAM for me. It is reproducible on my
> laptop as well as in a VM.
>=20
> I use complex disk layout involving MD, LUKS and LVM, and managed to get
> these warnings from VM via serial console when suspend fails:
>=20
> =3D=3D=3D
> [  245.516573] INFO: task kworker/0:1:49 blocked for more than 120 second=
s.
> [  245.520025]       Not tainted 4.12.0-pf4 #1
> [  245.521836] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.525612] kworker/0:1     D    0    49      2 0x00000000
> [  245.527515] Workqueue: events vmstat_shepherd
> [  245.528685] Call Trace:
> [  245.529296]  __schedule+0x459/0xe40
> [  245.530115]  ? kvm_clock_read+0x25/0x40
> [  245.531003]  ? ktime_get+0x40/0xa0
> [  245.531819]  schedule+0x3d/0xb0
> [  245.532542]  ? schedule+0x3d/0xb0
> [  245.533299]  schedule_preempt_disabled+0x15/0x20
> [  245.534367]  __mutex_lock.isra.5+0x295/0x530
> [  245.535351]  __mutex_lock_slowpath+0x13/0x20
> [  245.536362]  ? __mutex_lock_slowpath+0x13/0x20
> [  245.537334]  mutex_lock+0x25/0x30
> [  245.538118]  get_online_cpus.part.14+0x15/0x30
> [  245.539588]  get_online_cpus+0x20/0x30
> [  245.540560]  vmstat_shepherd+0x21/0xc0
> [  245.541538]  process_one_work+0x1de/0x430
> [  245.542364]  worker_thread+0x47/0x3f0
> [  245.543042]  kthread+0x125/0x140
> [  245.543649]  ? process_one_work+0x430/0x430
> [  245.544417]  ? kthread_create_on_node+0x70/0x70
> [  245.545737]  ret_from_fork+0x25/0x30
> [  245.546490] INFO: task md0_raid10:459 blocked for more than 120 second=
s.
> [  245.547668]       Not tainted 4.12.0-pf4 #1
> [  245.548769] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.550133] md0_raid10      D    0   459      2 0x00000000
> [  245.551092] Call Trace:
> [  245.551539]  __schedule+0x459/0xe40
> [  245.552163]  schedule+0x3d/0xb0
> [  245.552728]  ? schedule+0x3d/0xb0
> [  245.553344]  md_super_wait+0x6e/0xa0 [md_mod]
> [  245.554118]  ? wake_bit_function+0x60/0x60
> [  245.554854]  md_update_sb.part.60+0x3df/0x840 [md_mod]
> [  245.555771]  md_check_recovery+0x215/0x4b0 [md_mod]
> [  245.556732]  raid10d+0x62/0x13c0 [raid10]
> [  245.557456]  ? schedule+0x3d/0xb0
> [  245.558169]  ? schedule+0x3d/0xb0
> [  245.558803]  ? schedule_timeout+0x21f/0x330
> [  245.559593]  md_thread+0x120/0x160 [md_mod]
> [  245.560380]  ? md_thread+0x120/0x160 [md_mod]
> [  245.561202]  ? wake_bit_function+0x60/0x60
> [  245.561975]  kthread+0x125/0x140
> [  245.562601]  ? find_pers+0x70/0x70 [md_mod]
> [  245.563394]  ? kthread_create_on_node+0x70/0x70
> [  245.564516]  ret_from_fork+0x25/0x30
> [  245.565669] INFO: task dmcrypt_write:487 blocked for more than 120
> seconds. [  245.567151]       Not tainted 4.12.0-pf4 #1
> [  245.567946] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.569390] dmcrypt_write   D    0   487      2 0x00000000
> [  245.570352] Call Trace:
> [  245.570801]  __schedule+0x459/0xe40
> [  245.571601]  ? preempt_schedule_common+0x1f/0x30
> [  245.572421]  ? ___preempt_schedule+0x16/0x18
> [  245.573168]  schedule+0x3d/0xb0
> [  245.573733]  ? schedule+0x3d/0xb0
> [  245.574378]  md_write_start+0xe3/0x270 [md_mod]
> [  245.575180]  ? wake_bit_function+0x60/0x60
> [  245.575915]  raid10_make_request+0x3f/0x140 [raid10]
> [  245.576827]  md_make_request+0xa9/0x2a0 [md_mod]
> [  245.577659]  generic_make_request+0x11e/0x2f0
> [  245.578464]  dmcrypt_write+0x22d/0x250 [dm_crypt]
> [  245.579288]  ? dmcrypt_write+0x22d/0x250 [dm_crypt]
> [  245.580183]  ? wake_up_process+0x20/0x20
> [  245.580875]  kthread+0x125/0x140
> [  245.581455]  ? kthread+0x125/0x140
> [  245.582064]  ? crypt_iv_essiv_dtr+0x70/0x70 [dm_crypt]
> [  245.582971]  ? kthread_create_on_node+0x70/0x70
> [  245.583772]  ret_from_fork+0x25/0x30
> [  245.584424] INFO: task btrfs-transacti:553 blocked for more than 120
> seconds.
> [  245.585830]       Not tainted 4.12.0-pf4 #1
> [  245.586596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.587947] btrfs-transacti D    0   553      2 0x00000000
> [  245.588922] Call Trace:
> [  245.589377]  __schedule+0x459/0xe40
> [  245.590062]  ? btrfs_submit_bio_hook+0x8c/0x180 [btrfs]
> [  245.590976]  schedule+0x3d/0xb0
> [  245.591538]  ? schedule+0x3d/0xb0
> [  245.592175]  io_schedule+0x16/0x40
> [  245.592794]  wait_on_page_bit_common+0xe3/0x180
> [  245.593589]  ? page_cache_tree_insert+0xc0/0xc0
> [  245.594387]  __filemap_fdatawait_range+0x12a/0x1a0
> [  245.595229]  filemap_fdatawait_range+0x14/0x30
> [  245.596033]  btrfs_wait_ordered_range+0x6b/0x110 [btrfs]
> [  245.596998]  ? _raw_spin_unlock+0x10/0x30
> [  245.597735]  __btrfs_wait_cache_io+0x45/0x1c0 [btrfs]
> [  245.599091]  btrfs_wait_cache_io+0x2c/0x30 [btrfs]
> [  245.600789]  btrfs_write_dirty_block_groups+0x206/0x390 [btrfs]
> [  245.602068]  commit_cowonly_roots+0x221/0x2c0 [btrfs]
> [  245.602981]  btrfs_commit_transaction+0x3c4/0x900 [btrfs]
> [  245.603940]  transaction_kthread+0x190/0x1c0 [btrfs]
> [  245.604806]  kthread+0x125/0x140
> [  245.605398]  ? btrfs_cleanup_transaction+0x500/0x500 [btrfs]
> [  245.606414]  ? kthread_create_on_node+0x70/0x70
> [  245.607213]  ret_from_fork+0x25/0x30
> [  245.607846] INFO: task systemd-sleep:677 blocked for more than 120
> seconds. [  245.609061]       Not tainted 4.12.0-pf4 #1
> [  245.609858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.611202] systemd-sleep   D    0   677      1 0x00000000
> [  245.612160] Call Trace:
> [  245.612604]  __schedule+0x459/0xe40
> [  245.613231]  ? blk_mq_queue_reinit_work+0x100/0x100
> [  245.614080]  schedule+0x3d/0xb0
> [  245.614637]  ? schedule+0x3d/0xb0
> [  245.615222]  blk_mq_freeze_queue_wait+0x46/0xb0
> [  245.616015]  ? wake_bit_function+0x60/0x60
> [  245.616778]  blk_mq_queue_reinit_work+0x6e/0x100
> [  245.617592]  blk_mq_queue_reinit_dead+0x3d/0x50
> [  245.618420]  ? __flow_cache_shrink+0x170/0x170
> [  245.619199]  cpuhp_invoke_callback+0x84/0x430
> [  245.619965]  ? cpuhp_invoke_callback+0x84/0x430
> [  245.620754]  ? __flow_cache_shrink+0x170/0x170
> [  245.621530]  cpuhp_down_callbacks+0x42/0x80
> [  245.622267]  _cpu_down+0xc5/0x100
> [  245.622856]  freeze_secondary_cpus+0x9e/0x230
> [  245.623629]  suspend_devices_and_enter+0x2e6/0x800
> [  245.624471]  pm_suspend+0x349/0x3c0
> [  245.625090]  state_store+0x49/0xa0
> [  245.625699]  kobj_attr_store+0xf/0x20
> [  245.626379]  sysfs_kf_write+0x37/0x40
> [  245.627038]  kernfs_fop_write+0x11c/0x1a0
> [  245.627800]  __vfs_write+0x37/0x140
> [  245.628448]  ? handle_mm_fault+0xde/0x220
> [  245.629155]  vfs_write+0xb1/0x1a0
> [  245.629745]  SyS_write+0x55/0xc0
> [  245.630337]  ? trace_do_page_fault+0x37/0xf0
> [  245.631088]  entry_SYSCALL_64_fastpath+0x1a/0xa5
> [  245.632109] RIP: 0033:0x7f810459dbf0
> [  245.632821] RSP: 002b:00007ffec9ba2e58 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000001
> [  245.634127] RAX: ffffffffffffffda RBX: 00000000000000c6 RCX:
> 00007f810459dbf0
> [  245.635353] RDX: 0000000000000004 RSI: 000000b716c73390 RDI:
> 0000000000000004
> [  245.636778] RBP: 000000000000270f R08: 000000b716c73240 R09:
> 00007f8104a718c0
> [  245.638254] R10: 00007f810485ead8 R11: 0000000000000246 R12:
> 00007f810485ead8
> [  245.639500] R13: 0000000000001010 R14: 000000b716c73380 R15:
> 00007f810485ea80
> =3D=3D=3D
>=20
> With blk-mq disabled everything works okay.
>=20
> How can I help you in solving this issue?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
@ 2017-07-29 21:17   ` Oleksandr Natalenko
  0 siblings, 0 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-07-29 21:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-block, linux-kernel

Recompiled kernel with lockdep enabled gives me this:

===
[  368.655051] Showing all locks held in the system:
[  368.656387] 1 lock held by khungtaskd/37:
[  368.657171]  #0:  (tasklist_lock){.+.+..}, at: [<ffffffffa90c801d>] 
debug_show_all_locks+0x3d/0x1a0
[  368.658725] 1 lock held by md0_raid10/458:
[  368.659455]  #0:  (&mddev->reconfig_mutex){+.+.+.}, at: 
[<ffffffffc036790f>] md_check_recovery+0xaf/0x4d0 [md_mod]
[  368.661403] 3 locks held by btrfs-transacti/550:
[  368.662754]  #0:  (&fs_info->transaction_kthread_mutex){+.+...}, at: 
[<ffffffffc03eeb79>] transaction_kthread+0x69/0x1c0 [btrfs]
[  368.664797]  #1:  (&fs_info->reloc_mutex){+.+...}, at: [<ffffffffc03f4571>] 
btrfs_commit_transaction+0x2e1/0x9b0 [btrfs]
[  368.666669]  #2:  (&fs_info->tree_log_mutex){+.+...}, at: 
[<ffffffffc03f45e1>] btrfs_commit_transaction+0x351/0x9b0 [btrfs]
[  368.668644] 4 locks held by kworker/0:2/888:
[  368.669384]  #0:  ("events"){.+.+.+}, at: [<ffffffffa90abd0b>] 
process_one_work+0x1fb/0x6e0
[  368.670916]  #1:  ((shepherd).work){+.+...}, at: [<ffffffffa90abd0b>] 
process_one_work+0x1fb/0x6e0
[  368.672592]  #2:  (cpu_hotplug.dep_map){++++++}, at: [<ffffffffa908c245>] 
get_online_cpus.part.14+0x5/0x50
[  368.674742]  #3:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffa908c27a>] 
get_online_cpus.part.14+0x3a/0x50
[  368.677494] 10 locks held by systemd-sleep/889:
[  368.678650]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffffa924779b>] 
vfs_write+0x17b/0x1a0
[  368.680483]  #1:  (&of->mutex){+.+.+.}, at: [<ffffffffa92db703>] 
kernfs_fop_write+0x123/0x1e0
[  368.682412]  #2:  (s_active#257){.+.+.+}, at: [<ffffffffa92db70c>] 
kernfs_fop_write+0x12c/0x1e0
[  368.684440]  #3:  (autosleep_lock){+.+.+.}, at: [<ffffffffa90da3b7>] 
pm_autosleep_lock+0x17/0x20
[  368.686707]  #4:  (pm_mutex){+.+.+.}, at: [<ffffffffa90d28f8>] pm_suspend
+0x88/0x490
[  368.688086]  #5:  (acpi_scan_lock){+.+.+.}, at: [<ffffffffa941c907>] 
acpi_scan_lock_acquire+0x17/0x20
[  368.690213]  #6:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffffa908e160>] 
freeze_secondary_cpus+0x30/0x3c0
[  368.692016]  #7:  (cpu_hotplug.dep_map){++++++}, at: [<ffffffffa908da75>] 
cpu_hotplug_begin+0x5/0xe0
[  368.694347]  #8:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffa908daf3>] 
cpu_hotplug_begin+0x83/0xe0
[  368.696010]  #9:  (all_q_mutex){+.+...}, at: [<ffffffffa933e05a>] 
blk_mq_queue_reinit_work+0x1a/0x110
[  368.698624] 
[  368.698990] =============================================
[  368.698990]
===

Deadlock with CPU hotplug?

On sobota 29. července 2017 17:27:41 CEST Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
> 
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch
> applied blk-mq breaks suspend to RAM for me. It is reproducible on my
> laptop as well as in a VM.
> 
> I use complex disk layout involving MD, LUKS and LVM, and managed to get
> these warnings from VM via serial console when suspend fails:
> 
> ===
> [  245.516573] INFO: task kworker/0:1:49 blocked for more than 120 seconds.
> [  245.520025]       Not tainted 4.12.0-pf4 #1
> [  245.521836] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.525612] kworker/0:1     D    0    49      2 0x00000000
> [  245.527515] Workqueue: events vmstat_shepherd
> [  245.528685] Call Trace:
> [  245.529296]  __schedule+0x459/0xe40
> [  245.530115]  ? kvm_clock_read+0x25/0x40
> [  245.531003]  ? ktime_get+0x40/0xa0
> [  245.531819]  schedule+0x3d/0xb0
> [  245.532542]  ? schedule+0x3d/0xb0
> [  245.533299]  schedule_preempt_disabled+0x15/0x20
> [  245.534367]  __mutex_lock.isra.5+0x295/0x530
> [  245.535351]  __mutex_lock_slowpath+0x13/0x20
> [  245.536362]  ? __mutex_lock_slowpath+0x13/0x20
> [  245.537334]  mutex_lock+0x25/0x30
> [  245.538118]  get_online_cpus.part.14+0x15/0x30
> [  245.539588]  get_online_cpus+0x20/0x30
> [  245.540560]  vmstat_shepherd+0x21/0xc0
> [  245.541538]  process_one_work+0x1de/0x430
> [  245.542364]  worker_thread+0x47/0x3f0
> [  245.543042]  kthread+0x125/0x140
> [  245.543649]  ? process_one_work+0x430/0x430
> [  245.544417]  ? kthread_create_on_node+0x70/0x70
> [  245.545737]  ret_from_fork+0x25/0x30
> [  245.546490] INFO: task md0_raid10:459 blocked for more than 120 seconds.
> [  245.547668]       Not tainted 4.12.0-pf4 #1
> [  245.548769] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.550133] md0_raid10      D    0   459      2 0x00000000
> [  245.551092] Call Trace:
> [  245.551539]  __schedule+0x459/0xe40
> [  245.552163]  schedule+0x3d/0xb0
> [  245.552728]  ? schedule+0x3d/0xb0
> [  245.553344]  md_super_wait+0x6e/0xa0 [md_mod]
> [  245.554118]  ? wake_bit_function+0x60/0x60
> [  245.554854]  md_update_sb.part.60+0x3df/0x840 [md_mod]
> [  245.555771]  md_check_recovery+0x215/0x4b0 [md_mod]
> [  245.556732]  raid10d+0x62/0x13c0 [raid10]
> [  245.557456]  ? schedule+0x3d/0xb0
> [  245.558169]  ? schedule+0x3d/0xb0
> [  245.558803]  ? schedule_timeout+0x21f/0x330
> [  245.559593]  md_thread+0x120/0x160 [md_mod]
> [  245.560380]  ? md_thread+0x120/0x160 [md_mod]
> [  245.561202]  ? wake_bit_function+0x60/0x60
> [  245.561975]  kthread+0x125/0x140
> [  245.562601]  ? find_pers+0x70/0x70 [md_mod]
> [  245.563394]  ? kthread_create_on_node+0x70/0x70
> [  245.564516]  ret_from_fork+0x25/0x30
> [  245.565669] INFO: task dmcrypt_write:487 blocked for more than 120
> seconds. [  245.567151]       Not tainted 4.12.0-pf4 #1
> [  245.567946] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.569390] dmcrypt_write   D    0   487      2 0x00000000
> [  245.570352] Call Trace:
> [  245.570801]  __schedule+0x459/0xe40
> [  245.571601]  ? preempt_schedule_common+0x1f/0x30
> [  245.572421]  ? ___preempt_schedule+0x16/0x18
> [  245.573168]  schedule+0x3d/0xb0
> [  245.573733]  ? schedule+0x3d/0xb0
> [  245.574378]  md_write_start+0xe3/0x270 [md_mod]
> [  245.575180]  ? wake_bit_function+0x60/0x60
> [  245.575915]  raid10_make_request+0x3f/0x140 [raid10]
> [  245.576827]  md_make_request+0xa9/0x2a0 [md_mod]
> [  245.577659]  generic_make_request+0x11e/0x2f0
> [  245.578464]  dmcrypt_write+0x22d/0x250 [dm_crypt]
> [  245.579288]  ? dmcrypt_write+0x22d/0x250 [dm_crypt]
> [  245.580183]  ? wake_up_process+0x20/0x20
> [  245.580875]  kthread+0x125/0x140
> [  245.581455]  ? kthread+0x125/0x140
> [  245.582064]  ? crypt_iv_essiv_dtr+0x70/0x70 [dm_crypt]
> [  245.582971]  ? kthread_create_on_node+0x70/0x70
> [  245.583772]  ret_from_fork+0x25/0x30
> [  245.584424] INFO: task btrfs-transacti:553 blocked for more than 120
> seconds.
> [  245.585830]       Not tainted 4.12.0-pf4 #1
> [  245.586596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.587947] btrfs-transacti D    0   553      2 0x00000000
> [  245.588922] Call Trace:
> [  245.589377]  __schedule+0x459/0xe40
> [  245.590062]  ? btrfs_submit_bio_hook+0x8c/0x180 [btrfs]
> [  245.590976]  schedule+0x3d/0xb0
> [  245.591538]  ? schedule+0x3d/0xb0
> [  245.592175]  io_schedule+0x16/0x40
> [  245.592794]  wait_on_page_bit_common+0xe3/0x180
> [  245.593589]  ? page_cache_tree_insert+0xc0/0xc0
> [  245.594387]  __filemap_fdatawait_range+0x12a/0x1a0
> [  245.595229]  filemap_fdatawait_range+0x14/0x30
> [  245.596033]  btrfs_wait_ordered_range+0x6b/0x110 [btrfs]
> [  245.596998]  ? _raw_spin_unlock+0x10/0x30
> [  245.597735]  __btrfs_wait_cache_io+0x45/0x1c0 [btrfs]
> [  245.599091]  btrfs_wait_cache_io+0x2c/0x30 [btrfs]
> [  245.600789]  btrfs_write_dirty_block_groups+0x206/0x390 [btrfs]
> [  245.602068]  commit_cowonly_roots+0x221/0x2c0 [btrfs]
> [  245.602981]  btrfs_commit_transaction+0x3c4/0x900 [btrfs]
> [  245.603940]  transaction_kthread+0x190/0x1c0 [btrfs]
> [  245.604806]  kthread+0x125/0x140
> [  245.605398]  ? btrfs_cleanup_transaction+0x500/0x500 [btrfs]
> [  245.606414]  ? kthread_create_on_node+0x70/0x70
> [  245.607213]  ret_from_fork+0x25/0x30
> [  245.607846] INFO: task systemd-sleep:677 blocked for more than 120
> seconds. [  245.609061]       Not tainted 4.12.0-pf4 #1
> [  245.609858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  245.611202] systemd-sleep   D    0   677      1 0x00000000
> [  245.612160] Call Trace:
> [  245.612604]  __schedule+0x459/0xe40
> [  245.613231]  ? blk_mq_queue_reinit_work+0x100/0x100
> [  245.614080]  schedule+0x3d/0xb0
> [  245.614637]  ? schedule+0x3d/0xb0
> [  245.615222]  blk_mq_freeze_queue_wait+0x46/0xb0
> [  245.616015]  ? wake_bit_function+0x60/0x60
> [  245.616778]  blk_mq_queue_reinit_work+0x6e/0x100
> [  245.617592]  blk_mq_queue_reinit_dead+0x3d/0x50
> [  245.618420]  ? __flow_cache_shrink+0x170/0x170
> [  245.619199]  cpuhp_invoke_callback+0x84/0x430
> [  245.619965]  ? cpuhp_invoke_callback+0x84/0x430
> [  245.620754]  ? __flow_cache_shrink+0x170/0x170
> [  245.621530]  cpuhp_down_callbacks+0x42/0x80
> [  245.622267]  _cpu_down+0xc5/0x100
> [  245.622856]  freeze_secondary_cpus+0x9e/0x230
> [  245.623629]  suspend_devices_and_enter+0x2e6/0x800
> [  245.624471]  pm_suspend+0x349/0x3c0
> [  245.625090]  state_store+0x49/0xa0
> [  245.625699]  kobj_attr_store+0xf/0x20
> [  245.626379]  sysfs_kf_write+0x37/0x40
> [  245.627038]  kernfs_fop_write+0x11c/0x1a0
> [  245.627800]  __vfs_write+0x37/0x140
> [  245.628448]  ? handle_mm_fault+0xde/0x220
> [  245.629155]  vfs_write+0xb1/0x1a0
> [  245.629745]  SyS_write+0x55/0xc0
> [  245.630337]  ? trace_do_page_fault+0x37/0xf0
> [  245.631088]  entry_SYSCALL_64_fastpath+0x1a/0xa5
> [  245.632109] RIP: 0033:0x7f810459dbf0
> [  245.632821] RSP: 002b:00007ffec9ba2e58 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000001
> [  245.634127] RAX: ffffffffffffffda RBX: 00000000000000c6 RCX:
> 00007f810459dbf0
> [  245.635353] RDX: 0000000000000004 RSI: 000000b716c73390 RDI:
> 0000000000000004
> [  245.636778] RBP: 000000000000270f R08: 000000b716c73240 R09:
> 00007f8104a718c0
> [  245.638254] R10: 00007f810485ead8 R11: 0000000000000246 R12:
> 00007f810485ead8
> [  245.639500] R13: 0000000000001010 R14: 000000b716c73380 R15:
> 00007f810485ea80
> ===
> 
> With blk-mq disabled everything works okay.
> 
> How can I help you in solving this issue?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-07-29 15:27 blk-mq breaks suspend even with runtime PM patch Oleksandr Natalenko
@ 2017-07-30  5:12   ` Mike Galbraith
  2017-07-30  5:12   ` Mike Galbraith
  1 sibling, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2017-07-30  5:12 UTC (permalink / raw)
  To: Oleksandr Natalenko, Jens Axboe
  Cc: Christoph Hellwig, linux-block, linux-kernel

On Sat, 2017-07-29 at 17:27 +0200, Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
>=20
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch app=
lied=20
> blk-mq breaks suspend to RAM for me. It is reproducible on my laptop as w=
ell=20
> as in a VM.
>=20
> I use complex disk layout involving MD, LUKS and LVM, and managed to get =
these=20
> warnings from VM via serial console when suspend fails:
>=20
> =3D=3D=3D
> [  245.516573] INFO: task kworker/0:1:49 blocked for more than 120 second=
s.
> [  245.520025]       Not tainted 4.12.0-pf4 #1

FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
stable fixed it. =C2=A0If not, I'd find these two commits irresistible.

5f042e7cbd9eb=C2=A0blk-mq: Include all present CPUs in the default queue ma=
pping
4b855ad37194f blk-mq: Create hctx for each present CPU

'course applying random upstream bits does come with some risk, trying
a kernel already containing them has less "entertainment" potential.=C2=A0

	-Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
@ 2017-07-30  5:12   ` Mike Galbraith
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2017-07-30  5:12 UTC (permalink / raw)
  To: Oleksandr Natalenko, Jens Axboe
  Cc: Christoph Hellwig, linux-block, linux-kernel

On Sat, 2017-07-29 at 17:27 +0200, Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
> 
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch applied 
> blk-mq breaks suspend to RAM for me. It is reproducible on my laptop as well 
> as in a VM.
> 
> I use complex disk layout involving MD, LUKS and LVM, and managed to get these 
> warnings from VM via serial console when suspend fails:
> 
> ===
> [  245.516573] INFO: task kworker/0:1:49 blocked for more than 120 seconds.
> [  245.520025]       Not tainted 4.12.0-pf4 #1

FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
stable fixed it.  If not, I'd find these two commits irresistible.

5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
4b855ad37194f blk-mq: Create hctx for each present CPU

'course applying random upstream bits does come with some risk, trying
a kernel already containing them has less "entertainment" potential. 

	-Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-07-30  5:12   ` Mike Galbraith
@ 2017-07-30 13:50     ` Oleksandr Natalenko
  -1 siblings, 0 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-07-30 13:50 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Jens Axboe, Christoph Hellwig, linux-block, linux-kernel, gregkh

Hello Mike et al.

On ned=C4=9Ble 30. =C4=8Dervence 2017 7:12:31 CEST Mike Galbraith wrote:
> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> stable fixed it.

My build already includes v4.12.4.

> If not, I'd find these two commits irresistible.
>=20
> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mappi=
ng
> 4b855ad37194f blk-mq: Create hctx for each present CPU

I've applied these 2 commits, and cannot reproduce the issue anymore. Looks=
=20
like a perfect hit, thanks!

> 'course applying random upstream bits does come with some risk, trying
> a kernel already containing them has less "entertainment" potential.=20

Should you consider applying them to v4.12.x stable series? CC'ing Greg jus=
t=20
in case.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
@ 2017-07-30 13:50     ` Oleksandr Natalenko
  0 siblings, 0 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-07-30 13:50 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Jens Axboe, Christoph Hellwig, linux-block, linux-kernel, gregkh

Hello Mike et al.

On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> stable fixed it.

My build already includes v4.12.4.

> If not, I'd find these two commits irresistible.
> 
> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> 4b855ad37194f blk-mq: Create hctx for each present CPU

I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
like a perfect hit, thanks!

> 'course applying random upstream bits does come with some risk, trying
> a kernel already containing them has less "entertainment" potential. 

Should you consider applying them to v4.12.x stable series? CC'ing Greg just 
in case.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-07-30 13:50     ` Oleksandr Natalenko
  (?)
@ 2017-08-08 16:22     ` Greg KH
  2017-08-08 16:33       ` Jens Axboe
  2017-08-08 16:34         ` Mike Galbraith
  -1 siblings, 2 replies; 22+ messages in thread
From: Greg KH @ 2017-08-08 16:22 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: Mike Galbraith, Jens Axboe, Christoph Hellwig, linux-block, linux-kernel

On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> Hello Mike et al.
> 
> On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > stable fixed it.
> 
> My build already includes v4.12.4.
> 
> > If not, I'd find these two commits irresistible.
> > 
> > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > 4b855ad37194f blk-mq: Create hctx for each present CPU
> 
> I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
> like a perfect hit, thanks!
> 
> > 'course applying random upstream bits does come with some risk, trying
> > a kernel already containing them has less "entertainment" potential. 
> 
> Should you consider applying them to v4.12.x stable series? CC'ing Greg just 
> in case.

I can queue these up if I get an ack from the developers/maintainers
that it is ok to do so...

{hint}

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 16:22     ` Greg KH
@ 2017-08-08 16:33       ` Jens Axboe
  2017-08-08 16:36           ` Oleksandr Natalenko
  2017-08-08 16:34         ` Mike Galbraith
  1 sibling, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2017-08-08 16:33 UTC (permalink / raw)
  To: Greg KH, Oleksandr Natalenko
  Cc: Mike Galbraith, Christoph Hellwig, linux-block, linux-kernel

On 08/08/2017 10:22 AM, Greg KH wrote:
> On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
>> Hello Mike et al.
>>
>> On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
>>> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
>>> stable fixed it.
>>
>> My build already includes v4.12.4.
>>
>>> If not, I'd find these two commits irresistible.
>>>
>>> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
>>> 4b855ad37194f blk-mq: Create hctx for each present CPU
>>
>> I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
>> like a perfect hit, thanks!
>>
>>> 'course applying random upstream bits does come with some risk, trying
>>> a kernel already containing them has less "entertainment" potential. 
>>
>> Should you consider applying them to v4.12.x stable series? CC'ing Greg just 
>> in case.
> 
> I can queue these up if I get an ack from the developers/maintainers
> that it is ok to do so...

You can add those two commits to stable.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 16:22     ` Greg KH
@ 2017-08-08 16:34         ` Mike Galbraith
  2017-08-08 16:34         ` Mike Galbraith
  1 sibling, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2017-08-08 16:34 UTC (permalink / raw)
  To: Greg KH, Oleksandr Natalenko
  Cc: Jens Axboe, Christoph Hellwig, linux-block, linux-kernel

On Tue, 2017-08-08 at 09:22 -0700, Greg KH wrote:
> On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> > Hello Mike et al.
> >=20
> > On ned=C4=9Ble 30. =C4=8Dervence 2017 7:12:31 CEST Mike Galbraith wrote=
:
> > > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > > stable fixed it.
> >=20
> > My build already includes v4.12.4.
> >=20
> > > If not, I'd find these two commits irresistible.
> > >=20
> > > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue m=
apping
> > > 4b855ad37194f blk-mq: Create hctx for each present CPU
> >=20
> > I've applied these 2 commits, and cannot reproduce the issue anymore. L=
ooks=20
> > like a perfect hit, thanks!
> >=20
> > > 'course applying random upstream bits does come with some risk, tryin=
g
> > > a kernel already containing them has less "entertainment" potential.=
=20
> >=20
> > Should you consider applying them to v4.12.x stable series? CC'ing Greg=
 just=20
> > in case.
>=20
> I can queue these up if I get an ack from the developers/maintainers
> that it is ok to do so...
>=20
> {hint}

{hint++}

Those commits take Steven Rostedt's hotplug stress script runtime down
from 4 _minutes_ down to 7 seconds for my RT tree, so I'm rather hoping
you hear an "ACK" too.

	-Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
@ 2017-08-08 16:34         ` Mike Galbraith
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2017-08-08 16:34 UTC (permalink / raw)
  To: Greg KH, Oleksandr Natalenko
  Cc: Jens Axboe, Christoph Hellwig, linux-block, linux-kernel

On Tue, 2017-08-08 at 09:22 -0700, Greg KH wrote:
> On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> > Hello Mike et al.
> > 
> > On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> > > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > > stable fixed it.
> > 
> > My build already includes v4.12.4.
> > 
> > > If not, I'd find these two commits irresistible.
> > > 
> > > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > > 4b855ad37194f blk-mq: Create hctx for each present CPU
> > 
> > I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
> > like a perfect hit, thanks!
> > 
> > > 'course applying random upstream bits does come with some risk, trying
> > > a kernel already containing them has less "entertainment" potential. 
> > 
> > Should you consider applying them to v4.12.x stable series? CC'ing Greg just 
> > in case.
> 
> I can queue these up if I get an ack from the developers/maintainers
> that it is ok to do so...
> 
> {hint}

{hint++}

Those commits take Steven Rostedt's hotplug stress script runtime down
from 4 _minutes_ down to 7 seconds for my RT tree, so I'm rather hoping
you hear an "ACK" too.

	-Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 16:33       ` Jens Axboe
@ 2017-08-08 16:36           ` Oleksandr Natalenko
  0 siblings, 0 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-08-08 16:36 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Greg KH, Mike Galbraith, Christoph Hellwig, linux-block, linux-kernel

Could you queue "block: disable runtime-pm for blk-mq" too please? It is al=
so=20
related to suspend-resume freezes that were observed by multiple users.

Thanks.

On =C3=BAter=C3=BD 8. srpna 2017 18:33:29 CEST Jens Axboe wrote:
> On 08/08/2017 10:22 AM, Greg KH wrote:
> > On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> >> Hello Mike et al.
> >>=20
> >> On ned=C4=9Ble 30. =C4=8Dervence 2017 7:12:31 CEST Mike Galbraith wrot=
e:
> >>> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> >>> stable fixed it.
> >>=20
> >> My build already includes v4.12.4.
> >>=20
> >>> If not, I'd find these two commits irresistible.
> >>>=20
> >>> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue
> >>> mapping
> >>> 4b855ad37194f blk-mq: Create hctx for each present CPU
> >>=20
> >> I've applied these 2 commits, and cannot reproduce the issue anymore.
> >> Looks
> >> like a perfect hit, thanks!
> >>=20
> >>> 'course applying random upstream bits does come with some risk, trying
> >>> a kernel already containing them has less "entertainment" potential.
> >>=20
> >> Should you consider applying them to v4.12.x stable series? CC'ing Greg
> >> just in case.
> >=20
> > I can queue these up if I get an ack from the developers/maintainers
> > that it is ok to do so...
>=20
> You can add those two commits to stable.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
@ 2017-08-08 16:36           ` Oleksandr Natalenko
  0 siblings, 0 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-08-08 16:36 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Greg KH, Mike Galbraith, Christoph Hellwig, linux-block, linux-kernel

Could you queue "block: disable runtime-pm for blk-mq" too please? It is also 
related to suspend-resume freezes that were observed by multiple users.

Thanks.

On úterý 8. srpna 2017 18:33:29 CEST Jens Axboe wrote:
> On 08/08/2017 10:22 AM, Greg KH wrote:
> > On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> >> Hello Mike et al.
> >> 
> >> On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> >>> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> >>> stable fixed it.
> >> 
> >> My build already includes v4.12.4.
> >> 
> >>> If not, I'd find these two commits irresistible.
> >>> 
> >>> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue
> >>> mapping
> >>> 4b855ad37194f blk-mq: Create hctx for each present CPU
> >> 
> >> I've applied these 2 commits, and cannot reproduce the issue anymore.
> >> Looks
> >> like a perfect hit, thanks!
> >> 
> >>> 'course applying random upstream bits does come with some risk, trying
> >>> a kernel already containing them has less "entertainment" potential.
> >> 
> >> Should you consider applying them to v4.12.x stable series? CC'ing Greg
> >> just in case.
> > 
> > I can queue these up if I get an ack from the developers/maintainers
> > that it is ok to do so...
> 
> You can add those two commits to stable.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 16:36           ` Oleksandr Natalenko
  (?)
@ 2017-08-08 16:43           ` Greg KH
  2017-08-08 16:46               ` Oleksandr Natalenko
  -1 siblings, 1 reply; 22+ messages in thread
From: Greg KH @ 2017-08-08 16:43 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: Jens Axboe, Mike Galbraith, Christoph Hellwig, linux-block, linux-kernel

On Tue, Aug 08, 2017 at 06:36:01PM +0200, Oleksandr Natalenko wrote:
> Could you queue "block: disable runtime-pm for blk-mq" too please? It is also 
> related to suspend-resume freezes that were observed by multiple users.

What is the git commit id of that patch?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 16:34         ` Mike Galbraith
  (?)
@ 2017-08-08 16:44         ` Greg KH
  2017-08-08 16:50             ` Mike Galbraith
  -1 siblings, 1 reply; 22+ messages in thread
From: Greg KH @ 2017-08-08 16:44 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Oleksandr Natalenko, Jens Axboe, Christoph Hellwig, linux-block,
	linux-kernel

On Tue, Aug 08, 2017 at 06:34:01PM +0200, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 09:22 -0700, Greg KH wrote:
> > On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> > > Hello Mike et al.
> > > 
> > > On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> > > > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > > > stable fixed it.
> > > 
> > > My build already includes v4.12.4.
> > > 
> > > > If not, I'd find these two commits irresistible.
> > > > 
> > > > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > > > 4b855ad37194f blk-mq: Create hctx for each present CPU
> > > 
> > > I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
> > > like a perfect hit, thanks!
> > > 
> > > > 'course applying random upstream bits does come with some risk, trying
> > > > a kernel already containing them has less "entertainment" potential. 
> > > 
> > > Should you consider applying them to v4.12.x stable series? CC'ing Greg just 
> > > in case.
> > 
> > I can queue these up if I get an ack from the developers/maintainers
> > that it is ok to do so...
> > 
> > {hint}
> 
> {hint++}
> 
> Those commits take Steven Rostedt's hotplug stress script runtime down
> from 4 _minutes_ down to 7 seconds for my RT tree, so I'm rather hoping
> you hear an "ACK" too.

Oh, nice!

Should these go back farther than 4.12?  Looks like they apply cleanly
to 4.9, didn't look older than that...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 16:43           ` Greg KH
@ 2017-08-08 16:46               ` Oleksandr Natalenko
  0 siblings, 0 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-08-08 16:46 UTC (permalink / raw)
  To: Greg KH
  Cc: Jens Axboe, Mike Galbraith, Christoph Hellwig, linux-block, linux-kernel

Greg,

this is 765e40b675a9566459ddcb8358ad16f3b8344bbe.

On =FAter=FD 8. srpna 2017 18:43:33 CEST Greg KH wrote:
> On Tue, Aug 08, 2017 at 06:36:01PM +0200, Oleksandr Natalenko wrote:
> > Could you queue "block: disable runtime-pm for blk-mq" too please? It is
> > also related to suspend-resume freezes that were observed by multiple
> > users.
> What is the git commit id of that patch?
>=20
> thanks,
>=20
> greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
@ 2017-08-08 16:46               ` Oleksandr Natalenko
  0 siblings, 0 replies; 22+ messages in thread
From: Oleksandr Natalenko @ 2017-08-08 16:46 UTC (permalink / raw)
  To: Greg KH
  Cc: Jens Axboe, Mike Galbraith, Christoph Hellwig, linux-block, linux-kernel

Greg,

this is 765e40b675a9566459ddcb8358ad16f3b8344bbe.

On úterý 8. srpna 2017 18:43:33 CEST Greg KH wrote:
> On Tue, Aug 08, 2017 at 06:36:01PM +0200, Oleksandr Natalenko wrote:
> > Could you queue "block: disable runtime-pm for blk-mq" too please? It is
> > also related to suspend-resume freezes that were observed by multiple
> > users.
> What is the git commit id of that patch?
> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 16:44         ` Greg KH
@ 2017-08-08 16:50             ` Mike Galbraith
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2017-08-08 16:50 UTC (permalink / raw)
  To: Greg KH
  Cc: Oleksandr Natalenko, Jens Axboe, Christoph Hellwig, linux-block,
	linux-kernel

On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
> 
> Should these go back farther than 4.12?  Looks like they apply cleanly
> to 4.9, didn't look older than that...

I met prerequisites at 4.11, but I wasn't patching anything remotely
resembling virgin source.

	-Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
@ 2017-08-08 16:50             ` Mike Galbraith
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2017-08-08 16:50 UTC (permalink / raw)
  To: Greg KH
  Cc: Oleksandr Natalenko, Jens Axboe, Christoph Hellwig, linux-block,
	linux-kernel

On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
> 
> Should these go back farther than 4.12?  Looks like they apply cleanly
> to 4.9, didn't look older than that...

I met prerequisites at 4.11, but I wasn't patching anything remotely
resembling virgin source.

	-Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 16:50             ` Mike Galbraith
@ 2017-08-08 18:33               ` Mike Galbraith
  -1 siblings, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2017-08-08 18:33 UTC (permalink / raw)
  To: Greg KH
  Cc: Oleksandr Natalenko, Jens Axboe, Christoph Hellwig, linux-block,
	linux-kernel

On Tue, 2017-08-08 at 18:50 +0200, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
> >=20
> > Should these go back farther than 4.12?  Looks like they apply cleanly
> > to 4.9, didn't look older than that...
>=20
> I met prerequisites at 4.11...

FWIW, I took/modified 2d0364c8c1a9. =C2=A0Dunno if the suspend regression
exists in 4.11 though, without which you'll likely want nothing.

	-Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
@ 2017-08-08 18:33               ` Mike Galbraith
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2017-08-08 18:33 UTC (permalink / raw)
  To: Greg KH
  Cc: Oleksandr Natalenko, Jens Axboe, Christoph Hellwig, linux-block,
	linux-kernel

On Tue, 2017-08-08 at 18:50 +0200, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
> > 
> > Should these go back farther than 4.12?  Looks like they apply cleanly
> > to 4.9, didn't look older than that...
> 
> I met prerequisites at 4.11...

FWIW, I took/modified 2d0364c8c1a9.  Dunno if the suspend regression
exists in 4.11 though, without which you'll likely want nothing.

	-Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: blk-mq breaks suspend even with runtime PM patch
  2017-08-08 18:33               ` Mike Galbraith
  (?)
@ 2017-08-08 18:35               ` Jens Axboe
  -1 siblings, 0 replies; 22+ messages in thread
From: Jens Axboe @ 2017-08-08 18:35 UTC (permalink / raw)
  To: Mike Galbraith, Greg KH
  Cc: Oleksandr Natalenko, Christoph Hellwig, linux-block, linux-kernel

On 08/08/2017 12:33 PM, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 18:50 +0200, Mike Galbraith wrote:
>> On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
>>>
>>> Should these go back farther than 4.12?  Looks like they apply cleanly
>>> to 4.9, didn't look older than that...
>>
>> I met prerequisites at 4.11...
> 
> FWIW, I took/modified 2d0364c8c1a9.  Dunno if the suspend regression
> exists in 4.11 though, without which you'll likely want nothing.

It does exist, the only change here is that we default people to
scsi-mq in 4.12+. Honestly, nobody complained since we've had scsi-mq,
so I could pivot both ways on whether we really need the changes in
earlier versions or not.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-08-08 18:35 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-29 15:27 blk-mq breaks suspend even with runtime PM patch Oleksandr Natalenko
2017-07-29 21:17 ` Oleksandr Natalenko
2017-07-29 21:17   ` Oleksandr Natalenko
2017-07-30  5:12 ` Mike Galbraith
2017-07-30  5:12   ` Mike Galbraith
2017-07-30 13:50   ` Oleksandr Natalenko
2017-07-30 13:50     ` Oleksandr Natalenko
2017-08-08 16:22     ` Greg KH
2017-08-08 16:33       ` Jens Axboe
2017-08-08 16:36         ` Oleksandr Natalenko
2017-08-08 16:36           ` Oleksandr Natalenko
2017-08-08 16:43           ` Greg KH
2017-08-08 16:46             ` Oleksandr Natalenko
2017-08-08 16:46               ` Oleksandr Natalenko
2017-08-08 16:34       ` Mike Galbraith
2017-08-08 16:34         ` Mike Galbraith
2017-08-08 16:44         ` Greg KH
2017-08-08 16:50           ` Mike Galbraith
2017-08-08 16:50             ` Mike Galbraith
2017-08-08 18:33             ` Mike Galbraith
2017-08-08 18:33               ` Mike Galbraith
2017-08-08 18:35               ` Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.