Linux-rt-users archive on lore.kernel.org
 help / color / Atom feed
* KVM Arm64 and Linux-RT issues
@ 2019-07-23 17:58 Julien Grall
  2019-07-24  8:53 ` Marc Zyngier
  0 siblings, 1 reply; 13+ messages in thread
From: Julien Grall @ 2019-07-23 17:58 UTC (permalink / raw)
  To: linux-rt-users, anna-maria, bigeasy
  Cc: Marc Zyngier, kvmarm, Julien Thierry, James Morse, Suzuki K Poulose

Hi all,

I have been playing with the latest branch of Linux RT (5.2-rt1) and notice the 
following splat when starting a KVM guest.

[  122.336254] 003: BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:968
[  122.336263] 003: in_atomic(): 1, irqs_disabled(): 0, pid: 1430, name: kvm-vcpu-1
[  122.336267] 003: 2 locks held by kvm-vcpu-1/1430:
[  122.336271] 003:  #0: ffff8007c1518100 (&vcpu->mutex){+.+.}, at: 
kvm_vcpu_ioctl+0x70/0xae0
[  122.336287] 003:  #1: ffff8007fb08b478 
(&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
[  122.336299] 003: Preemption disabled at:
[  122.336300] 003: [<ffff0000111a44e8>] schedule+0x30/0xd8
[  122.336308] 003: CPU: 3 PID: 1430 Comm: kvm-vcpu-1 Tainted: G        W 
5.2.0-rt1-00008-g5bc0332820fd #88
[  122.336311] 003: Hardware name: AMD Seattle (Rev.B0) Development Board 
(Overdrive) (DT)
[  122.336314] 003: Call trace:
[  122.336315] 003:  dump_backtrace+0x0/0x130
[  122.336319] 003:  show_stack+0x14/0x20
[  122.336321] 003:  dump_stack+0xbc/0x104
[  122.336324] 003:  ___might_sleep+0x198/0x238
[  122.336327] 003:  rt_spin_lock+0x5c/0x70
[  122.336330] 003:  hrtimer_grab_expiry_lock+0x24/0x40
[  122.336332] 003:  hrtimer_cancel+0x1c/0x38
[  122.336334] 003:  kvm_timer_vcpu_load+0x78/0x3e0
[  122.336338] 003:  kvm_arch_vcpu_load+0x130/0x298
[  122.336340] 003:  kvm_sched_in+0x38/0x68
[  122.336342] 003:  finish_task_switch+0x14c/0x300
[  122.336344] 003:  __schedule+0x2b8/0x8d0
[  122.336346] 003:  schedule+0x38/0xd8
[  122.336347] 003:  kvm_vcpu_block+0xac/0x790
[  122.336349] 003:  kvm_handle_wfx+0x210/0x520
[  122.336352] 003:  handle_exit+0x134/0x1d0
[  122.336355] 003:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
[  122.336357] 003:  kvm_vcpu_ioctl+0x3a0/0xae0
[  122.336359] 003:  do_vfs_ioctl+0xbc/0x910
[  122.336363] 003:  ksys_ioctl+0x78/0xa8
[  122.336365] 003:  __arm64_sys_ioctl+0x1c/0x28
[  122.336367] 003:  el0_svc_common.constprop.0+0x90/0x188
[  122.336370] 003:  el0_svc_handler+0x28/0x78
[  122.336373] 003:  el0_svc+0x8/0xc
[  122.564216] 000: BUG: scheduling while atomic: kvm-vcpu-1/1430/0x00000002
[  122.564221] 000: 2 locks held by kvm-vcpu-1/1430:
[  122.564224] 000:  #0: ffff8007c1518100 (&vcpu->mutex){+.+.}, at: 
kvm_vcpu_ioctl+0x70/0xae0
[  122.564236] 000:  #1: ffff8007fb08b478 
(&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
[  122.564245] 000: Modules linked in:
[  122.564248] 000: Preemption disabled at:
[  122.564249] 000: [<ffff0000111a44e8>] schedule+0x30/0xd8
[  122.564255] 000: CPU: 0 PID: 1430 Comm: kvm-vcpu-1 Tainted: G        W 
5.2.0-rt1-00008-g5bc0332820fd #88
[  122.564259] 000: Hardware name: AMD Seattle (Rev.B0) Development Board 
(Overdrive) (DT)
[  122.564261] 000: Call trace:
[  122.564262] 000:  dump_backtrace+0x0/0x130
[  122.564264] 000:  show_stack+0x14/0x20
[  122.564267] 000:  dump_stack+0xbc/0x104
[  122.564269] 000:  __schedule_bug+0xbc/0xe8
[  122.564272] 000:  __schedule+0x6e4/0x8d0
[  122.564274] 000:  schedule+0x38/0xd8
[  122.564275] 000:  rt_spin_lock_slowlock_locked+0xf8/0x390
[  122.564277] 000:  rt_spin_lock_slowlock+0x68/0xa0
[  122.564279] 000:  rt_spin_lock+0x64/0x70
[  122.564281] 000:  hrtimer_grab_expiry_lock+0x24/0x40
[  122.564283] 000:  hrtimer_cancel+0x1c/0x38
[  122.564285] 000:  kvm_timer_vcpu_load+0x78/0x3e0
[  122.564287] 000:  kvm_arch_vcpu_load+0x130/0x298
[  122.564290] 000:  kvm_sched_in+0x38/0x68
[  122.564291] 000:  finish_task_switch+0x14c/0x300
[  122.564293] 000:  __schedule+0x2b8/0x8d0
[  122.564295] 000:  schedule+0x38/0xd8
[  122.564297] 000:  kvm_vcpu_block+0xac/0x790
[  122.564298] 000:  kvm_handle_wfx+0x210/0x520
[  122.564301] 000:  handle_exit+0x134/0x1d0
[  122.564303] 000:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
[  122.564305] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  122.564307] 000:  do_vfs_ioctl+0xbc/0x910
[  122.564309] 000:  ksys_ioctl+0x78/0xa8
[  122.564311] 000:  __arm64_sys_ioctl+0x1c/0x28
[  122.564313] 000:  el0_svc_common.constprop.0+0x90/0x188
[  122.564316] 000:  el0_svc_handler+0x28/0x78
[  122.564318] 000:  el0_svc+0x8/0xc
[  122.564341] 000: WARNING: CPU: 0 PID: 1430 at kernel/sched/core.c:7366 
migrate_enable+0x21c/0x428
[  122.564346] 000: Modules linked in:
[  122.564348] 000: CPU: 0 PID: 1430 Comm: kvm-vcpu-1 Tainted: G        W 
5.2.0-rt1-00008-g5bc0332820fd #88
[  122.564351] 000: Hardware name: AMD Seattle (Rev.B0) Development Board 
(Overdrive) (DT)
[  122.564352] 000: pstate: 60000005 (nZCv daif -PAN -UAO)
[  122.564355] 000: pc : migrate_enable+0x21c/0x428
[  122.564357] 000: lr : rt_spin_unlock+0x2c/0x48
[  122.564359] 000: sp : ffff00002648b680
[  122.564362] 000: x29: ffff00002648b680 x28: ffff000011af9b00
[  122.564365] 000: x27: ffff8007fb03d918 x26: 00008007e9579000
[  122.564368] 000: x25: 0000000000000000 x24: fffefdf7d6910c00
[  122.564371] 000: x23: 0000000000000000 x22: 0000000000000008
[  122.564374] 000: x21: 0000000000000001 x20: ffff8007c284cd80
[  122.564377] 000: x19: ffff000011af0d88 x18: 000000000004ffff
[  122.564379] 000: x17: 0000000000000000 x16: 0000000000000000
[  122.564382] 000: x15: ffff000011f7ae50 x14: 0000000000020f9e
[  122.564385] 000: x13: 0000000000000028 x12: 0000000000000002
[  122.564387] 000: x11: 0000000000000028 x10: ffff000011af2000
[  122.564390] 000: x9 : 0000000000000000 x8 : 0000000000096855
[  122.564393] 000: x7 : 0000000000000001 x6 : 0000000000000000
[  122.564395] 000: x5 : 00008007e9579000 x4 : 0000000000000001
[  122.564398] 000: x3 : 0000000000000001 x2 : 0000000000000000
[  122.564400] 000: x1 : 0000000000000070 x0 : 0000000000000000
[  122.564402] 000: Call trace:
[  122.564403] 000:  migrate_enable+0x21c/0x428
[  122.564406] 000:  rt_spin_unlock+0x2c/0x48
[  122.564408] 000:  hrtimer_grab_expiry_lock+0x30/0x40
[  122.564410] 000:  hrtimer_cancel+0x1c/0x38
[  122.564411] 000:  kvm_timer_vcpu_load+0x78/0x3e0
[  122.564414] 000:  kvm_arch_vcpu_load+0x130/0x298
[  122.564416] 000:  kvm_sched_in+0x38/0x68
[  122.564418] 000:  finish_task_switch+0x14c/0x300
[  122.564420] 000:  __schedule+0x2b8/0x8d0
[  122.564421] 000:  schedule+0x38/0xd8
[  122.564423] 000:  kvm_vcpu_block+0xac/0x790
[  122.564425] 000:  kvm_handle_wfx+0x210/0x520
[  122.564427] 000:  handle_exit+0x134/0x1d0
[  122.564429] 000:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
[  122.564431] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  122.564433] 000:  do_vfs_ioctl+0xbc/0x910
[  122.564435] 000:  ksys_ioctl+0x78/0xa8
[  122.564437] 000:  __arm64_sys_ioctl+0x1c/0x28
[  122.564439] 000:  el0_svc_common.constprop.0+0x90/0x188
[  122.564441] 000:  el0_svc_handler+0x28/0x78
[  122.564443] 000:  el0_svc+0x8/0xc
[  122.564446] 000: irq event stamp: 1855748
[  122.564447] 000: hardirqs last  enabled at (1855747): [<ffff0000111a8a44>] 
_raw_spin_unlock_irqrestore+0x94/0xa0
[  122.564450] 000: hardirqs last disabled at (1855748): [<ffff00001008193c>] 
do_debug_exception+0x94/0x208
[  122.564453] 000: softirqs last  enabled at (1567014): [<ffff000010088090>] 
fpsimd_restore_current_state+0x50/0xa0
[  122.564457] 000: softirqs last disabled at (1567011): [<ffff000010088068>] 
fpsimd_restore_current_state+0x28/0xa0
[  122.564460] 000: ---[ end trace 0000000000000003 ]---
[  122.564488] 000: WARNING: CPU: 0 PID: 1430 at kernel/cpu.c:331 
unpin_current_cpu+0x94/0x138
[  122.564494] 000: Modules linked in:
[  122.564496] 000: CPU: 0 PID: 1430 Comm: kvm-vcpu-1 Tainted: G        W 
5.2.0-rt1-00008-g5bc0332820fd #88
[  122.564498] 000: Hardware name: AMD Seattle (Rev.B0) Development Board 
(Overdrive) (DT)
[  122.564499] 000: pstate: a0000005 (NzCv daif -PAN -UAO)
[  122.564502] 000: pc : unpin_current_cpu+0x94/0x138
[  122.564504] 000: lr : unpin_current_cpu+0x64/0x138
[  122.564506] 000: sp : ffff00002648b640
[  122.564508] 000: x29: ffff00002648b640 x28: ffff000011af9b00
[  122.564511] 000: x27: ffff8007fb03d918 x26: 00008007e9579000
[  122.564513] 000: x25: 0000000000000000 x24: 00000000ffffffff
[  122.564516] 000: x23: ffff8007c284cd80 x22: ffff000011aa31f0
[  122.564519] 000: x21: 0000ffffee50f260 x20: ffff000011af0da8
[  122.564522] 000: x19: ffff8007fb01c2c8 x18: 000000000004ffff
[  122.564524] 000: x17: 0000000000000000 x16: 0000000000000000
[  122.564527] 000: x15: ffff000011f7ae50 x14: 0000000000020f9e
[  122.564529] 000: x13: 0000000000000028 x12: 0000000000000002
[  122.564532] 000: x11: 0000000000000028 x10: ffff000011af2000
[  122.564535] 000: x9 : 0000000000000000 x8 : 000000006cc11800
[  122.564537] 000: x7 : 0000000050000000 x6 : 000000007a82670e
[  122.564540] 000: x5 : 00008007e9579000 x4 : 0000000000000002
[  122.564542] 000: x3 : 00008007e9579000 x2 : 0000000000000001
[  122.564545] 000: x1 : ffff8007c284cd80 x0 : 0000000000000000
[  122.564547] 000: Call trace:
[  122.564548] 000:  unpin_current_cpu+0x94/0x138
[  122.564550] 000:  migrate_enable+0x190/0x428
[  122.564552] 000:  rt_spin_unlock+0x2c/0x48
[  122.564554] 000:  hrtimer_grab_expiry_lock+0x30/0x40
[  122.564556] 000:  hrtimer_cancel+0x1c/0x38
[  122.564558] 000:  kvm_timer_vcpu_load+0x78/0x3e0
[  122.564560] 000:  kvm_arch_vcpu_load+0x130/0x298
[  122.564562] 000:  kvm_sched_in+0x38/0x68
[  122.564564] 000:  finish_task_switch+0x14c/0x300
[  122.564566] 000:  __schedule+0x2b8/0x8d0
[  122.564567] 000:  schedule+0x38/0xd8
[  122.564569] 000:  kvm_vcpu_block+0xac/0x790
[  122.564571] 000:  kvm_handle_wfx+0x210/0x520
[  122.564573] 000:  handle_exit+0x134/0x1d0
[  122.564575] 000:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
[  122.564577] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  122.564579] 000:  do_vfs_ioctl+0xbc/0x910
[  122.564581] 000:  ksys_ioctl+0x78/0xa8
[  122.564583] 000:  __arm64_sys_ioctl+0x1c/0x28
[  122.564585] 000:  el0_svc_common.constprop.0+0x90/0x188
[  122.564588] 000:  el0_svc_handler+0x28/0x78
[  122.564590] 000:  el0_svc+0x8/0xc
[  122.564592] 000: irq event stamp: 1855754
[  122.564593] 000: hardirqs last  enabled at (1855753): [<ffff0000111a8a44>] 
_raw_spin_unlock_irqrestore+0x94/0xa0
[  122.564596] 000: hardirqs last disabled at (1855754): [<ffff00001008193c>] 
do_debug_exception+0x94/0x208
[  122.564598] 000: softirqs last  enabled at (1567014): [<ffff000010088090>] 
fpsimd_restore_current_state+0x50/0xa0
[  122.564601] 000: softirqs last disabled at (1567011): [<ffff000010088068>] 
fpsimd_restore_current_state+0x28/0xa0
[  122.564604] 000: ---[ end trace 0000000000000004 ]---
[  122.564786] 000: 
================================================================================
[  122.564787] 000: UBSAN: Undefined behaviour in 
/home/julieng/works/linux/kernel/cpu.c:332:15
[  122.564791] 000: index -1 is out of range for type 'long unsigned int [256]'
[  122.564794] 000: CPU: 0 PID: 1430 Comm: kvm-vcpu-1 Tainted: G        W 
5.2.0-rt1-00008-g5bc0332820fd #88
[  122.564797] 000: Hardware name: AMD Seattle (Rev.B0) Development Board 
(Overdrive) (DT)
[  122.564798] 000: Call trace:
[  122.564799] 000:  dump_backtrace+0x0/0x130
[  122.564802] 000:  show_stack+0x14/0x20
[  122.564804] 000:  dump_stack+0xbc/0x104
[  122.564806] 000:  ubsan_epilogue.isra.5+0xc/0x3c
[  122.564810] 000:  __ubsan_handle_out_of_bounds+0x78/0x88
[  122.564812] 000:  unpin_current_cpu+0x130/0x138
[  122.564814] 000:  migrate_enable+0x190/0x428
[  122.564816] 000:  rt_spin_unlock+0x2c/0x48
[  122.564819] 000:  hrtimer_grab_expiry_lock+0x30/0x40
[  122.564821] 000:  hrtimer_cancel+0x1c/0x38
[  122.564823] 000:  kvm_timer_vcpu_load+0x78/0x3e0
[  122.564825] 000:  kvm_arch_vcpu_load+0x130/0x298
[  122.564827] 000:  kvm_sched_in+0x38/0x68
[  122.564829] 000:  finish_task_switch+0x14c/0x300
[  122.564831] 000:  __schedule+0x2b8/0x8d0
[  122.564833] 000:  schedule+0x38/0xd8
[  122.564834] 000:  kvm_vcpu_block+0xac/0x790
[  122.564836] 000:  kvm_handle_wfx+0x210/0x520
[  122.564838] 000:  handle_exit+0x134/0x1d0
[  122.564841] 000:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
[  122.564843] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  122.564845] 000:  do_vfs_ioctl+0xbc/0x910
[  122.564847] 000:  ksys_ioctl+0x78/0xa8
[  122.564849] 000:  __arm64_sys_ioctl+0x1c/0x28
[  122.564851] 000:  el0_svc_common.constprop.0+0x90/0x188
[  122.564853] 000:  el0_svc_handler+0x28/0x78
[  122.564856] 000:  el0_svc+0x8/0xc
[  122.564858] 000: 
================================================================================
[  122.564868] 000: Unable to handle kernel paging request at virtual address 
ffff001f11e1e436
[  122.564871] 000: Mem abort info:
[  122.564872] 000:   ESR = 0x96000021
[  122.564874] 000:   Exception class = DABT (current EL), IL = 32 bits
[  122.564876] 000:   SET = 0, FnV = 0
[  122.564878] 000:   EA = 0, S1PTW = 0
[  122.564879] 000: Data abort info:
[  122.564880] 000:   ISV = 0, ISS = 0x00000021
[  122.564882] 000:   CM = 0, WnR = 0
[  122.564884] 000: swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000080038dc000
[  122.564887] 000: [ffff001f11e1e436] pgd=00000087fffff003, pud=0000000000000000
[  122.564944] 000: Internal error: Oops: 96000021 [#1] PREEMPT SMP
[  122.564948] 000: Modules linked in:
[  122.564949] 000: CPU: 0 PID: 1430 Comm: kvm-vcpu-1 Tainted: G        W 
5.2.0-rt1-00008-g5bc0332820fd #88
[  122.564952] 000: Hardware name: AMD Seattle (Rev.B0) Development Board 
(Overdrive) (DT)
[  122.564953] 000: pstate: 80000005 (Nzcv daif -PAN -UAO)
[  122.564955] 000: pc : __ll_sc_arch_atomic_sub_return+0x4/0x20
[  122.564959] 000: lr : __read_rt_unlock+0x1c/0x68
[  122.564963] 000: sp : ffff00002648b620
[  122.564964] 000: x29: ffff00002648b620 x28: ffff000011af9b00
[  122.564967] 000: x27: ffff8007fb03d918 x26: 00008007e9579000
[  122.564970] 000: x25: 0000000000000000 x24: 00000000ffffffff
[  122.564973] 000: x23: ffffffffffffffff x22: ffff000011aa31f0
[  122.564975] 000: x21: 0000000000000000 x20: ffff000011af0da8
[  122.564978] 000: x19: ffff001f11e1e39e x18: 000000000004ffff
[  122.564980] 000: x17: 0000000000000000 x16: 0000000000000000
[  122.564983] 000: x15: ffff000011f7ae50 x14: 0000000000020fac
[  122.564985] 000: x13: 0000000000000028 x12: ffff000012007d70
[  122.564987] 000: x11: 00000000000107d4 x10: ffff000011af2000
[  122.564990] 000: x9 : 0000000000008008 x8 : ffff000011f7a000
[  122.564992] 000: x7 : 0000000000000001 x6 : ffff000011f7a000
[  122.564995] 000: x5 : 00008007e9579000 x4 : 0000000000000002
[  122.564997] 000: x3 : 0000000000000002 x2 : 0000000000000001
[  122.564999] 000: x1 : ffff001f11e1e436 x0 : 0000000000000001
[  122.565001] 000: Call trace:
[  122.565002] 000:  __ll_sc_arch_atomic_sub_return+0x4/0x20
[  122.565004] 000:  unpin_current_cpu+0x80/0x138
[  122.565007] 000:  migrate_enable+0x190/0x428
[  122.565009] 000:  rt_spin_unlock+0x2c/0x48
[  122.565011] 000:  hrtimer_grab_expiry_lock+0x30/0x40
[  122.565012] 000:  hrtimer_cancel+0x1c/0x38
[  122.565014] 000:  kvm_timer_vcpu_load+0x78/0x3e0
[  122.565016] 000:  kvm_arch_vcpu_load+0x130/0x298
[  122.565018] 000:  kvm_sched_in+0x38/0x68
[  122.565020] 000:  finish_task_switch+0x14c/0x300
[  122.565022] 000:  __schedule+0x2b8/0x8d0
[  122.565023] 000:  schedule+0x38/0xd8
[  122.565024] 000:  kvm_vcpu_block+0xac/0x790
[  122.565026] 000:  kvm_handle_wfx+0x210/0x520
[  122.565028] 000:  handle_exit+0x134/0x1d0
[  122.565031] 000:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
[  122.565032] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  122.565034] 000:  do_vfs_ioctl+0xbc/0x910
[  122.565036] 000:  ksys_ioctl+0x78/0xa8
[  122.565038] 000:  __arm64_sys_ioctl+0x1c/0x28
[  122.565040] 000:  el0_svc_common.constprop.0+0x90/0x188
[  122.565042] 000:  el0_svc_handler+0x28/0x78
[  122.565045] 000:  el0_svc+0x8/0xc
[  122.565048] 000: Code: 88107c31 35ffffb0 d65f03c0 f9800031 (885f7c31)
[  122.565052] 000: ---[ end trace 0000000000000005 ]---
[  122.565060] 000: note: kvm-vcpu-1[1430] exited with preempt_count 1

The first problem "BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:968" seem to be related to RT-specific commit 
d628c3c56cab "hrtimer: Introduce expiry spin lock".

 From my understanding, the problem is the hrtimer_cancel() is called from a 
preempt notifier and therefore preemption will be disabled. The patch mentioned 
above will actually require hrtimer_cancel() to be called from preemptible context.

Do you have any thoughts how the problem should be addressed?

The second problem seems to hint that migrate_enable() was called on a task not 
pinned (-1). This will result to derefence an invalid value. I need to 
investigate how this can happen.

Looking at the other RT tree, I think 5.0 RT now has the same problem.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-07-23 17:58 KVM Arm64 and Linux-RT issues Julien Grall
@ 2019-07-24  8:53 ` Marc Zyngier
  2019-07-26 22:58   ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2019-07-24  8:53 UTC (permalink / raw)
  To: Julien Grall, linux-rt-users, anna-maria, bigeasy
  Cc: kvmarm, Julien Thierry, James Morse, Suzuki K Poulose

Julien,

On 23/07/2019 18:58, Julien Grall wrote:
> Hi all,
> 
> I have been playing with the latest branch of Linux RT (5.2-rt1) and notice the 
> following splat when starting a KVM guest.
> 
> [  122.336254] 003: BUG: sleeping function called from invalid context at 
> kernel/locking/rtmutex.c:968
> [  122.336263] 003: in_atomic(): 1, irqs_disabled(): 0, pid: 1430, name: kvm-vcpu-1
> [  122.336267] 003: 2 locks held by kvm-vcpu-1/1430:
> [  122.336271] 003:  #0: ffff8007c1518100 (&vcpu->mutex){+.+.}, at: 
> kvm_vcpu_ioctl+0x70/0xae0
> [  122.336287] 003:  #1: ffff8007fb08b478 
> (&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
> [  122.336299] 003: Preemption disabled at:
> [  122.336300] 003: [<ffff0000111a44e8>] schedule+0x30/0xd8
> [  122.336308] 003: CPU: 3 PID: 1430 Comm: kvm-vcpu-1 Tainted: G        W 
> 5.2.0-rt1-00008-g5bc0332820fd #88
> [  122.336311] 003: Hardware name: AMD Seattle (Rev.B0) Development Board 
> (Overdrive) (DT)
> [  122.336314] 003: Call trace:
> [  122.336315] 003:  dump_backtrace+0x0/0x130
> [  122.336319] 003:  show_stack+0x14/0x20
> [  122.336321] 003:  dump_stack+0xbc/0x104
> [  122.336324] 003:  ___might_sleep+0x198/0x238
> [  122.336327] 003:  rt_spin_lock+0x5c/0x70
> [  122.336330] 003:  hrtimer_grab_expiry_lock+0x24/0x40
> [  122.336332] 003:  hrtimer_cancel+0x1c/0x38
> [  122.336334] 003:  kvm_timer_vcpu_load+0x78/0x3e0
> [  122.336338] 003:  kvm_arch_vcpu_load+0x130/0x298
> [  122.336340] 003:  kvm_sched_in+0x38/0x68
> [  122.336342] 003:  finish_task_switch+0x14c/0x300
> [  122.336344] 003:  __schedule+0x2b8/0x8d0
> [  122.336346] 003:  schedule+0x38/0xd8
> [  122.336347] 003:  kvm_vcpu_block+0xac/0x790
> [  122.336349] 003:  kvm_handle_wfx+0x210/0x520
> [  122.336352] 003:  handle_exit+0x134/0x1d0
> [  122.336355] 003:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
> [  122.336357] 003:  kvm_vcpu_ioctl+0x3a0/0xae0
> [  122.336359] 003:  do_vfs_ioctl+0xbc/0x910
> [  122.336363] 003:  ksys_ioctl+0x78/0xa8
> [  122.336365] 003:  __arm64_sys_ioctl+0x1c/0x28
> [  122.336367] 003:  el0_svc_common.constprop.0+0x90/0x188
> [  122.336370] 003:  el0_svc_handler+0x28/0x78
> [  122.336373] 003:  el0_svc+0x8/0xc
> [  122.564216] 000: BUG: scheduling while atomic: kvm-vcpu-1/1430/0x00000002
> [  122.564221] 000: 2 locks held by kvm-vcpu-1/1430:
> [  122.564224] 000:  #0: ffff8007c1518100 (&vcpu->mutex){+.+.}, at: 
> kvm_vcpu_ioctl+0x70/0xae0
> [  122.564236] 000:  #1: ffff8007fb08b478 
> (&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
> [  122.564245] 000: Modules linked in:
> [  122.564248] 000: Preemption disabled at:
> [  122.564249] 000: [<ffff0000111a44e8>] schedule+0x30/0xd8

[...]

> The first problem "BUG: sleeping function called from invalid context at 
> kernel/locking/rtmutex.c:968" seem to be related to RT-specific commit 
> d628c3c56cab "hrtimer: Introduce expiry spin lock".
> 
>  From my understanding, the problem is the hrtimer_cancel() is called from a 
> preempt notifier and therefore preemption will be disabled. The patch mentioned 
> above will actually require hrtimer_cancel() to be called from preemptible context.
> 
> Do you have any thoughts how the problem should be addressed?

It really feels like a change in hrtimer_cancel semantics. From what I
understand, this is used to avoid racing against the softirq, but boy it
breaks things.

If this cannot be avoided, this means we can't cancel the background
timer (which is used to emulate the vcpu timer while it is blocked
waiting for an interrupt), then we must move this canceling to the point
where the vcpu is unblocked (instead of scheduled), which may have some
side effects -- I'll have a look.

But that's not the only problem: We also have hrtimers used to emulate
timers while the vcpu is running, and these timers are canceled in
kvm_timer_vcpu_put(), which is also called from a preempt notifier.
Unfortunately, I don't have a reasonable solution for that (other than
putting this hrtimer_cancel in a workqueue and start chasing the
resulting races).

Any other idea before I start tearing our timer code apart *again*?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-07-24  8:53 ` Marc Zyngier
@ 2019-07-26 22:58   ` Thomas Gleixner
  2019-07-27 11:13     ` Marc Zyngier
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2019-07-26 22:58 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Julien Grall, linux-rt-users, anna-maria, bigeasy, kvmarm,
	Julien Thierry, James Morse, Suzuki K Poulose

On Wed, 24 Jul 2019, Marc Zyngier wrote:
> On 23/07/2019 18:58, Julien Grall wrote:
> It really feels like a change in hrtimer_cancel semantics. From what I
> understand, this is used to avoid racing against the softirq, but boy it
> breaks things.
> 
> If this cannot be avoided, this means we can't cancel the background
> timer (which is used to emulate the vcpu timer while it is blocked
> waiting for an interrupt), then we must move this canceling to the point
> where the vcpu is unblocked (instead of scheduled), which may have some
> side effects -- I'll have a look.
> 
> But that's not the only problem: We also have hrtimers used to emulate
> timers while the vcpu is running, and these timers are canceled in
> kvm_timer_vcpu_put(), which is also called from a preempt notifier.
> Unfortunately, I don't have a reasonable solution for that (other than
> putting this hrtimer_cancel in a workqueue and start chasing the
> resulting races).

The fix is simple. See below. We'll add that to the next RT release. That
will take a while as I'm busy with posting RT stuff for upstream :)

Thanks,

	tglx

8<------------
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
 static void soft_timer_start(struct hrtimer *hrt, u64 ns)
 {
 	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
-		      HRTIMER_MODE_ABS);
+		      HRTIMER_MODE_ABS_HARD);
 }
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-07-26 22:58   ` Thomas Gleixner
@ 2019-07-27 11:13     ` Marc Zyngier
  2019-07-27 13:37       ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2019-07-27 11:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Julien Grall, linux-rt-users, anna-maria, bigeasy, kvmarm,
	Julien Thierry, James Morse, Suzuki K Poulose

Hi Thomas,

On Fri, 26 Jul 2019 23:58:38 +0100,
Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> On Wed, 24 Jul 2019, Marc Zyngier wrote:
> > On 23/07/2019 18:58, Julien Grall wrote:
> > It really feels like a change in hrtimer_cancel semantics. From what I
> > understand, this is used to avoid racing against the softirq, but boy it
> > breaks things.
> > 
> > If this cannot be avoided, this means we can't cancel the background
> > timer (which is used to emulate the vcpu timer while it is blocked
> > waiting for an interrupt), then we must move this canceling to the point
> > where the vcpu is unblocked (instead of scheduled), which may have some
> > side effects -- I'll have a look.
> > 
> > But that's not the only problem: We also have hrtimers used to emulate
> > timers while the vcpu is running, and these timers are canceled in
> > kvm_timer_vcpu_put(), which is also called from a preempt notifier.
> > Unfortunately, I don't have a reasonable solution for that (other than
> > putting this hrtimer_cancel in a workqueue and start chasing the
> > resulting races).
> 
> The fix is simple. See below. We'll add that to the next RT release. That
> will take a while as I'm busy with posting RT stuff for upstream :)

Ah, thanks for that! And yes, looking forward to RT upstream, it's
just about time! ;-)

> 
> Thanks,
> 
> 	tglx
> 
> 8<------------
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
>  static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>  {
>  	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
> -		      HRTIMER_MODE_ABS);
> +		      HRTIMER_MODE_ABS_HARD);
>  }
>  

That's pretty neat, and matches the patch you already have for
x86. Feel free to add my

Acked-by: Marc Zyngier <maz@kernel.org>

Thanks,

	M.

-- 
Jazz is not dead, it just smells funny.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-07-27 11:13     ` Marc Zyngier
@ 2019-07-27 13:37       ` Julien Grall
  2019-08-13 12:58         ` bigeasy
  0 siblings, 1 reply; 13+ messages in thread
From: Julien Grall @ 2019-07-27 13:37 UTC (permalink / raw)
  To: Marc Zyngier, Thomas Gleixner
  Cc: linux-rt-users, anna-maria, bigeasy, kvmarm, Julien Thierry,
	James Morse, Suzuki K Poulose

Hi,

On 7/27/19 12:13 PM, Marc Zyngier wrote:
> On Fri, 26 Jul 2019 23:58:38 +0100,
> Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> On Wed, 24 Jul 2019, Marc Zyngier wrote:
>>> On 23/07/2019 18:58, Julien Grall wrote:
>>> It really feels like a change in hrtimer_cancel semantics. From what I
>>> understand, this is used to avoid racing against the softirq, but boy it
>>> breaks things.
>>>
>>> If this cannot be avoided, this means we can't cancel the background
>>> timer (which is used to emulate the vcpu timer while it is blocked
>>> waiting for an interrupt), then we must move this canceling to the point
>>> where the vcpu is unblocked (instead of scheduled), which may have some
>>> side effects -- I'll have a look.
>>>
>>> But that's not the only problem: We also have hrtimers used to emulate
>>> timers while the vcpu is running, and these timers are canceled in
>>> kvm_timer_vcpu_put(), which is also called from a preempt notifier.
>>> Unfortunately, I don't have a reasonable solution for that (other than
>>> putting this hrtimer_cancel in a workqueue and start chasing the
>>> resulting races).
>>
>> The fix is simple. See below. We'll add that to the next RT release. That
>> will take a while as I'm busy with posting RT stuff for upstream :)
> 
> Ah, thanks for that! And yes, looking forward to RT upstream, it's
> just about time! ;-)
> 
>>
>> Thanks,
>>
>> 	tglx
>>
>> 8<------------
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
>>   static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>>   {
>>   	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
>> -		      HRTIMER_MODE_ABS);
>> +		      HRTIMER_MODE_ABS_HARD);
>>   }
>>   
> 
> That's pretty neat, and matches the patch you already have for
> x86. Feel free to add my
> 
> Acked-by: Marc Zyngier <maz@kernel.org>

I can confirm the warning now disappeared. Feel free to added my tested-by:

Tested-by: Julien Grall <julien.grall@arm.com>

Thank you both for the help!

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-07-27 13:37       ` Julien Grall
@ 2019-08-13 12:58         ` bigeasy
  2019-08-13 15:44           ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: bigeasy @ 2019-08-13 12:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: Marc Zyngier, Thomas Gleixner, linux-rt-users, anna-maria,
	kvmarm, Julien Thierry, James Morse, Suzuki K Poulose

On 2019-07-27 14:37:11 [+0100], Julien Grall wrote:
> > > 8<------------
> > > --- a/virt/kvm/arm/arch_timer.c
> > > +++ b/virt/kvm/arm/arch_timer.c
> > > @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
> > >   static void soft_timer_start(struct hrtimer *hrt, u64 ns)
> > >   {
> > >   	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
> > > -		      HRTIMER_MODE_ABS);
> > > +		      HRTIMER_MODE_ABS_HARD);
> > >   }
> > 
> > That's pretty neat, and matches the patch you already have for
> > x86. Feel free to add my
> > 
> > Acked-by: Marc Zyngier <maz@kernel.org>
> 
> I can confirm the warning now disappeared. Feel free to added my tested-by:
> 
> Tested-by: Julien Grall <julien.grall@arm.com>
> 

|kvm_hrtimer_expire()
| kvm_timer_update_irq()
|   kvm_vgic_inject_irq()
|     vgic_lazy_init()
|                if (unlikely(!vgic_initialized(kvm))) {
|                 if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
|                         return -EBUSY;
| 
|                 mutex_lock(&kvm->lock);

Is this possible path of any concern? This should throw a warning also
for !RT so probably not…

I prepared the patch below. This one could go straight to tglx's timer tree
since he has the _HARD bits there. I *think* it requires to set the bits
_HARD during _init() and _start() otherwise there is (or was) a warning…

Sebastian
8<------------

From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 13 Aug 2019 14:29:41 +0200
Subject: [PATCH] KVM: arm/arm64: Let the timer expire in hardirq context on RT

The timers are canceled from an preempt-notifier which is invoked with
disabled preemption which is not allowed on PREEMPT_RT.
The timer callback is short so in could be invoked in hard-IRQ context
on -RT.

Let the timer expire on hard-IRQ context even on -RT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 virt/kvm/arm/arch_timer.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 1be486d5d7cb4..0bfa7c5b5c890 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -80,7 +80,7 @@ static inline bool userspace_irqchip(struct kvm *kvm)
 static void soft_timer_start(struct hrtimer *hrt, u64 ns)
 {
 	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
-		      HRTIMER_MODE_ABS);
+		      HRTIMER_MODE_ABS_HARD);
 }
 
 static void soft_timer_cancel(struct hrtimer *hrt)
@@ -697,11 +697,11 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 	update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
 	ptimer->cntvoff = 0;
 
-	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
 	timer->bg_timer.function = kvm_bg_timer_expire;
 
-	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
-	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
+	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
 	vtimer->hrtimer.function = kvm_hrtimer_expire;
 	ptimer->hrtimer.function = kvm_hrtimer_expire;
 
-- 
2.23.0.rc1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-08-13 12:58         ` bigeasy
@ 2019-08-13 15:44           ` Julien Grall
  2019-08-13 16:24             ` Marc Zyngier
  0 siblings, 1 reply; 13+ messages in thread
From: Julien Grall @ 2019-08-13 15:44 UTC (permalink / raw)
  To: bigeasy
  Cc: Marc Zyngier, Thomas Gleixner, linux-rt-users, anna-maria,
	kvmarm, Julien Thierry, James Morse, Suzuki K Poulose,
	Andre Przywara

Hi Sebastian,

On 8/13/19 1:58 PM, bigeasy@linutronix.de wrote:
> On 2019-07-27 14:37:11 [+0100], Julien Grall wrote:
>>>> 8<------------
>>>> --- a/virt/kvm/arm/arch_timer.c
>>>> +++ b/virt/kvm/arm/arch_timer.c
>>>> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
>>>>    static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>>>>    {
>>>>    	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
>>>> -		      HRTIMER_MODE_ABS);
>>>> +		      HRTIMER_MODE_ABS_HARD);
>>>>    }
>>>
>>> That's pretty neat, and matches the patch you already have for
>>> x86. Feel free to add my
>>>
>>> Acked-by: Marc Zyngier <maz@kernel.org>
>>
>> I can confirm the warning now disappeared. Feel free to added my tested-by:
>>
>> Tested-by: Julien Grall <julien.grall@arm.com>
>>
> 
> |kvm_hrtimer_expire()
> | kvm_timer_update_irq()
> |   kvm_vgic_inject_irq()
> |     vgic_lazy_init()
> |                if (unlikely(!vgic_initialized(kvm))) {
> |                 if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
> |                         return -EBUSY;
> |
> |                 mutex_lock(&kvm->lock);
> 
> Is this possible path of any concern? This should throw a warning also
> for !RT so probably not…

Hmmm, theoretically yes. In practice, it looks like the hrtimer will not 
be started before kvm_vcpu_first_run_init() is called on the first run.

The function will call kvm_vgic_map_resources() which will initialize 
the vgic if not already done.

Looking around, I think this is here to cater the case where 
KVM_IRQ_LINE is called before running.

I am not yet familiar with the vgic, so I may have missed something.

> 
> I prepared the patch below. This one could go straight to tglx's timer tree
> since he has the _HARD bits there. I *think* it requires to set the bits
> _HARD during _init() and _start() otherwise there is (or was) a warning…
> 
> Sebastian
> 8<------------
> 
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Tue, 13 Aug 2019 14:29:41 +0200
> Subject: [PATCH] KVM: arm/arm64: Let the timer expire in hardirq context on RT
> 
> The timers are canceled from an preempt-notifier which is invoked with
> disabled preemption which is not allowed on PREEMPT_RT.
> The timer callback is short so in could be invoked in hard-IRQ context
> on -RT.
> 
> Let the timer expire on hard-IRQ context even on -RT.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Marc Zyngier <maz@kernel.org>
> Tested-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>   virt/kvm/arm/arch_timer.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 1be486d5d7cb4..0bfa7c5b5c890 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(struct kvm *kvm)
>   static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>   {
>   	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
> -		      HRTIMER_MODE_ABS);
> +		      HRTIMER_MODE_ABS_HARD);
>   }
>   
>   static void soft_timer_cancel(struct hrtimer *hrt)
> @@ -697,11 +697,11 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>   	update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
>   	ptimer->cntvoff = 0;
>   
> -	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> +	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
>   	timer->bg_timer.function = kvm_bg_timer_expire;
>   
> -	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> -	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> +	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
> +	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
>   	vtimer->hrtimer.function = kvm_hrtimer_expire;
>   	ptimer->hrtimer.function = kvm_hrtimer_expire;
>   
> 

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-08-13 15:44           ` Julien Grall
@ 2019-08-13 16:24             ` Marc Zyngier
  2019-08-16 15:18               ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2019-08-13 16:24 UTC (permalink / raw)
  To: Julien Grall, bigeasy
  Cc: Thomas Gleixner, linux-rt-users, anna-maria, kvmarm,
	Julien Thierry, James Morse, Suzuki K Poulose, Andre Przywara

On Tue, 13 Aug 2019 16:44:21 +0100,
Julien Grall <julien.grall@arm.com> wrote:
> 
> Hi Sebastian,
> 
> On 8/13/19 1:58 PM, bigeasy@linutronix.de wrote:
> > On 2019-07-27 14:37:11 [+0100], Julien Grall wrote:
> >>>> 8<------------
> >>>> --- a/virt/kvm/arm/arch_timer.c
> >>>> +++ b/virt/kvm/arm/arch_timer.c
> >>>> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
> >>>>    static void soft_timer_start(struct hrtimer *hrt, u64 ns)
> >>>>    {
> >>>>    	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
> >>>> -		      HRTIMER_MODE_ABS);
> >>>> +		      HRTIMER_MODE_ABS_HARD);
> >>>>    }
> >>> 
> >>> That's pretty neat, and matches the patch you already have for
> >>> x86. Feel free to add my
> >>> 
> >>> Acked-by: Marc Zyngier <maz@kernel.org>
> >> 
> >> I can confirm the warning now disappeared. Feel free to added my tested-by:
> >> 
> >> Tested-by: Julien Grall <julien.grall@arm.com>
> >> 
> > 
> > |kvm_hrtimer_expire()
> > | kvm_timer_update_irq()
> > |   kvm_vgic_inject_irq()
> > |     vgic_lazy_init()
> > |                if (unlikely(!vgic_initialized(kvm))) {
> > |                 if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
> > |                         return -EBUSY;
> > |
> > |                 mutex_lock(&kvm->lock);
> > 
> > Is this possible path of any concern? This should throw a warning also
> > for !RT so probably not…
> 
> Hmmm, theoretically yes. In practice, it looks like the hrtimer will
> not be started before kvm_vcpu_first_run_init() is called on the first
> run.

Exactly. Even if you restore the timer in a "firing" configuration,
you'll have to execute the vgic init before any background timer gets
programmed, let alone expired.

Yes, the interface is terrible.

> The function will call kvm_vgic_map_resources() which will initialize
> the vgic if not already done.
> 
> Looking around, I think this is here to cater the case where
> KVM_IRQ_LINE is called before running.
> 
> I am not yet familiar with the vgic, so I may have missed something.
> 
> > 
> > I prepared the patch below. This one could go straight to tglx's timer tree
> > since he has the _HARD bits there. I *think* it requires to set the bits
> > _HARD during _init() and _start() otherwise there is (or was) a warning…
> > 
> > Sebastian
> > 8<------------
> > 
> > From: Thomas Gleixner <tglx@linutronix.de>
> > Date: Tue, 13 Aug 2019 14:29:41 +0200
> > Subject: [PATCH] KVM: arm/arm64: Let the timer expire in hardirq context on RT
> > 
> > The timers are canceled from an preempt-notifier which is invoked with
> > disabled preemption which is not allowed on PREEMPT_RT.
> > The timer callback is short so in could be invoked in hard-IRQ context
> > on -RT.
> > 
> > Let the timer expire on hard-IRQ context even on -RT.
> > 
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Acked-by: Marc Zyngier <maz@kernel.org>
> > Tested-by: Julien Grall <julien.grall@arm.com>
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >   virt/kvm/arm/arch_timer.c | 8 ++++----
> >   1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> > index 1be486d5d7cb4..0bfa7c5b5c890 100644
> > --- a/virt/kvm/arm/arch_timer.c
> > +++ b/virt/kvm/arm/arch_timer.c
> > @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(struct kvm *kvm)
> >   static void soft_timer_start(struct hrtimer *hrt, u64 ns)
> >   {
> >   	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
> > -		      HRTIMER_MODE_ABS);
> > +		      HRTIMER_MODE_ABS_HARD);
> >   }
> >     static void soft_timer_cancel(struct hrtimer *hrt)
> > @@ -697,11 +697,11 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
> >   	update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
> >   	ptimer->cntvoff = 0;
> >   -	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC,
> > HRTIMER_MODE_ABS);
> > +	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
> >   	timer->bg_timer.function = kvm_bg_timer_expire;
> >   -	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC,
> > HRTIMER_MODE_ABS);
> > -	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> > +	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
> > +	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
> >   	vtimer->hrtimer.function = kvm_hrtimer_expire;
> >   	ptimer->hrtimer.function = kvm_hrtimer_expire;
> >   

Patch looks fine, please add it to the pile of RT stuff! ;-)

Thanks,

	M.

-- 
Jazz is not dead, it just smells funny.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-08-13 16:24             ` Marc Zyngier
@ 2019-08-16 15:18               ` Julien Grall
  2019-08-16 15:23                 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 13+ messages in thread
From: Julien Grall @ 2019-08-16 15:18 UTC (permalink / raw)
  To: Marc Zyngier, bigeasy
  Cc: Thomas Gleixner, linux-rt-users, anna-maria, kvmarm,
	Julien Thierry, James Morse, Suzuki K Poulose, Andre Przywara

Hi all,

On 13/08/2019 17:24, Marc Zyngier wrote:
> On Tue, 13 Aug 2019 16:44:21 +0100,
> Julien Grall <julien.grall@arm.com> wrote:
>>
>> Hi Sebastian,
>>
>> On 8/13/19 1:58 PM, bigeasy@linutronix.de wrote:
>>> On 2019-07-27 14:37:11 [+0100], Julien Grall wrote:
>>>>>> 8<------------
>>>>>> --- a/virt/kvm/arm/arch_timer.c
>>>>>> +++ b/virt/kvm/arm/arch_timer.c
>>>>>> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
>>>>>>     static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>>>>>>     {
>>>>>>     	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
>>>>>> -		      HRTIMER_MODE_ABS);
>>>>>> +		      HRTIMER_MODE_ABS_HARD);
>>>>>>     }
>>>>>
>>>>> That's pretty neat, and matches the patch you already have for
>>>>> x86. Feel free to add my
>>>>>
>>>>> Acked-by: Marc Zyngier <maz@kernel.org>
>>>>
>>>> I can confirm the warning now disappeared. Feel free to added my tested-by:
>>>>
>>>> Tested-by: Julien Grall <julien.grall@arm.com>
>>>>
>>>
>>> |kvm_hrtimer_expire()
>>> | kvm_timer_update_irq()
>>> |   kvm_vgic_inject_irq()
>>> |     vgic_lazy_init()
>>> |                if (unlikely(!vgic_initialized(kvm))) {
>>> |                 if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
>>> |                         return -EBUSY;
>>> |
>>> |                 mutex_lock(&kvm->lock);
>>>
>>> Is this possible path of any concern? This should throw a warning also
>>> for !RT so probably not…
>>
>> Hmmm, theoretically yes. In practice, it looks like the hrtimer will
>> not be started before kvm_vcpu_first_run_init() is called on the first
>> run.
> 
> Exactly. Even if you restore the timer in a "firing" configuration,
> you'll have to execute the vgic init before any background timer gets
> programmed, let alone expired.
> 
> Yes, the interface is terrible.
> 
>> The function will call kvm_vgic_map_resources() which will initialize
>> the vgic if not already done.
>>
>> Looking around, I think this is here to cater the case where
>> KVM_IRQ_LINE is called before running.
>>
>> I am not yet familiar with the vgic, so I may have missed something.
>>
>>>
>>> I prepared the patch below. This one could go straight to tglx's timer tree
>>> since he has the _HARD bits there. I *think* it requires to set the bits
>>> _HARD during _init() and _start() otherwise there is (or was) a warning…
>>>
>>> Sebastian
>>> 8<------------
>>>
>>> From: Thomas Gleixner <tglx@linutronix.de>
>>> Date: Tue, 13 Aug 2019 14:29:41 +0200
>>> Subject: [PATCH] KVM: arm/arm64: Let the timer expire in hardirq context on RT
>>>
>>> The timers are canceled from an preempt-notifier which is invoked with
>>> disabled preemption which is not allowed on PREEMPT_RT.
>>> The timer callback is short so in could be invoked in hard-IRQ context
>>> on -RT.
>>>
>>> Let the timer expire on hard-IRQ context even on -RT.
>>>
>>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>>> Acked-by: Marc Zyngier <maz@kernel.org>
>>> Tested-by: Julien Grall <julien.grall@arm.com>
>>> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>>> ---
>>>    virt/kvm/arm/arch_timer.c | 8 ++++----
>>>    1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>> index 1be486d5d7cb4..0bfa7c5b5c890 100644
>>> --- a/virt/kvm/arm/arch_timer.c
>>> +++ b/virt/kvm/arm/arch_timer.c
>>> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(struct kvm *kvm)
>>>    static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>>>    {
>>>    	hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
>>> -		      HRTIMER_MODE_ABS);
>>> +		      HRTIMER_MODE_ABS_HARD);
>>>    }
>>>      static void soft_timer_cancel(struct hrtimer *hrt)
>>> @@ -697,11 +697,11 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>>>    	update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
>>>    	ptimer->cntvoff = 0;
>>>    -	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC,
>>> HRTIMER_MODE_ABS);
>>> +	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
>>>    	timer->bg_timer.function = kvm_bg_timer_expire;
>>>    -	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC,
>>> HRTIMER_MODE_ABS);
>>> -	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
>>> +	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
>>> +	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
>>>    	vtimer->hrtimer.function = kvm_hrtimer_expire;
>>>    	ptimer->hrtimer.function = kvm_hrtimer_expire;
>>>    
> 
> Patch looks fine, please add it to the pile of RT stuff! ;-)

Sadly, I managed to hit the same BUG_ON() today with this patch
applied on top v5.2-rt1-rebase. :/ Although, it is more difficult
to hit than previously.

[  157.449545] 000: BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
[  157.449569] 000: in_atomic(): 1, irqs_disabled(): 0, pid: 990, name: kvm-vcpu-1
[  157.449579] 000: 2 locks held by kvm-vcpu-1/990:
[  157.449592] 000:  #0: 00000000c2fc8217 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x70/0xae0
[  157.449638] 000:  #1: 0000000096863801 (&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
[  157.449677] 000: Preemption disabled at:
[  157.449679] 000: [<ffff0000111a4538>] schedule+0x30/0xd8
[  157.449702] 000: CPU: 0 PID: 990 Comm: kvm-vcpu-1 Tainted: G        W 5.2.0-rt1-00001-gd368139e892f #104
[  157.449712] 000: Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jan 23 2017
[  157.449718] 000: Call trace:
[  157.449722] 000:  dump_backtrace+0x0/0x130
[  157.449730] 000:  show_stack+0x14/0x20
[  157.449738] 000:  dump_stack+0xbc/0x104
[  157.449747] 000:  ___might_sleep+0x198/0x238
[  157.449756] 000:  rt_spin_lock+0x5c/0x70
[  157.449765] 000:  hrtimer_grab_expiry_lock+0x24/0x40
[  157.449773] 000:  hrtimer_cancel+0x1c/0x38
[  157.449780] 000:  kvm_timer_vcpu_load+0x78/0x3e0
[  157.449791] 000:  kvm_arch_vcpu_load+0x130/0x298
[  157.449800] 000:  kvm_sched_in+0x38/0x68
[  157.449808] 000:  finish_task_switch+0x14c/0x300
[  157.449816] 000:  __schedule+0x2b8/0x8d0
[  157.449826] 000:  schedule+0x38/0xd8
[  157.449833] 000:  kvm_vcpu_block+0xac/0x790
[  157.449841] 000:  kvm_handle_wfx+0x210/0x520
[  157.449852] 000:  handle_exit+0x134/0x1d0
[  157.449861] 000:  kvm_arch_vcpu_ioctl_run+0x650/0xbb8
[  157.449869] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  157.449877] 000:  do_vfs_ioctl+0xbc/0x910
[  157.449887] 000:  ksys_ioctl+0x78/0xa8
[  157.449896] 000:  __arm64_sys_ioctl+0x1c/0x28
[  157.449904] 000:  el0_svc_common.constprop.0+0x90/0x188
[  157.449915] 000:  el0_svc_handler+0x28/0x78
[  157.449925] 000:  el0_svc+0x8/0xc
[  173.521497] 002: hrtimer: interrupt took 19281 ns

I will do some debug and see what I can find.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-08-16 15:18               ` Julien Grall
@ 2019-08-16 15:23                 ` Sebastian Andrzej Siewior
  2019-08-16 16:32                   ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-08-16 15:23 UTC (permalink / raw)
  To: Julien Grall
  Cc: Marc Zyngier, Thomas Gleixner, linux-rt-users, anna-maria,
	kvmarm, Julien Thierry, James Morse, Suzuki K Poulose,
	Andre Przywara

On 2019-08-16 16:18:20 [+0100], Julien Grall wrote:
> Sadly, I managed to hit the same BUG_ON() today with this patch
> applied on top v5.2-rt1-rebase. :/ Although, it is more difficult
> to hit than previously.
> 
> [  157.449545] 000: BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
> [  157.449569] 000: in_atomic(): 1, irqs_disabled(): 0, pid: 990, name: kvm-vcpu-1
> [  157.449579] 000: 2 locks held by kvm-vcpu-1/990:
> [  157.449592] 000:  #0: 00000000c2fc8217 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x70/0xae0
> [  157.449638] 000:  #1: 0000000096863801 (&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
> [  157.449677] 000: Preemption disabled at:
> [  157.449679] 000: [<ffff0000111a4538>] schedule+0x30/0xd8
> [  157.449702] 000: CPU: 0 PID: 990 Comm: kvm-vcpu-1 Tainted: G        W 5.2.0-rt1-00001-gd368139e892f #104
> [  157.449712] 000: Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jan 23 2017
> [  157.449718] 000: Call trace:
> [  157.449722] 000:  dump_backtrace+0x0/0x130
> [  157.449730] 000:  show_stack+0x14/0x20
> [  157.449738] 000:  dump_stack+0xbc/0x104
> [  157.449747] 000:  ___might_sleep+0x198/0x238
> [  157.449756] 000:  rt_spin_lock+0x5c/0x70
> [  157.449765] 000:  hrtimer_grab_expiry_lock+0x24/0x40
> [  157.449773] 000:  hrtimer_cancel+0x1c/0x38
> [  157.449780] 000:  kvm_timer_vcpu_load+0x78/0x3e0
> I will do some debug and see what I can find.

which timer is this? Is there another one?
In the meantime I do a release with that patch included.

> 
> Cheers,

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-08-16 15:23                 ` Sebastian Andrzej Siewior
@ 2019-08-16 16:32                   ` Julien Grall
  2019-08-19  7:33                     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 13+ messages in thread
From: Julien Grall @ 2019-08-16 16:32 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Marc Zyngier, Thomas Gleixner, linux-rt-users, anna-maria,
	kvmarm, Julien Thierry, James Morse, Suzuki K Poulose,
	Andre Przywara

Hi Sebastian,

On 16/08/2019 16:23, Sebastian Andrzej Siewior wrote:
> On 2019-08-16 16:18:20 [+0100], Julien Grall wrote:
>> Sadly, I managed to hit the same BUG_ON() today with this patch
>> applied on top v5.2-rt1-rebase. :/ Although, it is more difficult
>> to hit than previously.
>>
>> [  157.449545] 000: BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
>> [  157.449569] 000: in_atomic(): 1, irqs_disabled(): 0, pid: 990, name: kvm-vcpu-1
>> [  157.449579] 000: 2 locks held by kvm-vcpu-1/990:
>> [  157.449592] 000:  #0: 00000000c2fc8217 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x70/0xae0
>> [  157.449638] 000:  #1: 0000000096863801 (&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
>> [  157.449677] 000: Preemption disabled at:
>> [  157.449679] 000: [<ffff0000111a4538>] schedule+0x30/0xd8
>> [  157.449702] 000: CPU: 0 PID: 990 Comm: kvm-vcpu-1 Tainted: G        W 5.2.0-rt1-00001-gd368139e892f #104
>> [  157.449712] 000: Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jan 23 2017
>> [  157.449718] 000: Call trace:
>> [  157.449722] 000:  dump_backtrace+0x0/0x130
>> [  157.449730] 000:  show_stack+0x14/0x20
>> [  157.449738] 000:  dump_stack+0xbc/0x104
>> [  157.449747] 000:  ___might_sleep+0x198/0x238
>> [  157.449756] 000:  rt_spin_lock+0x5c/0x70
>> [  157.449765] 000:  hrtimer_grab_expiry_lock+0x24/0x40
>> [  157.449773] 000:  hrtimer_cancel+0x1c/0x38
>> [  157.449780] 000:  kvm_timer_vcpu_load+0x78/0x3e0
> 
> …
>> I will do some debug and see what I can find.
> 
> which timer is this? Is there another one?

It looks like the timer is the background timer (bg_timer).
Although, the BUG() seems to happen with the other ones
but less often. All of them have already been converted.

Interestingly, hrtimer_grab_expiry_lock may be called by
timer even if is_soft (I assume this means softirq will
not be used) is 0.

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 7d7db8802131..fe05e553dea2 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -934,6 +934,9 @@ void hrtimer_grab_expiry_lock(const struct hrtimer *timer)
 {
        struct hrtimer_clock_base *base = timer->base;
 
+       WARN(!preemptible(), "is_soft %u base %p base->cpu_base %p\n",
+            timer->is_soft, base, base ? base->cpu_base : NULL);
+
        if (base && base->cpu_base) {
                spin_lock(&base->cpu_base->softirq_expiry_lock);
                spin_unlock(&base->cpu_base->softirq_expiry_lock);

[  576.291886] 004: is_soft 0 base ffff80097eed44c0 base->cpu_base ffff80097eed4380

Because the hrtimer is started when scheduling out the
vCPU and canceled when the scheduling in, there is no
guarantee the hrtimer will be running on the same pCPU.
So I think the following can happen:

CPU0                                          |  CPU1
                                              |
                                              |  hrtimer_interrupt()
                                              |    raw_spin_lock_irqsave(&cpu_save->lock)
 hrtimer_cancel()                             |      __run_hrtimer_run_queues()
   hrtimer_try_to_cancel()                    |      __run_hrtimer()
     lock_hrtimer_base()                      |        base->running = timer;
                                              |        raw_spin_unlock_irqrestore(&cpu_save->lock)
       raw_spin_lock_irqsave(cpu_base->lock)  |        fn(timer);
     hrtimer_callback_running()               |
        
hrtimer_callback_running() will be returning true as the callback is
running somewhere else. This means hrtimer_try_to_cancel()
would return -1. Therefore hrtimer_grab_expiry_lock() would
be called.

Did I miss anything?

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-08-16 16:32                   ` Julien Grall
@ 2019-08-19  7:33                     ` Sebastian Andrzej Siewior
  2019-08-20 14:18                       ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-08-19  7:33 UTC (permalink / raw)
  To: Julien Grall
  Cc: Marc Zyngier, Thomas Gleixner, linux-rt-users, anna-maria,
	kvmarm, Julien Thierry, James Morse, Suzuki K Poulose,
	Andre Przywara

On 2019-08-16 17:32:38 [+0100], Julien Grall wrote:
> Hi Sebastian,
Hi Julien,

> hrtimer_callback_running() will be returning true as the callback is
> running somewhere else. This means hrtimer_try_to_cancel()
> would return -1. Therefore hrtimer_grab_expiry_lock() would
> be called.
> 
> Did I miss anything?

nope, you are right. I assumed that we had code to deal with this but
didn't find it…

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 7d7db88021311..40d83c709503e 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -934,7 +934,7 @@ void hrtimer_grab_expiry_lock(const struct hrtimer *timer)
 {
 	struct hrtimer_clock_base *base = timer->base;
 
-	if (base && base->cpu_base) {
+	if (base && base->cpu_base && base->index < MASK_SHIFT) {
 		spin_lock(&base->cpu_base->softirq_expiry_lock);
 		spin_unlock(&base->cpu_base->softirq_expiry_lock);
 	}

This should deal with it.

> Cheers,

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: KVM Arm64 and Linux-RT issues
  2019-08-19  7:33                     ` Sebastian Andrzej Siewior
@ 2019-08-20 14:18                       ` Julien Grall
  0 siblings, 0 replies; 13+ messages in thread
From: Julien Grall @ 2019-08-20 14:18 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Marc Zyngier, Thomas Gleixner, linux-rt-users, anna-maria,
	kvmarm, Julien Thierry, James Morse, Suzuki K Poulose,
	Andre Przywara

Hi Sebastian,

On 19/08/2019 08:33, Sebastian Andrzej Siewior wrote:
> On 2019-08-16 17:32:38 [+0100], Julien Grall wrote:
>> Hi Sebastian,
> Hi Julien,
> 
>> hrtimer_callback_running() will be returning true as the callback is
>> running somewhere else. This means hrtimer_try_to_cancel()
>> would return -1. Therefore hrtimer_grab_expiry_lock() would
>> be called.
>>
>> Did I miss anything?
> 
> nope, you are right. I assumed that we had code to deal with this but
> didn't find it…
> 
> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> index 7d7db88021311..40d83c709503e 100644
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -934,7 +934,7 @@ void hrtimer_grab_expiry_lock(const struct hrtimer *timer)
>   {
>   	struct hrtimer_clock_base *base = timer->base;
>   
> -	if (base && base->cpu_base) {
> +	if (base && base->cpu_base && base->index < MASK_SHIFT) {

Lower indexes are used for hard interrupt. So this would need to be base->index 
 >= MASK_SHIFT.

But I was wondering whether checking timer->is_soft would make the code more 
readable?

While investigation how this is meant to work, I noticed a few others things.

timer->base could potentially change under our feet at any point of time (we 
don't hold any lock). So it would be valid to have base == migration_base.

migration_cpu_base does not have softirq_expiry_lock initialized. So we would 
end up to use an uninitialized lock. Note that migration_base->index is always 
0, so the check base->index > MASK_SHIFT would hide it.

Alternatively, we could initialize the spin lock for migration_cpu_base avoiding 
to rely on side effect of the check.

Another potential issue is the compiler is free to reload timer->base at any 
time. So I think we want an ACCESS_ONCE(...).

Lastly timer->base cannot be NULL. From the comment on top of 
migration_cpu_base, timer->base->cpu_base will as well not be NULL.

So I think the function can be reworked as:

void hrtimer_grab_expirty_lock(const struct hrtimer *timer)
{
         struct hrtimer_clock_base *base = ACCESS_ONCE(timer->base);

         if (!timer->is_soft && base != migration_base ) {
           spin_lock();
           spin_unlock();
         }
}


>   		spin_lock(&base->cpu_base->softirq_expiry_lock);
>   		spin_unlock(&base->cpu_base->softirq_expiry_lock);
>   	}
> 
> This should deal with it.
> 
>> Cheers,
> 
> Sebastian
> 

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, back to index

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-23 17:58 KVM Arm64 and Linux-RT issues Julien Grall
2019-07-24  8:53 ` Marc Zyngier
2019-07-26 22:58   ` Thomas Gleixner
2019-07-27 11:13     ` Marc Zyngier
2019-07-27 13:37       ` Julien Grall
2019-08-13 12:58         ` bigeasy
2019-08-13 15:44           ` Julien Grall
2019-08-13 16:24             ` Marc Zyngier
2019-08-16 15:18               ` Julien Grall
2019-08-16 15:23                 ` Sebastian Andrzej Siewior
2019-08-16 16:32                   ` Julien Grall
2019-08-19  7:33                     ` Sebastian Andrzej Siewior
2019-08-20 14:18                       ` Julien Grall

Linux-rt-users archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-rt-users/0 linux-rt-users/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-rt-users linux-rt-users/ https://lore.kernel.org/linux-rt-users \
		linux-rt-users@vger.kernel.org linux-rt-users@archiver.kernel.org
	public-inbox-index linux-rt-users


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-rt-users


AGPL code for this site: git clone https://public-inbox.org/ public-inbox