* Re: [Question] Softlockup when sending IPI to other CPUs
[not found] <95C141B25E7AB14BA042DCCC556C0E6501620A47@dggeml529-mbx.china.huawei.com>
@ 2019-01-19 23:58 ` Will Deacon
2019-01-21 14:21 ` Catalin Marinas
0 siblings, 1 reply; 17+ messages in thread
From: Will Deacon @ 2019-01-19 23:58 UTC (permalink / raw)
To: chenwandun
Cc: catalin.marinas, Wangkefeng (Kevin),
linux-kernel, linux-arm-kernel, anshuman.khandual
[+Anshuman]
On Thu, Jan 17, 2019 at 07:42:44AM +0000, chenwandun wrote:
> Recently, I ran some tests on linux-4.19 and hit a softlockup issue.
>
> I found that some CPUs take the spinlock in the __split_huge_pmd
> function and then send an IPI to the other CPUs and wait for their
> response, while several other CPUs that entered __split_huge_pmd and
> want the same spinlock are stuck in queued_spin_lock_slowpath.
>
> Because the IPI gets no response for a long time, this results in a
> softlockup.
>
> The IPI is sent as of commit
> 3b8c9f1cdfc506e94e992ae66b68bbe416f89610, which is meant to send an
> IPI to each CPU after invalidating the I-cache for kernel mappings. In
> this case, after modifying the pmd, it sends an IPI to the other CPUs
> to sync the memory mappings.
>
> There is no stable test case that reproduces the result; the test
> procedure is hard to repeat.
>
> The environment is arm64 with 64 CPUs. Excluding idle CPUs, there are
> 6 kinds of call stacks in total.
This looks like another lockup that would be solved if we deferred our
I-cache invalidation when mapping user-executable pages, and instead
performed the invalidation off the back of a UXN permission fault, where we
could avoid holding any locks.
Will
> The callstack of 3 CPUs:
>
>
>
> PID: 37265 TASK: ffff801626006c80 CPU: 9 COMMAND: "QThread"
>
> #0 [ffff00000804bb20] __crash_kexec at ffff000008198c60
>
> #1 [ffff00000804bcb0] panic at ffff0000080e466c
>
> #2 [ffff00000804bd90] watchdog_timer_fn at ffff0000081cdc90
>
> #3 [ffff00000804be10] __hrtimer_run_queues at ffff000008173c50
>
> #4 [ffff00000804be90] hrtimer_interrupt at ffff000008174a98
>
> #5 [ffff00000804bf00] arch_timer_handler_phys at ffff000008bd591c
>
> #6 [ffff00000804bf20] handle_percpu_devid_irq at ffff000008155c2c
>
> #7 [ffff00000804bf60] generic_handle_irq at ffff00000814edd8
>
> #8 [ffff00000804bf80] __handle_domain_irq at ffff00000814f6d4
>
> #9 [ffff00000804bfc0] gic_handle_irq at ffff0000080816cc
>
> --- <IRQ stack> ---
>
> #10 [ffff000034b039c0] el1_irq at ffff0000080833ec
>
> PC: ffff00000818d228 [smp_call_function_many+824]
>
> LR: ffff00000818d1e8 [smp_call_function_many+760]
>
> SP: ffff000034b039d0 PSTATE: 80000005
>
> X29: ffff000034b039d0 X28: ffff000009592184 X27: ffff801ffbf52f08
>
> X26: 0000000000001000 X25: 0000000000000000 X24: ffff00000818c890
>
> X23: 0000000000000001 X22: ffff0000095897b8 X21: ffff801ffbf52d08
>
> X20: ffff801ffbf52d00 X19: ffff000009592184 X18: 0000000000000000
>
> X17: 0000000000000000 X16: 0000000000000000 X15: 0000000000000000
>
> X14: 0000000000000000 X13: 0000000000100077 X12: 00280018f7e0bb53
>
> X11: 0000fffccb000000 X10: ffff7e005fb39940 X9: 0000000000000000
>
> X8: ffff801ffbf53108 X7: 0000000000000000 X6: 000000000000003f
>
> X5: 0000000000000000 X4: fffffffffffffdff X3: 0000000000000000
>
> X2: ffff80affbea42f8 X1: 0000000000000003 X0: 0000000000000038
>
> #11 [ffff000034b039d0] smp_call_function_many at ffff00000818d224
>
> #12 [ffff000034b03a40] kick_all_cpus_sync at ffff00000818d36c
>
> #13 [ffff000034b03a50] sync_icache_aliases at ffff0000080a5488
>
> #14 [ffff000034b03a70] __sync_icache_dcache at ffff0000080a5548
>
> #15 [ffff000034b03aa0] __split_huge_pmd_locked at ffff0000082e093c
>
> #16 [ffff000034b03b70] __split_huge_pmd at ffff0000082e3578
>
> #17 [ffff000034b03bc0] unmap_page_range at ffff00000829890c
>
> #18 [ffff000034b03c70] unmap_single_vma at ffff000008298c50
>
> #19 [ffff000034b03cb0] zap_page_range at ffff000008299068
>
> #20 [ffff000034b03d90] __arm64_sys_madvise at ffff0000082b19f4
>
> #21 [ffff000034b03e60] el0_svc_common at ffff0000080979dc
>
> #22 [ffff000034b03ea0] el0_svc_handler at ffff000008097a9c
>
> #23 [ffff000034b03ff0] el0_svc at ffff000008084044
>
> PC: 0000ffffb974f018 LR: 0000ffffbaca2520 SP: 0000fffccaffe3c0
>
> X29: 0000fffccaffe3c0 X28: 0000fffccaffebe0 X27: 000000000000000f
>
> X26: 0000fffcca800000 X25: 00000000007ff3c0 X24: 0000fffccb7fe1e8
>
> X23: 0000fffccb7fe157 X22: 0000fffccb7fe156 X21: 0000ffffbacc2000
>
> X20: 0000fffccb7fe1e8 X19: 0000fffccaffebe0 X18: 0000000000000000
>
> X17: 0000ffffb974f010 X16: 0000ffffbacc31d8 X15: 0000000000000292
>
> X14: 0000fffccafff2d0 X13: 000000000000005f X12: 0000ffffb97d2000
>
> X11: 0000000000000007 X10: 0000ffffbacf4948 X9: 0000fffcc0000b10
>
> X8: 00000000000000e9 X7: 0000000000000000 X6: 0000000000000000
>
> X5: 00000000ffffffff X4: 0000000000000000 X3: 0000fffcca800000
>
> X2: 0000000000000004 X1: 00000000007de000 X0: 0000fffcca800000
>
> ORIG_X0: 0000fffcca800000 SYSCALLNO: e9 PSTATE: 20000000
>
>
>
> The callstack of 7 CPUs:
>
>
>
> PID: 37299 TASK: ffff801b522d8000 CPU: 0 COMMAND: "QThread"
>
> #0 [ffff000008003d90] crash_save_cpu at ffff00000819904c
>
> #1 [ffff000008003f50] handle_IPI at ffff000008096ce4
>
> #2 [ffff000008003fc0] gic_handle_irq at ffff00000808179c
>
> --- <IRQ stack> ---
>
> #3 [ffff000034e43b40] el1_irq at ffff0000080833ec
>
> PC: ffff0000081447b0 [queued_spin_lock_slowpath+416]
>
> LR: ffff0000082e35dc [__split_huge_pmd+380]
>
> SP: ffff000034e43b50 PSTATE: 00000005
>
> X29: ffff000034e43b50 X28: ffff000009313000 X27: ffff801930554a58
>
> X26: 0000fffca97de000 X25: 0000fffca9800000 X24: 0000000000000000
>
> X23: 0000000000000000 X22: ffff801930554a58 X21: ffff801f9c625b00
>
> X20: 0000000000280101 X19: ffff801c03fd805c X18: 0000000000000000
>
> X17: 0000000000000000 X16: 0000000000000000 X15: 0000000000000000
>
> X14: 0000000000000000 X13: 0000000000000000 X12: 0000000000000000
>
> X11: 0000000000000000 X10: 0000000000000000 X9: 0000000000000000
>
> X8: ffff801ffbe687c0 X7: 0000000000000000 X6: ffff801ffbed07c0
>
> X5: 0000000000040000 X4: 0000000000000001 X3: 0000000000000000
>
> X2: ffff801ffbe687c0 X1: ffff00000956f000 X0: 00000000000c0001
>
> #4 [ffff000034e43b50] queued_spin_lock_slowpath at ffff0000081447ac
>
> #5 [ffff000034e43b70] __split_huge_pmd at ffff0000082e35d8
>
> #6 [ffff000034e43bc0] unmap_page_range at ffff00000829890c
>
> #7 [ffff000034e43c70] unmap_single_vma at ffff000008298c50
>
> #8 [ffff000034e43cb0] zap_page_range at ffff000008299068
>
> #9 [ffff000034e43d90] __arm64_sys_madvise at ffff0000082b19f4
>
> #10 [ffff000034e43e60] el0_svc_common at ffff0000080979dc
>
> #11 [ffff000034e43ea0] el0_svc_handler at ffff000008097a9c
>
> #12 [ffff000034e43ff0] el0_svc at ffff000008084044
>
> PC: 0000ffffb974f018 LR: 0000ffffbaca2520 SP: 0000fffca97fe3c0
>
> X29: 0000fffca97fe3c0 X28: 0000fffca97febe0 X27: 000000000000000f
>
> X26: 0000fffca9000000 X25: 00000000007ff3c0 X24: 0000fffca9ffe1e8
>
> X23: 0000fffca9ffe157 X22: 0000fffca9ffe156 X21: 0000ffffbacc2000
>
> X20: 0000fffca9ffe1e8 X19: 0000fffca97febe0 X18: 0000000000000000
>
> X17: 0000ffffb974f010 X16: 0000ffffbacc31d8 X15: 0000000000000292
>
> X14: 0000000000000027 X13: ffffffffffffffff X12: 0000fffca8ffebe0
>
> X11: 00000000003d0f00 X10: 0000ffffbacf4948 X9: 0000fffc88000b10
>
> X8: 00000000000000e9 X7: 0000000000000000 X6: 0000000000000000
>
> X5: 00000000ffffffff X4: 0000000000000000 X3: 0000fffca9000000
>
> X2: 0000000000000004 X1: 00000000007de000 X0: 0000fffca9000000
>
> ORIG_X0: 0000fffca9000000 SYSCALLNO: e9 PSTATE: 20000000
>
>
>
> The callstack of 6 CPUs:
>
>
>
> PID: 38541 TASK: ffff8022a1a5ae80 CPU: 16 COMMAND: "QThread"
>
> #0 [ffff00000a0abd90] crash_save_cpu at ffff00000819904c
>
> #1 [ffff00000a0abf50] handle_IPI at ffff000008096ce4
>
> #2 [ffff00000a0abfc0] gic_handle_irq at ffff00000808179c
>
> --- <IRQ stack> ---
>
> #3 [ffff00003741ba00] el1_irq at ffff0000080833ec
>
> PC: ffff00000818d228 [smp_call_function_many+824]
>
> LR: ffff00000818d1e8 [smp_call_function_many+760]
>
> SP: ffff00003741ba10 PSTATE: 80000005
>
> X29: ffff00003741ba10 X28: ffff000009592184 X27: ffff802ffbe69f08
>
> X26: 0000000000001000 X25: 0000000000000000 X24: ffff00000818c890
>
> X23: 0000000000000001 X22: ffff0000095897b8 X21: ffff802ffbe69d08
>
> X20: ffff802ffbe69d00 X19: ffff000009592184 X18: 0000000000000000
>
> X17: 0000000000000000 X16: 0000000000000000 X15: 0000000000000000
>
> X14: ffff8023173a8d90 X13: 0000000000000040 X12: 0000000000000228
>
> X11: 0000000000000000 X10: 0000000000000000 X9: 0000000000000000
>
> X8: ffff802ffbe6a108 X7: 0000000000000000 X6: 000000000000003f
>
> X5: 0000000000000000 X4: fffffffffffeffff X3: 0000000000000000
>
> X2: ffff801ffbf573d8 X1: 0000000000000003 X0: 0000000000000009
>
> #4 [ffff00003741ba10] smp_call_function_many at ffff00000818d224
>
> #5 [ffff00003741ba80] kick_all_cpus_sync at ffff00000818d36c
>
> #6 [ffff00003741ba90] sync_icache_aliases at ffff0000080a5488
>
> #7 [ffff00003741bab0] __sync_icache_dcache at ffff0000080a5548
>
> #8 [ffff00003741bae0] alloc_set_pte at ffff00000829a904
>
> #9 [ffff00003741bb30] filemap_map_pages at ffff0000082533a8
>
> #10 [ffff00003741bbc0] __handle_mm_fault at ffff00000829d174
>
> #11 [ffff00003741bc90] handle_mm_fault at ffff00000829d7e8
>
> #12 [ffff00003741bcc0] do_page_fault at ffff000008e39d8c
>
> #13 [ffff00003741bdc0] do_translation_fault at ffff000008e3a124
>
> #14 [ffff00003741bdf0] do_mem_abort at ffff0000080812c4
>
> #15 [ffff00003741bff0] el0_da at ffff000008083a8c
>
> PC: 0000ffff9291def8 LR: 0000ffff9291e0ac SP: 0000fffcb22d94c0
>
> X29: 0000fffcb22d94c0 X28: 0000fffca4125410 X27: 0000000000000009
>
> X26: 0000fffca405db20 X25: 0000fffca40e87a8 X24: 0000fffca414e5e0
>
> X23: 0000fffca413e930 X22: 0000fffca413e918 X21: 000000000000001b
>
> X20: 0000fffca413e808 X19: 0000000000000000 X18: 93fa57e40000ffff
>
> X17: 0000ffffa621ec68 X16: 0000ffff93f152c0 X15: 0000fffcb22d94a8
>
> X14: 0000fffcb22e22d0 X13: 000000000000000f X12: 0000ffffa62f4000
>
> X11: 0000000000000002 X10: 0000000000000000 X9: 0000ffff93c8c410
>
> X8: 0000ffffa62f65b8 X7: 00000000000004d0 X6: 0000ffff934c5818
>
> X5: 0000fffca413e810 X4: 0000ffff934a6170 X3: 0000fffca413e918
>
> X2: 000000000000005e X1: 0000000000000000 X0: 0000fffca413e808
>
> ORIG_X0: 0000fffca413f000 SYSCALLNO: ffffffff PSTATE: 80000000
>
>
>
>
>
> The callstack of 33 CPUs:
>
>
>
> PID: 37099 TASK: ffff80adb2299f00 CPU: 63 COMMAND: "QThread"
>
> #0 [ffff00000a223d90] crash_save_cpu at ffff00000819904c
>
> #1 [ffff00000a223f50] handle_IPI at ffff000008096ce4
>
> #2 [ffff00000a223fc0] gic_handle_irq at ffff00000808179c
>
> --- <IRQ stack> ---
>
> #3 [ffff0000324bba30] el1_irq at ffff0000080833ec
>
> PC: ffff0000080bae74 [kvm_arch_vcpu_ioctl_run+580]
>
> LR: ffff0000080bae70 [kvm_arch_vcpu_ioctl_run+576]
>
> SP: ffff0000324bba40 PSTATE: 60000005
>
> X29: ffff0000324bba40 X28: ffff80af38f90000 X27: 0000000000000000
>
> X26: ffffff800833c538 X25: ffffff800833c53c X24: ffff0000095ad000
>
> X23: ffff000009589000 X22: ffff80ad8581a000 X21: ffff80a9e5a9a000
>
> X20: ffff000009e61ed8 X19: ffff80a9e5a98000 X18: 0000000000000000
>
> X17: 0000000000000000 X16: 0000000000000000 X15: 0000000000000000
>
> X14: 00000000ecfc015d X13: ffffffc26c5e5488 X12: 0000000000000000
>
> X11: 0000000000000006 X10: 0000000000000000 X9: ffffffc26c5e56d0
>
> X8: ffffffc272cd6a50 X7: 0000000001000000 X6: 00000000002a1df3
>
> X5: 0000000000000000 X4: 0000000000000000 X3: 0000000000001d70
>
> X2: ffff80af38d39438 X1: 000000000000001b X0: 0000000000000004
>
> #4 [ffff0000324bba40] kvm_arch_vcpu_ioctl_run at ffff0000080bae70
>
> #5 [ffff0000324bbb00] kvm_vcpu_ioctl at ffff0000080aec04
>
> #6 [ffff0000324bbd70] do_vfs_ioctl at ffff000008320298
>
> #7 [ffff0000324bbe00] ksys_ioctl at ffff000008320b10
>
> #8 [ffff0000324bbe40] __arm64_sys_ioctl at ffff000008320b4c
>
> #9 [ffff0000324bbe60] el0_svc_common at ffff0000080979dc
>
> #10 [ffff0000324bbea0] el0_svc_handler at ffff000008097a9c
>
> #11 [ffff0000324bbff0] el0_svc at ffff000008084044
>
> PC: 0000ffffaeabed0c LR: 00000000005ebeac SP: 0000ffff6abea200
>
> X29: 0000ffff6abea200 X28: 0000000000df43d8 X27: 000000000000000f
>
> X26: 0000ffff6c8c6840 X25: 00000000007ff3c0 X24: 0000ffff9484c4e8
>
> X23: 0000ffff9484c457 X22: 0000ffff6c8be8e6 X21: 0000000000000000
>
> X20: 0000ffff6c8be840 X19: 0000ffffb0060000 X18: 000000000157ab30
>
> X17: 0000ffffaeabed00 X16: 00000000010b3a40 X15: 00000000000000f9
>
> X14: 0000ffff6abeb2d0 X13: 0000000000000000 X12: 0000ffff6c0c984c
>
> X11: 0000ffff6c0c9858 X10: 0101010101010101 X9: 0000000000000004
>
> X8: 000000000000001d X7: 0000000000000001 X6: 0000000000000001
>
> X5: 0000000000000000 X4: 000000000114b428 X3: 0000000000000000
>
> X2: 0000000000000000 X1: 000000000000ae80 X0: 0000000000000022
>
> ORIG_X0: 0000000000000022 SYSCALLNO: 1d PSTATE: 60000000
>
>
>
>
>
> The callstack of 1 CPU:
>
>
>
> PID: 36979 TASK: ffff80a6a6ecae80 CPU: 48 COMMAND: "QThread"
>
> #0 [ffff00000a1abd90] crash_save_cpu at ffff00000819904c
>
> #1 [ffff00000a1abf50] handle_IPI at ffff000008096ce4
>
> #2 [ffff00000a1abfc0] gic_handle_irq at ffff00000808179c
>
> --- <IRQ stack> ---
>
> #3 [ffff000033d43930] el1_irq at ffff0000080833ec
>
> PC: ffff0000080a51a0 [invalidate_icache_range+40]
>
> LR: ffff0000080be568 [invalidate_icache_guest_page+144]
>
> SP: ffff000033d43940 PSTATE: 80000005
>
> X29: ffff000033d43940 X28: 0000000000000020 X27: 0000000000000000
>
> X26: ffff80aabaca0000 X25: 000000a422e007fd X24: 0000000000000000
>
> X23: 0000ffff1b6ac000 X22: ffff80aa043d8000 X21: 0000000000000503
>
> X20: 000000000a422e00 X19: 0000000000200000 X18: 0000000000000001
>
> X17: 00000077579c5dac X16: 0000007757a311d0 X15: 0000000000000000
>
> X14: ffff7e02908b8034 X13: 0000000000000001 X12: 0000ffff5b6ac000
>
> X11: 0000000000000000 X10: 0000000000000041 X9: ffff80a5b7c976e0
>
> X8: 0000000000002218 X7: 0000000ffc000000 X6: 0000000000000018
>
> X5: ffff000009ffdde8 X4: ffff000009ffdc98 X3: ffff80a422f92ec0
>
> X2: 0000000000000040 X1: ffff80a423000000 X0: ffff80a422e00000
>
> #4 [ffff000033d43940] invalidate_icache_range at ffff0000080a519c
>
> #5 [ffff000033d43960] kvm_handle_guest_abort at ffff0000080c0ae4
>
> #6 [ffff000033d43a10] handle_exit at ffff0000080c4fe0
>
> #7 [ffff000033d43a40] kvm_arch_vcpu_ioctl_run at ffff0000080baec8
>
> #8 [ffff000033d43b00] kvm_vcpu_ioctl at ffff0000080aec04
>
> #9 [ffff000033d43d70] do_vfs_ioctl at ffff000008320298
>
> #10 [ffff000033d43e00] ksys_ioctl at ffff000008320b10
>
> #11 [ffff000033d43e40] __arm64_sys_ioctl at ffff000008320b4c
>
> #12 [ffff000033d43e60] el0_svc_common at ffff0000080979dc
>
> #13 [ffff000033d43ea0] el0_svc_handler at ffff000008097a9c
>
> #14 [ffff000033d43ff0] el0_svc at ffff000008084044
>
> PC: 0000ffff94bd8d0c LR: 00000000005ebeac SP: 0000ffff4b7fe200
>
> X29: 0000ffff4b7fe200 X28: 0000000000df43d8 X27: 000000000000000f
>
> X26: 0000ffff589459b0 X25: 00000000007ff3c0 X24: 0000ffff6e7fd4e8
>
> X23: 0000ffff6e7fd457 X22: 0000ffff5893da56 X21: 0000000000000000
>
> X20: 0000ffff5893d9b0 X19: 0000ffff7c0a3000 X18: 000000000157ab30
>
> X17: 0000ffff94bd8d00 X16: 00000000010b3a40 X15: 000000000157ab28
>
> X14: 0000000000000000 X13: 3038343d79746973 X12: 6e65645f64636c2e
>
> X11: 0000000000000003 X10: 0000000000000004 X9: 0000000000000004
>
> X8: 000000000000001d X7: 0000000000000000 X6: 0000000000000001
>
> X5: 0000000000000000 X4: 000000000114b428 X3: 0000000000000000
>
> X2: 0000000000000000 X1: 000000000000ae80 X0: 0000000000000024
>
> ORIG_X0: 0000000000000024 SYSCALLNO: 1d PSTATE: 60000000
>
>
>
> The callstack of 1 CPU:
>
>
>
> PID: 37276 TASK: ffff80ac9f3edd00 CPU: 53 COMMAND: "QThread"
>
> #0 [ffff00000a1d3d90] crash_save_cpu at ffff00000819904c
>
> #1 [ffff00000a1d3f50] handle_IPI at ffff000008096ce4
>
> #2 [ffff00000a1d3fc0] gic_handle_irq at ffff00000808179c
>
> --- <IRQ stack> ---
>
> #3 [ffff000033f23830] el1_irq at ffff0000080833ec
>
> PC: ffff00000817345c [hrtimer_active+92]
>
> LR: ffff000008174614 [hrtimer_try_to_cancel+52]
>
> SP: ffff000033f23840 PSTATE: 60000005
>
> X29: ffff000033f23840 X28: ffff80ad86210000 X27: ffff80ac9f3edd00
>
> X26: ffff000008e3309c X25: ffff80ac9f3ee440 X24: 0000000000000001
>
> X23: ffff000009560018 X22: 000080aff28e2000 X21: ffff80a3b74c1e20
>
> X20: ffff80a3b74c1ea8 X19: ffff80a3b74c1ea8 X18: 0000000000000001
>
> X17: 0000007ca33c45d0 X16: 0000007ca3430be0 X15: 0000000000000000
>
> X14: ffff7e02901f8034 X13: 0000000000000001 X12: 0000ffff60a02000
>
> X11: 0000000000000003 X10: 0000000000000040 X9: ffff0000095c4008
>
> X8: ffff801fbb400920 X7: ffff801fbb400960 X6: 0000000000000018
>
> X5: ffff801fbb400920 X4: 0000000000000000 X3: 0000000000000001
>
> X2: 000000002035d71c X1: ffff80affbe16e40 X0: 0000000000000000
>
> #4 [ffff000033f23840] hrtimer_active at ffff000008173458
>
> #5 [ffff000033f23860] hrtimer_try_to_cancel at ffff000008174610
>
> #6 [ffff000033f238a0] hrtimer_cancel at ffff000008174780
>
> #7 [ffff000033f238c0] phys_timer_emulate at ffff0000080dc4f0
>
> #8 [ffff000033f238e0] kvm_timer_vcpu_load at ffff0000080dc8b8
>
> #9 [ffff000033f23910] kvm_arch_vcpu_load at ffff0000080ba890
>
> #10 [ffff000033f23940] kvm_sched_in at ffff0000080ad758
>
> #11 [ffff000033f23960] finish_task_switch at ffff000008116b18
>
> #12 [ffff000033f239a0] __schedule at ffff000008e32638
>
> #13 [ffff000033f23a20] _cond_resched at ffff000008e33098
>
> #14 [ffff000033f23a40] kvm_arch_vcpu_ioctl_run at ffff0000080bace4
>
> #15 [ffff000033f23b00] kvm_vcpu_ioctl at ffff0000080aec04
>
> #16 [ffff000033f23d70] do_vfs_ioctl at ffff000008320298
>
> #17 [ffff000033f23e00] ksys_ioctl at ffff000008320b10
>
> #18 [ffff000033f23e40] __arm64_sys_ioctl at ffff000008320b4c
>
> #19 [ffff000033f23e60] el0_svc_common at ffff0000080979dc
>
> #20 [ffff000033f23ea0] el0_svc_handler at ffff000008097a9c
>
> #21 [ffff000033f23ff0] el0_svc at ffff000008084044
>
> PC: 0000ffffa15c9d0c LR: 00000000005ebeac SP: 0000ffff5d6d5200
>
> X29: 0000ffff5d6d5200 X28: 0000000000df43d8 X27: 000000000000000f
>
> X26: 0000ffff648c6840 X25: 00000000007ff3c0 X24: 0000ffff6f5fd4e8
>
> X23: 0000ffff6f5fd457 X22: 0000ffff648be8e6 X21: 0000000000000000
>
> X20: 0000ffff648be840 X19: 0000ffffa2b6b000 X18: 0000000000000001
>
> X17: 0000ffffa15c9d00 X16: 00000000010b3a40 X15: 0000000000000000
>
> X14: 0000ffff5d6d62d0 X13: 000000000000003f X12: 0000ffffa1651000
>
> X11: 0000000000000005 X10: 00000000000000db X9: 0000000000000004
>
> X8: 000000000000001d X7: 0000000000000001 X6: 0000000000000001
>
> X5: 0000000000000000 X4: 000000000114b428 X3: 0000000000000000
>
> X2: 0000000000000000 X1: 000000000000ae80 X0: 0000000000000022
>
> ORIG_X0: 0000000000000022 SYSCALLNO: 1d PSTATE: 60000000
>
>
>
> I would appreciate it if you could give me some suggestions.
>
>
>
> Thanks,
>
> Wandun Chen
>
>
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [Question] Softlockup when sending IPI to other CPUs
2019-01-19 23:58 ` [Question] Softlockup when sending IPI to other CPUs Will Deacon
@ 2019-01-21 14:21 ` Catalin Marinas
0 siblings, 0 replies; 17+ messages in thread
From: Catalin Marinas @ 2019-01-21 14:21 UTC (permalink / raw)
To: Will Deacon
Cc: chenwandun, linux-kernel, linux-arm-kernel, Wangkefeng (Kevin),
anshuman.khandual
On Sat, Jan 19, 2019 at 11:58:27PM +0000, Will Deacon wrote:
> On Thu, Jan 17, 2019 at 07:42:44AM +0000, chenwandun wrote:
> > Recently, I ran some tests on linux-4.19 and hit a softlockup issue.
> >
> > I found that some CPUs take the spinlock in the __split_huge_pmd
> > function and then send an IPI to the other CPUs and wait for their
> > response, while several other CPUs that entered __split_huge_pmd and
> > want the same spinlock are stuck in queued_spin_lock_slowpath.
> >
> > Because the IPI gets no response for a long time, this results in a
> > softlockup.
> >
> > The IPI is sent as of commit
> > 3b8c9f1cdfc506e94e992ae66b68bbe416f89610, which is meant to send an
> > IPI to each CPU after invalidating the I-cache for kernel mappings. In
> > this case, after modifying the pmd, it sends an IPI to the other CPUs
> > to sync the memory mappings.
> >
> > There is no stable test case that reproduces the result; the test
> > procedure is hard to repeat.
> >
> > The environment is arm64 with 64 CPUs. Excluding idle CPUs, there are
> > 6 kinds of call stacks in total.
>
> This looks like another lockup that would be solved if we deferred our
> I-cache invalidation when mapping user-executable pages, and instead
> performed the invalidation off the back of a UXN permission fault, where we
> could avoid holding any locks.
Looking back at commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after
invalidating the I-cache for kernel mappings"), the text implies that it
should only do this for kernel mappings. I don't think we need this for
user mappings. We have a few scenarios where we invoke set_pte_at() with
exec permission:
1. Page faulted in - the pte was not previously accessible and the CPU
should not have stale decoded instructions (my interpretation of the
ARM ARM).
2. huge pmd splitting - there is no change to the underlying data. I
have a suspicion here that we shouldn't even call
sync_icache_aliases() but probably the PG_arch_1 isn't carried over
from the head compound page to the small pages (needs checking).
3. page migration - there is no change to the underlying data and
instructions, so I don't think we need the all CPUs sync.
4. mprotect() setting exec permission - it is normally the user's
responsibility to perform the cache maintenance
(I can add more text to the patch below but need to get to the bottom of
this first)
---------8<-------------------------------------------------
arm64: Do not issue IPIs for user executable ptes
From: Catalin Marinas <catalin.marinas@arm.com>
Commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache
for kernel mappings") was aimed at fixing the I-cache invalidation for
kernel mappings. However, it inadvertently caused all cache maintenance
for user mappings via set_pte_at() -> __sync_icache_dcache() to call
kick_all_cpus_sync().
Fixes: 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
Cc: <stable@vger.kernel.org> # 4.19.x-
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm64/include/asm/cacheflush.h | 2 +-
arch/arm64/kernel/probes/uprobes.c | 2 +-
arch/arm64/mm/flush.c | 14 ++++++++++----
3 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 19844211a4e6..18e92d9dacd4 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -80,7 +80,7 @@ extern void __clean_dcache_area_poc(void *addr, size_t len);
extern void __clean_dcache_area_pop(void *addr, size_t len);
extern void __clean_dcache_area_pou(void *addr, size_t len);
extern long __flush_cache_user_range(unsigned long start, unsigned long end);
-extern void sync_icache_aliases(void *kaddr, unsigned long len);
+extern void sync_icache_aliases(void *kaddr, unsigned long len, bool sync);
static inline void flush_icache_range(unsigned long start, unsigned long end)
{
diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c
index 636ca0119c0e..595e8c8f41cd 100644
--- a/arch/arm64/kernel/probes/uprobes.c
+++ b/arch/arm64/kernel/probes/uprobes.c
@@ -24,7 +24,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
memcpy(dst, src, len);
/* flush caches (dcache/icache) */
- sync_icache_aliases(dst, len);
+ sync_icache_aliases(dst, len, true);
kunmap_atomic(xol_page_kaddr);
}
diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index 30695a868107..5c2f23a92d14 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -25,15 +25,17 @@
#include <asm/cache.h>
#include <asm/tlbflush.h>
-void sync_icache_aliases(void *kaddr, unsigned long len)
+void sync_icache_aliases(void *kaddr, unsigned long len, bool sync)
{
unsigned long addr = (unsigned long)kaddr;
if (icache_is_aliasing()) {
__clean_dcache_area_pou(kaddr, len);
__flush_icache_all();
- } else {
+ } else if (sync) {
flush_icache_range(addr, addr + len);
+ } else {
+ __flush_icache_range(addr, addr + len);
}
}
@@ -42,7 +44,7 @@ static void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
unsigned long len)
{
if (vma->vm_flags & VM_EXEC)
- sync_icache_aliases(kaddr, len);
+ sync_icache_aliases(kaddr, len, true);
}
/*
@@ -63,8 +65,12 @@ void __sync_icache_dcache(pte_t pte)
struct page *page = pte_page(pte);
if (!test_and_set_bit(PG_dcache_clean, &page->flags))
+ /*
+ * Don't issue kick_all_cpus_sync() after I-cache invalidation
+ * when setting a user executable pte.
+ */
sync_icache_aliases(page_address(page),
- PAGE_SIZE << compound_order(page));
+ PAGE_SIZE << compound_order(page), false);
}
EXPORT_SYMBOL_GPL(__sync_icache_dcache);
* Re: [Question] Softlockup when sending IPI to other CPUs
2019-01-21 14:21 ` Catalin Marinas
@ 2019-01-22 5:44 ` Will Deacon
0 siblings, 0 replies; 17+ messages in thread
From: Will Deacon @ 2019-01-22 5:44 UTC (permalink / raw)
To: Catalin Marinas
Cc: chenwandun, linux-kernel, linux-arm-kernel, Wangkefeng (Kevin),
anshuman.khandual
Hi Catalin,
On Mon, Jan 21, 2019 at 02:21:28PM +0000, Catalin Marinas wrote:
> On Sat, Jan 19, 2019 at 11:58:27PM +0000, Will Deacon wrote:
> > On Thu, Jan 17, 2019 at 07:42:44AM +0000, chenwandun wrote:
> > > Recently, I ran some tests on linux-4.19 and hit a softlockup issue.
> > >
> > > I found that some CPUs take the spinlock in the __split_huge_pmd
> > > function and then send an IPI to the other CPUs and wait for their
> > > response, while several other CPUs that entered __split_huge_pmd and
> > > want the same spinlock are stuck in queued_spin_lock_slowpath.
> > >
> > > Because the IPI gets no response for a long time, this results in a
> > > softlockup.
> > >
> > > The IPI is sent as of commit
> > > 3b8c9f1cdfc506e94e992ae66b68bbe416f89610, which is meant to send an
> > > IPI to each CPU after invalidating the I-cache for kernel mappings. In
> > > this case, after modifying the pmd, it sends an IPI to the other CPUs
> > > to sync the memory mappings.
> > >
> > > There is no stable test case that reproduces the result; the test
> > > procedure is hard to repeat.
> > >
> > > The environment is arm64 with 64 CPUs. Excluding idle CPUs, there are
> > > 6 kinds of call stacks in total.
> >
> > This looks like another lockup that would be solved if we deferred our
> > I-cache invalidation when mapping user-executable pages, and instead
> > performed the invalidation off the back of a UXN permission fault, where we
> > could avoid holding any locks.
>
> Looking back at commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after
> invalidating the I-cache for kernel mappings"), the text implies that it
> should only do this for kernel mappings. I don't think we need this for
> user mappings. We have a few scenarios where we invoke set_pte_at() with
> exec permission:
Yes, I think you're right. I got confused because in this case we are
invalidating lines written by the kernel, but actually it's not about who
writes the data, but about whether or not the page table is being changed.
> 1. Page faulted in - the pte was not previously accessible and the CPU
> should not have stale decoded instructions (my interpretation of the
> ARM ARM).
>
> 2. huge pmd splitting - there is no change to the underlying data. I
> have a suspicion here that we shouldn't even call
> sync_icache_aliases() but probably the PG_arch_1 isn't carried over
> from the head compound page to the small pages (needs checking).
>
> 3. page migration - there is no change to the underlying data and
> instructions, so I don't think we need the all CPUs sync.
>
> 4. mprotect() setting exec permission - that's normally the
> responsibility of the user to ensure cache maintenance
>
> (I can add more text to the patch below but need to get to the bottom of
> this first)
>
> ---------8<-------------------------------------------------
> arm64: Do not issue IPIs for user executable ptes
>
> From: Catalin Marinas <catalin.marinas@arm.com>
>
> Commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache
> for kernel mappings") was aimed at fixing the I-cache invalidation for
> kernel mappings. However, it inadvertently caused all cache maintenance
> for user mappings via set_pte_at() -> __sync_icache_dcache() to call
> kick_all_cpus_sync().
>
> Fixes: 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
> Cc: <stable@vger.kernel.org> # 4.19.x-
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
> arch/arm64/include/asm/cacheflush.h | 2 +-
> arch/arm64/kernel/probes/uprobes.c | 2 +-
> arch/arm64/mm/flush.c | 14 ++++++++++----
> 3 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
> index 19844211a4e6..18e92d9dacd4 100644
> --- a/arch/arm64/include/asm/cacheflush.h
> +++ b/arch/arm64/include/asm/cacheflush.h
> @@ -80,7 +80,7 @@ extern void __clean_dcache_area_poc(void *addr, size_t len);
> extern void __clean_dcache_area_pop(void *addr, size_t len);
> extern void __clean_dcache_area_pou(void *addr, size_t len);
> extern long __flush_cache_user_range(unsigned long start, unsigned long end);
> -extern void sync_icache_aliases(void *kaddr, unsigned long len);
> +extern void sync_icache_aliases(void *kaddr, unsigned long len, bool sync);
I'd much prefer just adding something like sync_user_icache_aliases(), which
would avoid the IPI, since bool arguments tend to make the callsites
unreadable imo.
With that:
Acked-by: Will Deacon <will.deacon@arm.com>
Will
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qestion] Softlockup when send IPI to other CPUs
2019-01-22 5:44 ` Will Deacon
@ 2019-01-22 14:55 ` Mark Rutland
0 siblings, 0 replies; 17+ messages in thread
From: Mark Rutland @ 2019-01-22 14:55 UTC (permalink / raw)
To: Will Deacon
Cc: Catalin Marinas, chenwandun, linux-kernel, linux-arm-kernel,
Wangkefeng (Kevin),
anshuman.khandual
Hi Will,
On Tue, Jan 22, 2019 at 05:44:02AM +0000, Will Deacon wrote:
> On Mon, Jan 21, 2019 at 02:21:28PM +0000, Catalin Marinas wrote:
> > On Sat, Jan 19, 2019 at 11:58:27PM +0000, Will Deacon wrote:
> > > On Thu, Jan 17, 2019 at 07:42:44AM +0000, chenwandun wrote:
> > > > Recently, I ran some tests on linux-4.19 and hit a softlockup issue.
> > > >
> > > > I found that some CPUs take the spinlock in __split_huge_pmd() and then
> > > > send an IPI to the other CPUs and wait for their response, while several
> > > > other CPUs have entered __split_huge_pmd() wanting the same lock and are
> > > > stuck spinning in queued_spin_lock_slowpath().
> > > >
> > > > Because the IPI goes unanswered for so long, the result is a softlockup.
> > > >
> > > > The IPI was introduced by commit
> > > > 3b8c9f1cdfc506e94e992ae66b68bbe416f89610, which is meant to send an IPI
> > > > to each CPU after invalidating the I-cache for kernel mappings. In this
> > > > case, after modifying the pmd, it sends an IPI to the other CPUs to
> > > > synchronize the memory mappings.
> > > >
> > > > There is no stable test case that reproduces the problem, so it is hard
> > > > to repeat the test procedure.
> > > >
> > > > The environment is arm64 with 64 CPUs. Excluding idle CPUs, there are 6
> > > > kinds of call stacks in total.
> > >
> > > This looks like another lockup that would be solved if we deferred our
> > > I-cache invalidation when mapping user-executable pages, and instead
> > > performed the invalidation off the back of a UXN permission fault, where we
> > > could avoid holding any locks.
> >
> > Looking back at commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after
> > invalidating the I-cache for kernel mappings"), the text implies that it
> > should only do this for kernel mappings. I don't think we need this for
> > user mappings. We have a few scenarios where we invoke set_pte_at() with
> > exec permission:
>
> Yes, I think you're right. I got confused because in this case we are
> invalidating lines written by the kernel, but actually it's not about who
> writes the data, but about whether or not the page table is being changed.
IIUC we may have a userspace problem analogous to the kernel modules
problem, if userspace uses dlopen/dlclose to dynamically load/unload
shared objects.
If userspace unloads an object, then loads another, the new object might
get placed at the same VA. A PE could have started speculating
instructions from the old object, and IIUC the TLB invalidation and
I-cache maintenance don't cause those instructions to be re-fetched from
the I-cache unless there's a context synchronization event.
Do we require the use of membarrier when loading or unloading objects?
If so, when does that happen relative to the unmap or map?
Thanks,
Mark.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qestion] Softlockup when send IPI to other CPUs
2019-01-22 14:55 ` Mark Rutland
@ 2019-01-23 10:21 ` Will Deacon
0 siblings, 0 replies; 17+ messages in thread
From: Will Deacon @ 2019-01-23 10:21 UTC (permalink / raw)
To: Mark Rutland
Cc: Catalin Marinas, chenwandun, linux-kernel, linux-arm-kernel,
Wangkefeng (Kevin),
anshuman.khandual
On Tue, Jan 22, 2019 at 02:55:22PM +0000, Mark Rutland wrote:
> On Tue, Jan 22, 2019 at 05:44:02AM +0000, Will Deacon wrote:
> > On Mon, Jan 21, 2019 at 02:21:28PM +0000, Catalin Marinas wrote:
> > > On Sat, Jan 19, 2019 at 11:58:27PM +0000, Will Deacon wrote:
> > > > On Thu, Jan 17, 2019 at 07:42:44AM +0000, chenwandun wrote:
> > > > > Recently, I ran some tests on linux-4.19 and hit a softlockup issue.
> > > > >
> > > > > I found that some CPUs take the spinlock in __split_huge_pmd() and then
> > > > > send an IPI to the other CPUs and wait for their response, while several
> > > > > other CPUs have entered __split_huge_pmd() wanting the same lock and are
> > > > > stuck spinning in queued_spin_lock_slowpath().
> > > > >
> > > > > Because the IPI goes unanswered for so long, the result is a softlockup.
> > > > >
> > > > > The IPI was introduced by commit
> > > > > 3b8c9f1cdfc506e94e992ae66b68bbe416f89610, which is meant to send an IPI
> > > > > to each CPU after invalidating the I-cache for kernel mappings. In this
> > > > > case, after modifying the pmd, it sends an IPI to the other CPUs to
> > > > > synchronize the memory mappings.
> > > > >
> > > > > There is no stable test case that reproduces the problem, so it is hard
> > > > > to repeat the test procedure.
> > > > >
> > > > > The environment is arm64 with 64 CPUs. Excluding idle CPUs, there are 6
> > > > > kinds of call stacks in total.
> > > >
> > > > This looks like another lockup that would be solved if we deferred our
> > > > I-cache invalidation when mapping user-executable pages, and instead
> > > > performed the invalidation off the back of a UXN permission fault, where we
> > > > could avoid holding any locks.
> > >
> > > Looking back at commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after
> > > invalidating the I-cache for kernel mappings"), the text implies that it
> > > should only do this for kernel mappings. I don't think we need this for
> > > user mappings. We have a few scenarios where we invoke set_pte_at() with
> > > exec permission:
> >
> > Yes, I think you're right. I got confused because in this case we are
> > invalidating lines written by the kernel, but actually it's not about who
> > writes the data, but about whether or not the page table is being changed.
>
> IIUC we may have a userspace problem analagous to the kernel modules
> problem, if userspace uses dlopen/dlclose to dynamically load/unload
> shared objects.
>
> If userspace unloads an object, then loads another, the new object might
> get placed at the same VA. A PE could have started speculating
> instructions from the old object, and IIUC the TLB invalidation and
> I-cache maintenance don't cause those instructions be re-fetched from
> the I-cache unless there's a context synchronization event.
>
> Do we require the use of membarrier when loading or unloading objects?
> If so, when does that happen relative to the unmap or map?
membarrier seems a bit OTT for this. Assumedly userspace is already having
to synchronise threads in this case so that (a) nobody is executing the old
object when it is unloaded and (b) nobody tries to execute the new
object until it has been successfully loaded. Squeezing in an ISB shouldn't
be too tricky, although I don't know whether it's actually done (especially
since the chance of this going wrong is so tiny).
Will
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qestion] Softlockup when send IPI to other CPUs
2019-01-22 5:44 ` Will Deacon
@ 2019-01-23 18:15 ` Catalin Marinas
-1 siblings, 0 replies; 17+ messages in thread
From: Catalin Marinas @ 2019-01-23 18:15 UTC (permalink / raw)
To: Will Deacon
Cc: chenwandun, linux-kernel, linux-arm-kernel, Wangkefeng (Kevin),
anshuman.khandual
On Tue, Jan 22, 2019 at 05:44:02AM +0000, Will Deacon wrote:
> On Mon, Jan 21, 2019 at 02:21:28PM +0000, Catalin Marinas wrote:
> > arm64: Do not issue IPIs for user executable ptes
> >
> > From: Catalin Marinas <catalin.marinas@arm.com>
> >
> > Commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache
> > for kernel mappings") was aimed at fixing the I-cache invalidation for
> > kernel mappings. However, it inadvertently caused all cache maintenance
> > for user mappings via set_pte_at() -> __sync_icache_dcache() to call
> > kick_all_cpus_sync().
> >
> > Fixes: 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
> > Cc: <stable@vger.kernel.org> # 4.19.x-
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > ---
> > arch/arm64/include/asm/cacheflush.h | 2 +-
> > arch/arm64/kernel/probes/uprobes.c | 2 +-
> > arch/arm64/mm/flush.c | 14 ++++++++++----
> > 3 files changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
> > index 19844211a4e6..18e92d9dacd4 100644
> > --- a/arch/arm64/include/asm/cacheflush.h
> > +++ b/arch/arm64/include/asm/cacheflush.h
> > @@ -80,7 +80,7 @@ extern void __clean_dcache_area_poc(void *addr, size_t len);
> > extern void __clean_dcache_area_pop(void *addr, size_t len);
> > extern void __clean_dcache_area_pou(void *addr, size_t len);
> > extern long __flush_cache_user_range(unsigned long start, unsigned long end);
> > -extern void sync_icache_aliases(void *kaddr, unsigned long len);
> > +extern void sync_icache_aliases(void *kaddr, unsigned long len, bool sync);
>
> I'd much prefer just adding something like sync_user_icache_aliases(), which
> would avoid the IPI, since bool arguments tend to make the callsites
> unreadable imo.
I wonder whether we need the IPI for uprobes and ptrace. I would say we
can avoid it for ptrace since normally the debugged thread should be
stopped. I think it's different for uprobes but changing the text of a
live thread doesn't come with many guarantees anyway. So I'm tempted to
simply change sync_icache_aliases() to not issue an IPI at all, in which
case we wouldn't even need the bool argument; it's only used for user
addresses. So the new diff would be (the text is the same):
diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index 30695a868107..5c9073bace83 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -33,7 +33,11 @@ void sync_icache_aliases(void *kaddr, unsigned long len)
 		__clean_dcache_area_pou(kaddr, len);
 		__flush_icache_all();
 	} else {
-		flush_icache_range(addr, addr + len);
+		/*
+		 * Don't issue kick_all_cpus_sync() after I-cache invalidation
+		 * for user mappings.
+		 */
+		__flush_icache_range(addr, addr + len);
 	}
 }
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [Qestion] Softlockup when send IPI to other CPUs
2019-01-23 18:15 ` Catalin Marinas
@ 2019-01-24 7:00 ` Shijith Thotton
0 siblings, 0 replies; 17+ messages in thread
From: Shijith Thotton @ 2019-01-24 7:00 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: chenwandun, linux-kernel, linux-arm-kernel, Wangkefeng (Kevin),
anshuman.khandual, Ganapatrao Kulkarni,
Jayachandran Chandrasekharan Nair
Hi Catalin,
On 01/23/2019 11:45 PM, Catalin Marinas wrote:
> On Tue, Jan 22, 2019 at 05:44:02AM +0000, Will Deacon wrote:
>> On Mon, Jan 21, 2019 at 02:21:28PM +0000, Catalin Marinas wrote:
>>> arm64: Do not issue IPIs for user executable ptes
>>>
>>> From: Catalin Marinas <catalin.marinas@arm.com>
>>>
>>> Commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache
>>> for kernel mappings") was aimed at fixing the I-cache invalidation for
>>> kernel mappings. However, it inadvertently caused all cache maintenance
>>> for user mappings via set_pte_at() -> __sync_icache_dcache() to call
>>> kick_all_cpus_sync().
>>>
>>> Fixes: 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
>>> Cc: <stable@vger.kernel.org> # 4.19.x-
>>> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>>> ---
>>> arch/arm64/include/asm/cacheflush.h | 2 +-
>>> arch/arm64/kernel/probes/uprobes.c | 2 +-
>>> arch/arm64/mm/flush.c | 14 ++++++++++----
>>> 3 files changed, 12 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
>>> index 19844211a4e6..18e92d9dacd4 100644
>>> --- a/arch/arm64/include/asm/cacheflush.h
>>> +++ b/arch/arm64/include/asm/cacheflush.h
>>> @@ -80,7 +80,7 @@ extern void __clean_dcache_area_poc(void *addr, size_t len);
>>> extern void __clean_dcache_area_pop(void *addr, size_t len);
>>> extern void __clean_dcache_area_pou(void *addr, size_t len);
>>> extern long __flush_cache_user_range(unsigned long start, unsigned long end);
>>> -extern void sync_icache_aliases(void *kaddr, unsigned long len);
>>> +extern void sync_icache_aliases(void *kaddr, unsigned long len, bool sync);
>>
>> I'd much prefer just adding something like sync_user_icache_aliases(), which
>> would avoid the IPI, since bool arguments tend to make the callsites
>> unreadable imo.
>
> I wonder whether we need the IPI for uprobes and ptrace. I would say we
> can avoid it for ptrace since normally the debugged thread should be
> stopped. I think it's different for uprobes but changing the text of a
> live thread doesn't come with many guarantees anyway. So I'm tempted to
> simply change sync_icache_aliases() to not issue an IPI at all, in which
> case we wouldn't even need the bool argument; it's only used for user
> addresses. So the new diff would be (the text is the same):
>
> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> index 30695a868107..5c9073bace83 100644
> --- a/arch/arm64/mm/flush.c
> +++ b/arch/arm64/mm/flush.c
> @@ -33,7 +33,11 @@ void sync_icache_aliases(void *kaddr, unsigned long len)
> __clean_dcache_area_pou(kaddr, len);
> __flush_icache_all();
> } else {
> - flush_icache_range(addr, addr + len);
> + /*
> + * Don't issue kick_all_cpus_sync() after I-cache invalidation
> + * for user mappings.
> + */
> + __flush_icache_range(addr, addr + len);
> }
> }
We also faced a similar issue with the LTP migrate_pages03 test in the past, and
this patch fixes it.
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-January/623574.html
Thanks,
Shijith
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qestion] Softlockup when send IPI to other CPUs
2019-01-24 7:00 ` Shijith Thotton
@ 2019-01-24 16:37 ` Catalin Marinas
-1 siblings, 0 replies; 17+ messages in thread
From: Catalin Marinas @ 2019-01-24 16:37 UTC (permalink / raw)
To: Shijith Thotton
Cc: Will Deacon, chenwandun, linux-kernel, linux-arm-kernel,
Wangkefeng (Kevin),
anshuman.khandual, Ganapatrao Kulkarni,
Jayachandran Chandrasekharan Nair
Hi Shijith,
On Thu, Jan 24, 2019 at 07:00:42AM +0000, Shijith Thotton wrote:
> On 01/23/2019 11:45 PM, Catalin Marinas wrote:
> > diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> > index 30695a868107..5c9073bace83 100644
> > --- a/arch/arm64/mm/flush.c
> > +++ b/arch/arm64/mm/flush.c
> > @@ -33,7 +33,11 @@ void sync_icache_aliases(void *kaddr, unsigned long len)
> > __clean_dcache_area_pou(kaddr, len);
> > __flush_icache_all();
> > } else {
> > - flush_icache_range(addr, addr + len);
> > + /*
> > + * Don't issue kick_all_cpus_sync() after I-cache invalidation
> > + * for user mappings.
> > + */
> > + __flush_icache_range(addr, addr + len);
> > }
> > }
>
> We also faced a similar issue with the LTP migrate_pages03 test in the past, and
> this patch fixes it.
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2019-January/623574.html
Thanks for confirming. I presume I can add your tested-by.
--
Catalin
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qestion] Softlockup when send IPI to other CPUs
2019-01-24 16:37 ` Catalin Marinas
@ 2019-01-25 9:57 ` Shijith Thotton
-1 siblings, 0 replies; 17+ messages in thread
From: Shijith Thotton @ 2019-01-25 9:57 UTC (permalink / raw)
To: Catalin Marinas
Cc: Will Deacon, chenwandun, linux-kernel, linux-arm-kernel,
Wangkefeng (Kevin),
anshuman.khandual, Ganapatrao Kulkarni,
Jayachandran Chandrasekharan Nair
On 01/24/2019 10:07 PM, Catalin Marinas wrote:
> Hi Shijith,
>
> On Thu, Jan 24, 2019 at 07:00:42AM +0000, Shijith Thotton wrote:
>> On 01/23/2019 11:45 PM, Catalin Marinas wrote:
>>> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
>>> index 30695a868107..5c9073bace83 100644
>>> --- a/arch/arm64/mm/flush.c
>>> +++ b/arch/arm64/mm/flush.c
>>> @@ -33,7 +33,11 @@ void sync_icache_aliases(void *kaddr, unsigned long len)
>>> __clean_dcache_area_pou(kaddr, len);
>>> __flush_icache_all();
>>> } else {
>>> - flush_icache_range(addr, addr + len);
>>> + /*
>>> + * Don't issue kick_all_cpus_sync() after I-cache invalidation
>>> + * for user mappings.
>>> + */
>>> + __flush_icache_range(addr, addr + len);
>>> }
>>> }
>>
>> We also faced a similar issue with the LTP migrate_pages03 test in the past, and
>> this patch fixes it.
>>
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2019-January/623574.html
>
> Thanks for confirming. I presume I can add your tested-by.
>
Sure.
Tested-by: Shijith Thotton <sthotton@marvell.com>
Shijith
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2019-01-25 9:57 UTC | newest]
Thread overview: 17+ messages
[not found] <95C141B25E7AB14BA042DCCC556C0E6501620A47@dggeml529-mbx.china.huawei.com>
2019-01-19 23:58 ` [Qestion] Softlockup when send IPI to other CPUs Will Deacon
2019-01-21 14:21 ` Catalin Marinas
2019-01-22 5:44 ` Will Deacon
2019-01-22 14:55 ` Mark Rutland
2019-01-23 10:21 ` Will Deacon
2019-01-23 18:15 ` Catalin Marinas
2019-01-24 7:00 ` Shijith Thotton
2019-01-24 16:37 ` Catalin Marinas
2019-01-25 9:57 ` Shijith Thotton