* [PATCH] arm64: Add read_mostly declaration/definition to irq stack ptr
@ 2022-02-07 8:46 Jisheng Zhang
2022-02-16 17:09 ` Will Deacon
0 siblings, 1 reply; 2+ messages in thread
From: Jisheng Zhang @ 2022-02-07 8:46 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon; +Cc: linux-arm-kernel, linux-kernel
Add the "read-mostly" qualifier to irq_stack_ptr and
irq_shadow_call_stack_ptr to prevent false sharing.
Before the patch, one defconfig build gave the following percpu layout:
ffffffc008723050 <mde_ref_count>:
ffffffc008723050: 00 00 00 00
....
ffffffc008723054 <kde_ref_count>:
ffffffc008723054: 00 00 00 00
....
ffffffc008723058 <irq_stack_ptr>:
...
ffffffc008723060 <nmi_contexts>:
...
ffffffc008723070 <fpsimd_last_state>:
As can be seen, irq_stack_ptr shares a cacheline with heavily
read/written percpu vars such as fpsimd_last_state.
After the patch:
ffffffc008723000 <irq_stack_ptr>:
...
ffffffc008723008 <cpu_number>:
...
ffffffc008723010 <arm64_ssbd_callback_required>:
...
ffffffc008723018 <bp_hardening_data>:
...
Now irq_stack_ptr shares a cacheline with read-mostly percpu vars such
as cpu_number.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
---
arch/arm64/include/asm/stacktrace.h | 2 +-
arch/arm64/kernel/irq.c | 7 +++----
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index e77cdef9ca29..75c142bfdffe 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -66,7 +66,7 @@ struct stackframe {
extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
const char *loglvl);
-DECLARE_PER_CPU(unsigned long *, irq_stack_ptr);
+DECLARE_PER_CPU_READ_MOSTLY(unsigned long *, irq_stack_ptr);
static inline bool on_stack(unsigned long sp, unsigned long size,
unsigned long low, unsigned long high,
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index bda49430c9ea..d2e75e9bb826 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -26,13 +26,12 @@
/* Only access this in an NMI enter/exit */
DEFINE_PER_CPU(struct nmi_ctx, nmi_contexts);
-DEFINE_PER_CPU(unsigned long *, irq_stack_ptr);
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long *, irq_stack_ptr);
-
-DECLARE_PER_CPU(unsigned long *, irq_shadow_call_stack_ptr);
+DECLARE_PER_CPU_READ_MOSTLY(unsigned long *, irq_shadow_call_stack_ptr);
#ifdef CONFIG_SHADOW_CALL_STACK
-DEFINE_PER_CPU(unsigned long *, irq_shadow_call_stack_ptr);
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long *, irq_shadow_call_stack_ptr);
#endif
static void init_irq_scs(void)
--
2.34.1
* Re: [PATCH] arm64: Add read_mostly declaration/definition to irq stack ptr
From: Will Deacon @ 2022-02-16 17:09 UTC (permalink / raw)
To: Jisheng Zhang; +Cc: Catalin Marinas, linux-arm-kernel, linux-kernel
On Mon, Feb 07, 2022 at 04:46:42PM +0800, Jisheng Zhang wrote:
> Add "read-mostly" qualifier to irq_stack_ptr and
> irq_shadow_call_stack_ptr. This is to prevent the false sharing.
>
> Before the patch, I got below percpu layout with one defconfig:
> ffffffc008723050 <mde_ref_count>:
> ffffffc008723050: 00 00 00 00
> ....
>
> ffffffc008723054 <kde_ref_count>:
> ffffffc008723054: 00 00 00 00
> ....
>
> ffffffc008723058 <irq_stack_ptr>:
> ...
>
> ffffffc008723060 <nmi_contexts>:
> ...
>
> ffffffc008723070 <fpsimd_last_state>:
>
> As can be seen, the irq_stack_ptr sits with the heavy read/write percpu
> vars such as fpsimd_last_state etc. at the same cacheline.
>
> After the patch:
>
> ffffffc008723000 <irq_stack_ptr>:
> ...
>
> ffffffc008723008 <cpu_number>:
> ...
>
> ffffffc008723010 <arm64_ssbd_callback_required>:
> ...
>
> ffffffc008723018 <bp_hardening_data>:
> ...
>
> Now, the irq_stack_ptr sits with read mostly percpu vars such as
> cpu_number etc. at the same cacheline.
Were you able to measure any performance difference after this change?
Will