* [PATCH] riscv: Optimize switch_mm by passing "cpu" to flush_icache_deferred()
@ 2021-05-11 17:42 ` Jisheng Zhang
  0 siblings, 0 replies; 4+ messages in thread
From: Jisheng Zhang @ 2021-05-11 17:42 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou; +Cc: linux-riscv, linux-kernel

From: Jisheng Zhang <jszhang@kernel.org>

Pass the cpu directly to flush_icache_deferred() rather than calling
smp_processor_id() again.

Here are some performance numbers:

With a run of hackbench 30 times on a single core riscv64 Qemu instance
with 1GB memory:

without this patch: mean 36.934
with this patch: mean 36.104 (improved by 2.24%)

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 arch/riscv/mm/context.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 68aa312fc352..6d445f2888ec 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -281,10 +281,9 @@ static inline void set_mm(struct mm_struct *mm, unsigned int cpu)
  * actually performs that local instruction cache flush, which implicitly only
  * refers to the current hart.
  */
-static inline void flush_icache_deferred(struct mm_struct *mm)
+static inline void flush_icache_deferred(struct mm_struct *mm, unsigned int cpu)
 {
 #ifdef CONFIG_SMP
-	unsigned int cpu = smp_processor_id();
 	cpumask_t *mask = &mm->context.icache_stale_mask;
 
 	if (cpumask_test_cpu(cpu, mask)) {
@@ -320,5 +319,5 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 
 	set_mm(next, cpu);
 
-	flush_icache_deferred(next);
+	flush_icache_deferred(next, cpu);
 }
-- 
2.31.0





* Re: [PATCH] riscv: Optimize switch_mm by passing "cpu" to flush_icache_deferred()
  2021-05-11 17:42 ` Jisheng Zhang
@ 2021-05-26  5:30   ` Palmer Dabbelt
  -1 siblings, 0 replies; 4+ messages in thread
From: Palmer Dabbelt @ 2021-05-26  5:30 UTC (permalink / raw)
  To: jszhang3; +Cc: Paul Walmsley, aou, linux-riscv, linux-kernel

On Tue, 11 May 2021 10:42:31 PDT (-0700), jszhang3@mail.ustc.edu.cn wrote:
> From: Jisheng Zhang <jszhang@kernel.org>
>
> Directly passing the cpu to flush_icache_deferred() rather than calling
> smp_processor_id() again.
>
> Here are some performance numbers:
>
> With a run of hackbench 30 times on a single core riscv64 Qemu instance
> with 1GB memory:
>
> without this patch: mean 36.934
> with this patch: mean 36.104 (improved by 2.24%)

I don't really put any stock in QEMU performance numbers for this sort
of thing, but since this just skips an expensive call, I don't see a
need for performance numbers at all; we can consider it an obvious
optimization.

> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> ---
>  arch/riscv/mm/context.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
> index 68aa312fc352..6d445f2888ec 100644
> --- a/arch/riscv/mm/context.c
> +++ b/arch/riscv/mm/context.c
> @@ -281,10 +281,9 @@ static inline void set_mm(struct mm_struct *mm, unsigned int cpu)
>   * actually performs that local instruction cache flush, which implicitly only
>   * refers to the current hart.
>   */
> -static inline void flush_icache_deferred(struct mm_struct *mm)
> +static inline void flush_icache_deferred(struct mm_struct *mm, unsigned int cpu)
>  {
>  #ifdef CONFIG_SMP
> -	unsigned int cpu = smp_processor_id();
>  	cpumask_t *mask = &mm->context.icache_stale_mask;
>
>  	if (cpumask_test_cpu(cpu, mask)) {

This proceeds to perform only a local icache flush, which means it will 
break if any callers use a different CPU number.  That large comment at 
the top alludes to this behavior, but I went ahead and added a line to 
make that more explicit.

> @@ -320,5 +319,5 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>
>  	set_mm(next, cpu);
>
> -	flush_icache_deferred(next);
> +	flush_icache_deferred(next, cpu);
>  }

Thanks, this is on for-next.




Thread overview: 4+ messages
2021-05-11 17:42 [PATCH] riscv: Optimize switch_mm by passing "cpu" to flush_icache_deferred() Jisheng Zhang
2021-05-11 17:42 ` Jisheng Zhang
2021-05-26  5:30 ` Palmer Dabbelt
2021-05-26  5:30   ` Palmer Dabbelt
