* [PATCH v2 0/2] riscv: enable lockless lockref implementation
@ 2023-12-02 14:03 ` Jisheng Zhang
From: Jisheng Zhang @ 2023-12-02 14:03 UTC
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou; +Cc: linux-riscv, linux-kernel

This series selects ARCH_USE_CMPXCHG_LOCKREF to enable the
cmpxchg-based lockless lockref implementation for riscv. It then
implements arch_cmpxchg64_{relaxed|acquire|release}.

After patch 1:
Using Linus' test case [1] on the TH1520 platform, I see an 11.2%
improvement. On the JH7110 platform, I see a 12.0% improvement.

After patch 2:
On both the TH1520 and JH7110 platforms, I didn't see an obvious
performance improvement with Linus' test case [1]. IMHO, this may
be related to the fence and lr.d/sc.d hardware implementations. In
theory, lr/sc without a fence could perform better than lr/sc plus a
fence, so add the code here to leave room for performance improvement
on newer hardware platforms.
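
For readers unfamiliar with the mechanism, here is a minimal userspace
sketch of the idea behind a cmpxchg-based lockless lockref. It is not
the kernel code: struct demo_lockref, DEMO_LOCK_MASK and the demo_*
helpers are hypothetical names, the layout (lock word in the low 32
bits, count in the high 32 bits) is an assumption made for
illustration, and C11 atomics stand in for the kernel's cmpxchg.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a lockref: one 64-bit word holding both a
 * lock word (low 32 bits, 0 means unlocked) and a reference count
 * (high 32 bits). The layout is an assumption for this sketch.
 */
struct demo_lockref {
	_Atomic uint64_t lock_count;
};

#define DEMO_LOCK_MASK 0xffffffffULL

/* Read the current reference count. */
static inline uint32_t demo_lockref_count(struct demo_lockref *ref)
{
	uint64_t v = atomic_load_explicit(&ref->lock_count,
					  memory_order_relaxed);
	return (uint32_t)(v >> 32);
}

/*
 * Try to increment the count without taking the lock. The 64-bit
 * compare-and-exchange only succeeds if neither the lock word nor the
 * count changed since they were read. Returns false when the lock is
 * held, in which case a caller would fall back to a locked slow path
 * (not shown). Count overflow is ignored in this sketch.
 */
static inline bool demo_lockref_get_fast(struct demo_lockref *ref)
{
	uint64_t old = atomic_load_explicit(&ref->lock_count,
					    memory_order_relaxed);

	while ((old & DEMO_LOCK_MASK) == 0) {
		uint64_t next = old + (1ULL << 32);

		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak_explicit(&ref->lock_count,
							  &old, next,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
	return false;
}
```

Because the lock word and the count are compared and swapped as a
single 8-byte unit, a sketch like this needs a native 64-bit
compare-and-exchange, which fits the series restricting the select to
64BIT.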

Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]

Since v1:
  - only select ARCH_USE_CMPXCHG_LOCKREF if 64BIT

Jisheng Zhang (2):
  riscv: select ARCH_USE_CMPXCHG_LOCKREF
  riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}

 arch/riscv/Kconfig               |  1 +
 arch/riscv/include/asm/cmpxchg.h | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

-- 
2.42.0



* [PATCH v2 1/2] riscv: select ARCH_USE_CMPXCHG_LOCKREF
  2023-12-02 14:03 ` Jisheng Zhang
@ 2023-12-02 14:03   ` Jisheng Zhang
From: Jisheng Zhang @ 2023-12-02 14:03 UTC
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou; +Cc: linux-riscv, linux-kernel

Select ARCH_USE_CMPXCHG_LOCKREF to enable the cmpxchg-based lockless
lockref implementation for riscv.

Using Linus' test case [1] on the TH1520 platform, I see an 11.2%
improvement. On the JH7110 platform, I see a 12.0% improvement.

Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 arch/riscv/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 433ec617703e..da4ae76a892c 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -51,6 +51,7 @@ config RISCV
 	select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU
 	select ARCH_SUPPORTS_PER_VMA_LOCK if MMU
 	select ARCH_SUPPORTS_SHADOW_CALL_STACK if HAVE_SHADOW_CALL_STACK
+	select ARCH_USE_CMPXCHG_LOCKREF if 64BIT
 	select ARCH_USE_MEMTEST
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USES_CFI_TRAPS if CFI_CLANG
-- 
2.42.0



* [PATCH v2 2/2] riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}
  2023-12-02 14:03 ` Jisheng Zhang
@ 2023-12-02 14:03   ` Jisheng Zhang
From: Jisheng Zhang @ 2023-12-02 14:03 UTC
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou; +Cc: linux-riscv, linux-kernel

After selecting ARCH_USE_CMPXCHG_LOCKREF, one straightforward further
optimization is to implement arch_cmpxchg64_relaxed(), because the
lockref code does not need the cmpxchg to have barrier semantics. At
the same time, implement arch_cmpxchg64_acquire() and
arch_cmpxchg64_release() as well.

However, on both the TH1520 and JH7110 platforms, I didn't see an
obvious performance improvement with Linus' test case [1]. IMHO, this
may be related to the fence and lr.d/sc.d hardware implementations. In
theory, lr/sc without a fence could perform better than lr/sc plus a
fence, so add the code here to leave room for performance improvement
on newer hardware platforms.
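
To illustrate the three ordering variants, here is a minimal userspace
sketch using C11 atomics. It is not the kernel implementation: the
demo_cmpxchg64_* helpers are hypothetical names, and how each memory
order maps to lr.d/sc.d and fence instructions is left to the compiler
here rather than to the riscv asm in cmpxchg.h. Like the kernel's
cmpxchg family, each helper returns the value found in memory, and a
compile-time size check plays the role of the BUILD_BUG_ON() in the
patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* No ordering guarantees beyond the atomicity of the exchange itself. */
static inline uint64_t demo_cmpxchg64_relaxed(_Atomic uint64_t *ptr,
					      uint64_t old, uint64_t new_val)
{
	_Static_assert(sizeof(*ptr) == 8, "64-bit operand expected");
	atomic_compare_exchange_strong_explicit(ptr, &old, new_val,
						memory_order_relaxed,
						memory_order_relaxed);
	/* Unchanged on success, updated to the current value on failure. */
	return old;
}

/* Later accesses cannot be reordered before a successful exchange. */
static inline uint64_t demo_cmpxchg64_acquire(_Atomic uint64_t *ptr,
					      uint64_t old, uint64_t new_val)
{
	_Static_assert(sizeof(*ptr) == 8, "64-bit operand expected");
	atomic_compare_exchange_strong_explicit(ptr, &old, new_val,
						memory_order_acquire,
						memory_order_relaxed);
	return old;
}

/* Earlier accesses cannot be reordered after a successful exchange. */
static inline uint64_t demo_cmpxchg64_release(_Atomic uint64_t *ptr,
					      uint64_t old, uint64_t new_val)
{
	_Static_assert(sizeof(*ptr) == 8, "64-bit operand expected");
	atomic_compare_exchange_strong_explicit(ptr, &old, new_val,
						memory_order_release,
						memory_order_relaxed);
	return old;
}
```

A caller checks for success by comparing the return value with the
expected old value, for example
`demo_cmpxchg64_relaxed(&v, 5, 7) == 5`.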

Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 arch/riscv/include/asm/cmpxchg.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 2f4726d3cfcc..6318187f426f 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -360,4 +360,22 @@
 	arch_cmpxchg_relaxed((ptr), (o), (n));				\
 })
 
+#define arch_cmpxchg64_relaxed(ptr, o, n)				\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	arch_cmpxchg_relaxed((ptr), (o), (n));				\
+})
+
+#define arch_cmpxchg64_acquire(ptr, o, n)				\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	arch_cmpxchg_acquire((ptr), (o), (n));				\
+})
+
+#define arch_cmpxchg64_release(ptr, o, n)				\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	arch_cmpxchg_release((ptr), (o), (n));				\
+})
+
 #endif /* _ASM_RISCV_CMPXCHG_H */
-- 
2.42.0



* Re: [PATCH v2 0/2] riscv: enable lockless lockref implementation
  2023-12-02 14:03 ` Jisheng Zhang
@ 2024-01-15  9:37   ` Jisheng Zhang
From: Jisheng Zhang @ 2024-01-15  9:37 UTC
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou; +Cc: linux-riscv, linux-kernel

On Sat, Dec 02, 2023 at 10:03:21PM +0800, Jisheng Zhang wrote:
> This series selects ARCH_USE_CMPXCHG_LOCKREF to enable the
> cmpxchg-based lockless lockref implementation for riscv. Then,
> implement arch_cmpxchg64_{relaxed|acquire|release}.
> 
> After patch1:
> Using Linus' test case[1] on TH1520 platform, I see a 11.2% improvement.
> On JH7110 platform, I see 12.0% improvement.
> 
> After patch2:
> on both TH1520 and JH7110 platforms, I didn't see obvious
> performance improvement with Linus' test case [1]. IMHO, this may
> be related with the fence and lr.d/sc.d hw implementations. In theory,
> lr/sc without fence could give performance improvement over lr/sc plus
> fence, so add the code here to leave performance improvement room on
> newer HW platforms.
> 
> Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]

Hi Palmer,

This series also seems to have been missed. Let me know if anything
needs to be done.

Thanks
> 
> Since v1:
>   - only select ARCH_USE_CMPXCHG_LOCKREF if 64BIT
> 
> Jisheng Zhang (2):
>   riscv: select ARCH_USE_CMPXCHG_LOCKREF
>   riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}
> 
>  arch/riscv/Kconfig               |  1 +
>  arch/riscv/include/asm/cmpxchg.h | 18 ++++++++++++++++++
>  2 files changed, 19 insertions(+)
> 
> -- 
> 2.42.0
> 
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [PATCH v2 0/2] riscv: enable lockless lockref implementation
  2023-12-02 14:03 ` Jisheng Zhang
@ 2024-01-15 15:39   ` Andrea Parri
From: Andrea Parri @ 2024-01-15 15:39 UTC
  To: Jisheng Zhang
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, linux-kernel

On Sat, Dec 02, 2023 at 10:03:21PM +0800, Jisheng Zhang wrote:
> This series selects ARCH_USE_CMPXCHG_LOCKREF to enable the
> cmpxchg-based lockless lockref implementation for riscv. Then,
> implement arch_cmpxchg64_{relaxed|acquire|release}.
> 
> After patch1:
> Using Linus' test case[1] on TH1520 platform, I see a 11.2% improvement.
> On JH7110 platform, I see 12.0% improvement.
> 
> After patch2:
> on both TH1520 and JH7110 platforms, I didn't see obvious
> performance improvement with Linus' test case [1]. IMHO, this may
> be related with the fence and lr.d/sc.d hw implementations. In theory,
> lr/sc without fence could give performance improvement over lr/sc plus
> fence, so add the code here to leave performance improvement room on
> newer HW platforms.
> 
> Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
> 
> Since v1:
>   - only select ARCH_USE_CMPXCHG_LOCKREF if 64BIT
> 
> Jisheng Zhang (2):
>   riscv: select ARCH_USE_CMPXCHG_LOCKREF
>   riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}

For the series,

Reviewed-by: Andrea Parri <parri.andrea@gmail.com>  # code audit, QEMU tests

  Andrea

Thread overview:
2023-12-02 14:03 [PATCH v2 0/2] riscv: enable lockless lockref implementation Jisheng Zhang
2023-12-02 14:03 ` [PATCH v2 1/2] riscv: select ARCH_USE_CMPXCHG_LOCKREF Jisheng Zhang
2023-12-02 14:03 ` [PATCH v2 2/2] riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release} Jisheng Zhang
2024-01-15  9:37 ` [PATCH v2 0/2] riscv: enable lockless lockref implementation Jisheng Zhang
2024-01-15 15:39 ` Andrea Parri