* [RFC] arm: Add for atomic half word exchange
@ 2015-05-19 11:20 Sarbojit Ganguly
2015-05-19 11:42 ` Arnd Bergmann
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Sarbojit Ganguly @ 2015-05-19 11:20 UTC (permalink / raw)
To: Arnd Bergmann, linux-arm-kernel
Cc: tglx, mingo, hpa, peterz, Waiman.Long, raghavendra.kt, oleg,
linux-kernel, SHARAN ALLUR, torvalds, vikram.m
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=windows-1252, Size: 1822 bytes --]
Yes, I will try to do that. OTOH, I saw that there was a discussion on removal of bad_xchg() altogether. Perhaps that approach be better than adding this half word exchange?
Regards,
Sarbojit
------- Original Message -------
Sender : Arnd Bergmann<arnd@arndb.de>
Date : May 19, 2015 18:51 (GMT+09:00)
Title : Re: [RFC] arm: Add for atomic half word exchange
On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
> here is a small modification to __xchg() code.
We have discussed a similar patch before, see
https://lkml.org/lkml/2015/2/25/390
> #if __LINUX_ARM_ARCH__ >= 6
> @@ -50,6 +52,23 @@
> : "r" (x), "r" (ptr)
> : "memory", "cc");
> break;
> + /*
> + * halfword exclusive exchange
> + * This is new implementation as qspinlock
> + * wants 16 bit atomic CAS.
> + */
> + case 2:
> + asm volatile("@ __xchg2\n"
> + "1: ldrexh %0, [%3]\n"
> + " strexh %1, %2, [%3]\n"
> + " teq %1, #0\n"
> + " bne 1b"
> + : "=&r" (ret), "=&r" (tmp)
> + : "r" (x), "r" (ptr)
> + : "memory", "cc");
> + break;
> case 4:
> asm volatile("@ __xchg4\n"
> "1: ldrex %0, [%3]\n"
Please try to find a way to make this compile when CONFIG_CPU_V6
is set.
Arnd
ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] arm: Add for atomic half word exchange
2015-05-19 11:20 [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
@ 2015-05-19 11:42 ` Arnd Bergmann
2015-05-19 12:13 ` Russell King - ARM Linux
2015-05-19 12:43 ` Peter Zijlstra
2 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-05-19 11:42 UTC (permalink / raw)
To: linux-arm-kernel, ganguly.s
Cc: Waiman.Long, peterz, vikram.m, raghavendra.kt, oleg,
linux-kernel, mingo, SHARAN ALLUR, hpa, tglx, torvalds
On Tuesday 19 May 2015 11:20:11 Sarbojit Ganguly wrote:
> Yes, I will try to do that. OTOH, I saw that there was a discussion on
> removal of bad_xchg() altogether. Perhaps that approach be better than
> adding this half word exchange?
Removing bad_xchg() only helps insofar as we get a link-time error
instead of a run-time error. That is certainly a good idea, but it
does not solve your problem.
Arnd
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] arm: Add for atomic half word exchange
2015-05-19 11:20 [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
2015-05-19 11:42 ` Arnd Bergmann
@ 2015-05-19 12:13 ` Russell King - ARM Linux
2015-05-19 12:43 ` Peter Zijlstra
2 siblings, 0 replies; 10+ messages in thread
From: Russell King - ARM Linux @ 2015-05-19 12:13 UTC (permalink / raw)
To: Sarbojit Ganguly
Cc: Arnd Bergmann, linux-arm-kernel, Waiman.Long, peterz, vikram.m,
raghavendra.kt, oleg, linux-kernel, mingo, SHARAN ALLUR, hpa,
tglx, torvalds
On Tue, May 19, 2015 at 11:20:11AM +0000, Sarbojit Ganguly wrote:
> Yes, I will try to do that. OTOH, I saw that there was a discussion on
> removal of bad_xchg() altogether. Perhaps that approach be better than
> adding this half word exchange?
The only possibility for removal of __bad_xchg() is to remove it's
_definition_ only, not its callsite, so that we get a _link_ time
error for use cases we don't support.
Removing its callsite leaves us open to code malfunction: xchg()
effectively becomes a no-op for sizes which are not supported, and
that's a _very_ bad thing to happen.
--
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] arm: Add for atomic half word exchange
2015-05-19 11:20 [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
2015-05-19 11:42 ` Arnd Bergmann
2015-05-19 12:13 ` Russell King - ARM Linux
@ 2015-05-19 12:43 ` Peter Zijlstra
2 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2015-05-19 12:43 UTC (permalink / raw)
To: Sarbojit Ganguly
Cc: Arnd Bergmann, linux-arm-kernel, tglx, mingo, hpa, Waiman.Long,
raghavendra.kt, oleg, linux-kernel, SHARAN ALLUR, torvalds,
vikram.m
On Tue, May 19, 2015 at 11:20:13AM +0000, Sarbojit Ganguly wrote:
> On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> > Since 16 bit half word exchange was not there and MCS based
> > qspinlock by Waiman's xchg_tail() requires an atomic exchange on a
> > half word, here is a small modification to __xchg() code.
Can you actually see a performance improvement with the qspinlock code
on ARM ?
The real improvements on x86 were on NUMA systems; although there were
real improvements on light loads as well.
Note that ARM (or any load-store arch) could get rid of all the cmpxchg
loops in that code. Although I suppose we replaced the most common ones
with these unconditional atomics already -- like that xchg16 -- so
implementing those with ll/sc, as you did, should be near optimal.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] arm: Add for atomic half word exchange
2015-06-05 1:17 Re: " Sarbojit Ganguly
@ 2015-06-05 12:33 ` Arnd Bergmann
0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-06-05 12:33 UTC (permalink / raw)
To: ganguly.s
Cc: Raghavendra K T, linux-arm-kernel, SUNEEL KUMAR SURIMANI,
VIKRAM MUPPARTHI, tglx, mingo, hpa, peterz, Waiman.Long, oleg,
linux-kernel, SHARAN ALLUR, torvalds
On Friday 05 June 2015 01:17:13 Sarbojit Ganguly wrote:
> Since the compilation is also a success for CONFIG_CPU_V6 with the patch,
> I think we're good to go.
I'm not following your logic. Did you create a new patch that addresses
compilation with CONFIG_CPU_V6?
The original patch should cause a compile error on v6, if it does not,
you either have a broken assembler or you forgot to enable the code
that uses these instructions. Neither of them would be a reason to
merge the patch. Please explain what is going on.
Arnd
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] arm: Add for atomic half word exchange
2015-06-02 6:21 Sarbojit Ganguly
@ 2015-06-02 10:49 ` Arnd Bergmann
0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-06-02 10:49 UTC (permalink / raw)
To: ganguly.s
Cc: Raghavendra K T, linux-arm-kernel, SUNEEL KUMAR SURIMANI,
VIKRAM MUPPARTHI, tglx, mingo, hpa, peterz, Waiman.Long, oleg,
linux-kernel, SHARAN ALLUR, torvalds
On Tuesday 02 June 2015 06:21:43 Sarbojit Ganguly wrote:
> Hello Raghavendra,
>
> That is exactly I had done albeit from menuconfig. Basically the whole point was to make sure my patch compiles against ARM11 architectures as well. Hence I ensured the .config contains the relevant flags on.
>
>
Most ARM11 implementations are ARMv6k, which has support for the instruction,
but the older ARMv6 (without k) does not. From your description, I assume
you only tested with ARMv6k, not ARMv6. Please enable the OMAP2 and IMX31
platforms to test ARMv6.
Arnd
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] arm: Add for atomic half word exchange
2015-06-02 5:49 Sarbojit Ganguly
@ 2015-06-02 6:11 ` Raghavendra K T
0 siblings, 0 replies; 10+ messages in thread
From: Raghavendra K T @ 2015-06-02 6:11 UTC (permalink / raw)
To: ganguly.s
Cc: Arnd Bergmann, linux-arm-kernel, SUNEEL KUMAR SURIMANI, vikram.m,
tglx, mingo, hpa, peterz, Waiman.Long, oleg, linux-kernel,
SHARAN ALLUR, torvalds
On 06/02/2015 11:19 AM, Sarbojit Ganguly wrote:
> I made the CONFIG_ARCH_MULTI_V6=y and
> CONFIG_CPU_V6K=y
> CONFIG_CPU_32v6=y
> CONFIG_CPU_32v6K=y
>
> and compiled 4.0.4 with the patch. Result is a compilation success.
>
> Regards,
> Sarbojit
>
Hi Sarbojit,
I am not familiar about the implication of setting those options
unconditionally, But from Kconfig point of view
Arnd was expecting something like below IIUC
with the patch: (a quick example)
---8<---
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9f1f09a..194fc13 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -812,6 +812,8 @@ config ARCH_MULTI_V6
bool "ARMv6 based platforms (ARM11)"
select ARCH_MULTI_V6_V7
select CPU_V6K
+ select CPU_32v6
+ select CPU_32v6k
config ARCH_MULTI_V7
bool "ARMv7 based platforms (Cortex-A, PJ4, Scorpion, Krait)"
> ------- Original Message -------
> Sender : Arnd Bergmann<arnd@arndb.de>
> Date : May 19, 2015 18:51 (GMT+09:00)
> Title : Re: [RFC] arm: Add for atomic half word exchange
>
> On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
>> Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
>> here is a small modification to __xchg() code.
>
> We have discussed a similar patch before, see
> https://lkml.org/lkml/2015/2/25/390
>
>> #if __LINUX_ARM_ARCH__ >= 6
>> @@ -50,6 +52,23 @@
>> : "r" (x), "r" (ptr)
>> : "memory", "cc");
>> break;
>> + /*
>> + * halfword exclusive exchange
>> + * This is new implementation as qspinlock
>> + * wants 16 bit atomic CAS.
>> + */
>> + case 2:
>> + asm volatile("@ __xchg2\n"
>> + "1: ldrexh %0, [%3]\n"
>> + " strexh %1, %2, [%3]\n"
>> + " teq %1, #0\n"
>> + " bne 1b"
>> + : "=&r" (ret), "=&r" (tmp)
>> + : "r" (x), "r" (ptr)
>> + : "memory", "cc");
>> + break;
>> case 4:
>> asm volatile("@ __xchg4\n"
>> "1: ldrex %0, [%3]\n"
>
> Please try to find a way to make this compile when CONFIG_CPU_V6
> is set.
>
> Arnd
>
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [RFC] arm: Add for atomic half word exchange
2015-05-20 5:09 Sarbojit Ganguly
@ 2015-05-20 6:51 ` Arnd Bergmann
0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-05-20 6:51 UTC (permalink / raw)
To: ganguly.s
Cc: Peter Zijlstra, linux-arm-kernel, tglx, mingo, hpa, Waiman.Long,
raghavendra.kt, oleg, linux-kernel, SHARAN ALLUR, torvalds,
VIKRAM MUPPARTHI, SUNEEL KUMAR SURIMANI
On Wednesday 20 May 2015 05:09:35 Sarbojit Ganguly wrote:
> > ------- Original Message -------
> > Sender : Peter Zijlstra<peterz@infradead.org>
> > Date : May 19, 2015 21:43 (GMT+09:00)
> > Title : Re: [RFC] arm: Add for atomic half word exchange
> >
> > On Tue, May 19, 2015 at 11:20:13AM +0000, Sarbojit Ganguly wrote:
> > > On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> > > > Since 16 bit half word exchange was not there and MCS based
> > > > qspinlock by Waiman's xchg_tail() requires an atomic exchange on a
> > > > half word, here is a small modification to __xchg() code.
> >
> > Can you actually see a performance improvement with the qspinlock code
> > on ARM ?
> >
> > The real improvements on x86 were on NUMA systems; although there were
> > real improvements on light loads as well.
> >
> >
> > Note that ARM (or any load-store arch) could get rid of all the cmpxchg
> > loops in that code. Although I suppose we replaced the most common ones
> > with these unconditional atomics already -- like that xchg16 -- so
> > implementing those with ll/sc, as you did, should be near optimal.
>
> Yes, the main advantage of Qspinlock code can be observed in NUMA but
> when I tested in an embedded system, a slight advantage was observed.
Is this a multi-cluster SMP system? Those can behave like NUMA
machines in some ways.
We could easily limit the use of 16-bit xchg() to ARMv7 machines
by using
select ARCH_USE_QUEUED_SPINLOCKS if !SMP_ON_UP
or
select ARCH_USE_QUEUED_SPINLOCKS if !CPU_V6
when enabling the qspinlock implementation.
Arnd
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] arm: Add for atomic half word exchange
2015-05-19 9:39 Sarbojit Ganguly
@ 2015-05-19 9:51 ` Arnd Bergmann
0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-05-19 9:51 UTC (permalink / raw)
To: linux-arm-kernel, ganguly.s
Cc: tglx, mingo, hpa, peterz, Waiman.Long, raghavendra.kt, oleg,
linux-kernel, SHARAN ALLUR, torvalds
On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
> here is a small modification to __xchg() code.
We have discussed a similar patch before, see
https://lkml.org/lkml/2015/2/25/390
> #if __LINUX_ARM_ARCH__ >= 6
> @@ -50,6 +52,23 @@
> : "r" (x), "r" (ptr)
> : "memory", "cc");
> break;
> + /*
> + * halfword exclusive exchange
> + * This is new implementation as qspinlock
> + * wants 16 bit atomic CAS.
> + */
> + case 2:
> + asm volatile("@ __xchg2\n"
> + "1: ldrexh %0, [%3]\n"
> + " strexh %1, %2, [%3]\n"
> + " teq %1, #0\n"
> + " bne 1b"
> + : "=&r" (ret), "=&r" (tmp)
> + : "r" (x), "r" (ptr)
> + : "memory", "cc");
> + break;
> case 4:
> asm volatile("@ __xchg4\n"
> "1: ldrex %0, [%3]\n"
Please try to find a way to make this compile when CONFIG_CPU_V6
is set.
Arnd
^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC] arm: Add for atomic half word exchange
@ 2015-05-19 9:39 Sarbojit Ganguly
2015-05-19 9:51 ` Arnd Bergmann
0 siblings, 1 reply; 10+ messages in thread
From: Sarbojit Ganguly @ 2015-05-19 9:39 UTC (permalink / raw)
To: tglx, mingo, hpa, peterz, Waiman.Long
Cc: torvalds, raghavendra.kt, oleg, linux-kernel, linux-arm-kernel,
SHARAN ALLUR
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=utf-8, Size: 1846 bytes --]
Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
here is a small modification to __xchg() code.
--- /linux.trees.git/tip/arch/arm/include/asm/cmpxchg.h 2015-05-11 23:36:06.942583615 +0530
+++ arch/arm/include/asm/cmpxchg.h 2015-04-08 18:40:43.276255712 +0530
@@ -2,9 +2,12 @@
#define __ASM_ARM_CMPXCHG_H
#include <linux/irqflags.h>
#include <asm/barrier.h>
#if defined(CONFIG_CPU_SA1100) || defined(CONFIG_CPU_SA110)
/*
* On the StrongARM, "swp" is terminally broken since it bypasses the
@@ -36,7 +39,6 @@
#endif
smp_mb();
switch (size) {
#if __LINUX_ARM_ARCH__ >= 6
@@ -50,6 +52,23 @@
: "r" (x), "r" (ptr)
: "memory", "cc");
break;
+ /*
+ * halfword exclusive exchange
+ * This is new implementation as qspinlock
+ * wants 16 bit atomic CAS.
+ */
+ case 2:
+ asm volatile("@ __xchg2\n"
+ "1: ldrexh %0, [%3]\n"
+ " strexh %1, %2, [%3]\n"
+ " teq %1, #0\n"
+ " bne 1b"
+ : "=&r" (ret), "=&r" (tmp)
+ : "r" (x), "r" (ptr)
+ : "memory", "cc");
+ break;
case 4:
asm volatile("@ __xchg4\n"
"1: ldrex %0, [%3]\n"
@@ -94,6 +113,10 @@
break;
#endif
default:
__bad_xchg(ptr, size), ret = 0;
break;
}
Regards,
Sarbojitÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2015-06-05 12:36 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-19 11:20 [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
2015-05-19 11:42 ` Arnd Bergmann
2015-05-19 12:13 ` Russell King - ARM Linux
2015-05-19 12:43 ` Peter Zijlstra
-- strict thread matches above, loose matches on Subject: below --
2015-06-05 1:17 Re: " Sarbojit Ganguly
2015-06-05 12:33 ` Arnd Bergmann
2015-06-02 6:21 Sarbojit Ganguly
2015-06-02 10:49 ` Arnd Bergmann
2015-06-02 5:49 Sarbojit Ganguly
2015-06-02 6:11 ` Raghavendra K T
2015-05-20 5:09 Sarbojit Ganguly
2015-05-20 6:51 ` Arnd Bergmann
2015-05-19 9:39 Sarbojit Ganguly
2015-05-19 9:51 ` Arnd Bergmann
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).