linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] arm: Add for atomic half word exchange
@ 2015-05-19 11:20 Sarbojit Ganguly
  2015-05-19 11:42 ` Arnd Bergmann
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Sarbojit Ganguly @ 2015-05-19 11:20 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: tglx, mingo, hpa, peterz, Waiman.Long, raghavendra.kt, oleg,
	linux-kernel, SHARAN ALLUR, torvalds, vikram.m

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=windows-1252, Size: 1822 bytes --]

Yes, I will try to do that. OTOH, I saw that  there was a discussion on removal of bad_xchg() altogether. Perhaps that approach be better than adding this half word exchange?


Regards,
Sarbojit

------- Original Message -------
Sender : Arnd Bergmann<arnd@arndb.de>
Date : May 19, 2015 18:51 (GMT+09:00)
Title : Re: [RFC] arm: Add for atomic half word exchange

On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
> here is a small modification to __xchg() code.

We have discussed a similar patch before, see
https://lkml.org/lkml/2015/2/25/390

>  #if __LINUX_ARM_ARCH__ >= 6
> @@ -50,6 +52,23 @@
>                         : "r" (x), "r" (ptr)
>                         : "memory", "cc");
>                 break;
> +               /* 
> +                * halfword exclusive exchange
> +                * This is new implementation as qspinlock
> +                * wants 16 bit atomic CAS.
> +                */
> +       case 2:
> +               asm volatile("@ __xchg2\n"
> +               "1:     ldrexh  %0, [%3]\n"
> +               "       strexh  %1, %2, [%3]\n"
> +               "       teq     %1, #0\n"
> +               "       bne     1b"
> +                       : "=&r" (ret), "=&r" (tmp)
> +                       : "r" (x), "r" (ptr)
> +                       : "memory", "cc");
> +               break;
>         case 4: 
>                 asm volatile("@ __xchg4\n"
>                 "1:     ldrex   %0, [%3]\n"

Please try to find a way to make this compile when CONFIG_CPU_V6
is set.

Arnd
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-05-19 11:20 [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
@ 2015-05-19 11:42 ` Arnd Bergmann
  2015-05-19 12:13 ` Russell King - ARM Linux
  2015-05-19 12:43 ` Peter Zijlstra
  2 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-05-19 11:42 UTC (permalink / raw)
  To: linux-arm-kernel, ganguly.s
  Cc: Waiman.Long, peterz, vikram.m, raghavendra.kt, oleg,
	linux-kernel, mingo, SHARAN ALLUR, hpa, tglx, torvalds

On Tuesday 19 May 2015 11:20:11 Sarbojit Ganguly wrote:
> Yes, I will try to do that. OTOH, I saw that  there was a discussion on
> removal of bad_xchg() altogether. Perhaps that approach be better than
> adding this half word exchange?

Removing bad_xchg() only helps insofar as we get a link-time error
instead of a run-time error. That is certainly a good idea, but it
does not solve your problem.

	Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-05-19 11:20 [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
  2015-05-19 11:42 ` Arnd Bergmann
@ 2015-05-19 12:13 ` Russell King - ARM Linux
  2015-05-19 12:43 ` Peter Zijlstra
  2 siblings, 0 replies; 10+ messages in thread
From: Russell King - ARM Linux @ 2015-05-19 12:13 UTC (permalink / raw)
  To: Sarbojit Ganguly
  Cc: Arnd Bergmann, linux-arm-kernel, Waiman.Long, peterz, vikram.m,
	raghavendra.kt, oleg, linux-kernel, mingo, SHARAN ALLUR, hpa,
	tglx, torvalds

On Tue, May 19, 2015 at 11:20:11AM +0000, Sarbojit Ganguly wrote:
> Yes, I will try to do that. OTOH, I saw that  there was a discussion on
> removal of bad_xchg() altogether. Perhaps that approach be better than
> adding this half word exchange?

The only possibility for removal of __bad_xchg() is to remove it's
_definition_ only, not its callsite, so that we get a _link_ time
error for use cases we don't support.

Removing its callsite leaves us open to code malfunction: xchg()
effectively becomes a no-op for sizes which are not supported, and
that's a _very_ bad thing to happen.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-05-19 11:20 [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
  2015-05-19 11:42 ` Arnd Bergmann
  2015-05-19 12:13 ` Russell King - ARM Linux
@ 2015-05-19 12:43 ` Peter Zijlstra
  2 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2015-05-19 12:43 UTC (permalink / raw)
  To: Sarbojit Ganguly
  Cc: Arnd Bergmann, linux-arm-kernel, tglx, mingo, hpa, Waiman.Long,
	raghavendra.kt, oleg, linux-kernel, SHARAN ALLUR, torvalds,
	vikram.m

On Tue, May 19, 2015 at 11:20:13AM +0000, Sarbojit Ganguly wrote:
> On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> > Since 16 bit half word exchange was not there and MCS based
> > qspinlock by Waiman's xchg_tail() requires an atomic exchange on a
> > half word, here is a small modification to __xchg() code.

Can you actually see a performance improvement with the qspinlock code
on ARM ?

The real improvements on x86 were on NUMA systems; although there were
real improvements on light loads as well.


Note that ARM (or any load-store arch) could get rid of all the cmpxchg
loops in that code. Although I suppose we replaced the most common ones
with these unconditional atomics already -- like that xchg16 -- so
implementing those with ll/sc, as you did, should be near optimal.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-06-05  1:17 Re: " Sarbojit Ganguly
@ 2015-06-05 12:33 ` Arnd Bergmann
  0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-06-05 12:33 UTC (permalink / raw)
  To: ganguly.s
  Cc: Raghavendra K T, linux-arm-kernel, SUNEEL KUMAR SURIMANI,
	VIKRAM MUPPARTHI, tglx, mingo, hpa, peterz, Waiman.Long, oleg,
	linux-kernel, SHARAN ALLUR, torvalds

On Friday 05 June 2015 01:17:13 Sarbojit Ganguly wrote:
> Since the compilation is also a success for CONFIG_CPU_V6 with the patch,
> I think we're good to go.

I'm not following your logic. Did you create a new patch that addresses
compilation with CONFIG_CPU_V6?

The original patch should cause a compile error on v6, if it does not,
you either have a broken assembler or you forgot to enable the code
that uses these instructions. Neither of them would be a reason to
merge the patch. Please explain what is going on.

	Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-06-02  6:21 Sarbojit Ganguly
@ 2015-06-02 10:49 ` Arnd Bergmann
  0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-06-02 10:49 UTC (permalink / raw)
  To: ganguly.s
  Cc: Raghavendra K T, linux-arm-kernel, SUNEEL KUMAR SURIMANI,
	VIKRAM MUPPARTHI, tglx, mingo, hpa, peterz, Waiman.Long, oleg,
	linux-kernel, SHARAN ALLUR, torvalds

On Tuesday 02 June 2015 06:21:43 Sarbojit Ganguly wrote:
> Hello Raghavendra,
> 
> That is exactly I had done albeit from menuconfig. Basically the whole point was to make sure my patch compiles against ARM11 architectures as well. Hence I ensured the .config contains the relevant flags on. 
> 
> 

Most ARM11 implementations are ARMv6k, which has support for the instruction,
but the older ARMv6 (without k) does not. From your description, I assume
you only tested with ARMv6k, not ARMv6. Please enable the OMAP2 and IMX31
platforms to test ARMv6.

	Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-06-02  5:49 Sarbojit Ganguly
@ 2015-06-02  6:11 ` Raghavendra K T
  0 siblings, 0 replies; 10+ messages in thread
From: Raghavendra K T @ 2015-06-02  6:11 UTC (permalink / raw)
  To: ganguly.s
  Cc: Arnd Bergmann, linux-arm-kernel, SUNEEL KUMAR SURIMANI, vikram.m,
	tglx, mingo, hpa, peterz, Waiman.Long, oleg, linux-kernel,
	SHARAN ALLUR, torvalds

On 06/02/2015 11:19 AM, Sarbojit Ganguly wrote:
> I made the CONFIG_ARCH_MULTI_V6=y and
> CONFIG_CPU_V6K=y
> CONFIG_CPU_32v6=y
> CONFIG_CPU_32v6K=y
>
> and compiled 4.0.4 with the patch. Result is a compilation success.
>
> Regards,
> Sarbojit
>

Hi Sarbojit,

I am not familiar about the implication of setting those options
unconditionally, But from Kconfig point of view
Arnd was expecting something like below IIUC
with the patch: (a quick example)

---8<---
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9f1f09a..194fc13 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -812,6 +812,8 @@ config ARCH_MULTI_V6
         bool "ARMv6 based platforms (ARM11)"
         select ARCH_MULTI_V6_V7
         select CPU_V6K
+       select CPU_32v6
+       select CPU_32v6k

  config ARCH_MULTI_V7
         bool "ARMv7 based platforms (Cortex-A, PJ4, Scorpion, Krait)"



> ------- Original Message -------
> Sender : Arnd Bergmann<arnd@arndb.de>
> Date : May 19, 2015 18:51 (GMT+09:00)
> Title : Re: [RFC] arm: Add for atomic half word exchange
>
> On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
>> Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
>> here is a small modification to __xchg() code.
>
> We have discussed a similar patch before, see
> https://lkml.org/lkml/2015/2/25/390
>
>>   #if __LINUX_ARM_ARCH__ >= 6
>> @@ -50,6 +52,23 @@
>>                          : "r" (x), "r" (ptr)
>>                          : "memory", "cc");
>>                  break;
>> +               /*
>> +                * halfword exclusive exchange
>> +                * This is new implementation as qspinlock
>> +                * wants 16 bit atomic CAS.
>> +                */
>> +       case 2:
>> +               asm volatile("@ __xchg2\n"
>> +               "1:     ldrexh  %0, [%3]\n"
>> +               "       strexh  %1, %2, [%3]\n"
>> +               "       teq     %1, #0\n"
>> +               "       bne     1b"
>> +                       : "=&r" (ret), "=&r" (tmp)
>> +                       : "r" (x), "r" (ptr)
>> +                       : "memory", "cc");
>> +               break;
>>          case 4:
>>                  asm volatile("@ __xchg4\n"
>>                  "1:     ldrex   %0, [%3]\n"
>
> Please try to find a way to make this compile when CONFIG_CPU_V6
> is set.
>
> Arnd
>


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-05-20  5:09 Sarbojit Ganguly
@ 2015-05-20  6:51 ` Arnd Bergmann
  0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-05-20  6:51 UTC (permalink / raw)
  To: ganguly.s
  Cc: Peter Zijlstra, linux-arm-kernel, tglx, mingo, hpa, Waiman.Long,
	raghavendra.kt, oleg, linux-kernel, SHARAN ALLUR, torvalds,
	VIKRAM MUPPARTHI, SUNEEL KUMAR SURIMANI

On Wednesday 20 May 2015 05:09:35 Sarbojit Ganguly wrote:

> > ------- Original Message -------
> > Sender : Peter Zijlstra<peterz@infradead.org>
> > Date : May 19, 2015 21:43 (GMT+09:00)
> > Title : Re: [RFC] arm: Add for atomic half word exchange
> > 
> > On Tue, May 19, 2015 at 11:20:13AM +0000, Sarbojit Ganguly wrote:
> > > On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> > > > Since 16 bit half word exchange was not there and MCS based
> > > > qspinlock by Waiman's xchg_tail() requires an atomic exchange on a
> > > > half word, here is a small modification to __xchg() code.
> > 
> > Can you actually see a performance improvement with the qspinlock code
> > on ARM ?
> > 
> > The real improvements on x86 were on NUMA systems; although there were
> > real improvements on light loads as well.
> > 
> > 
> > Note that ARM (or any load-store arch) could get rid of all the cmpxchg
> > loops in that code. Although I suppose we replaced the most common ones
> > with these unconditional atomics already -- like that xchg16 -- so
> > implementing those with ll/sc, as you did, should be near optimal.
>
> Yes, the main advantage of Qspinlock code can be observed in NUMA but
> when I tested in an embedded system, a slight advantage was observed.

Is this a multi-cluster SMP system? Those can behave like NUMA
machines in some ways.

We could easily limit the use of 16-bit xchg() to ARMv7 machines
by using

	select ARCH_USE_QUEUED_SPINLOCKS if !SMP_ON_UP

or

	select ARCH_USE_QUEUED_SPINLOCKS if !CPU_V6

when enabling the qspinlock implementation.

	Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-05-19  9:39 Sarbojit Ganguly
@ 2015-05-19  9:51 ` Arnd Bergmann
  0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2015-05-19  9:51 UTC (permalink / raw)
  To: linux-arm-kernel, ganguly.s
  Cc: tglx, mingo, hpa, peterz, Waiman.Long, raghavendra.kt, oleg,
	linux-kernel, SHARAN ALLUR, torvalds

On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
> here is a small modification to __xchg() code.

We have discussed a similar patch before, see
https://lkml.org/lkml/2015/2/25/390

>  #if __LINUX_ARM_ARCH__ >= 6
> @@ -50,6 +52,23 @@
>                         : "r" (x), "r" (ptr)
>                         : "memory", "cc");
>                 break;
> +               /* 
> +                * halfword exclusive exchange
> +                * This is new implementation as qspinlock
> +                * wants 16 bit atomic CAS.
> +                */
> +       case 2:
> +               asm volatile("@ __xchg2\n"
> +               "1:     ldrexh  %0, [%3]\n"
> +               "       strexh  %1, %2, [%3]\n"
> +               "       teq     %1, #0\n"
> +               "       bne     1b"
> +                       : "=&r" (ret), "=&r" (tmp)
> +                       : "r" (x), "r" (ptr)
> +                       : "memory", "cc");
> +               break;
>         case 4: 
>                 asm volatile("@ __xchg4\n"
>                 "1:     ldrex   %0, [%3]\n"

Please try to find a way to make this compile when CONFIG_CPU_V6
is set.

	Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC] arm: Add for atomic half word exchange
@ 2015-05-19  9:39 Sarbojit Ganguly
  2015-05-19  9:51 ` Arnd Bergmann
  0 siblings, 1 reply; 10+ messages in thread
From: Sarbojit Ganguly @ 2015-05-19  9:39 UTC (permalink / raw)
  To: tglx, mingo, hpa, peterz, Waiman.Long
  Cc: torvalds, raghavendra.kt, oleg, linux-kernel, linux-arm-kernel,
	SHARAN ALLUR

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=utf-8, Size: 1846 bytes --]

Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
here is a small modification to __xchg() code.

--- /linux.trees.git/tip/arch/arm/include/asm/cmpxchg.h      2015-05-11     23:36:06.942583615 +0530
+++ arch/arm/include/asm/cmpxchg.h      2015-04-08 18:40:43.276255712 +0530
@@ -2,9 +2,12 @@
 #define __ASM_ARM_CMPXCHG_H

 #include <linux/irqflags.h>
 #include <asm/barrier.h>

 #if defined(CONFIG_CPU_SA1100) || defined(CONFIG_CPU_SA110)
 /*
  * On the StrongARM, "swp" is terminally broken since it bypasses the
@@ -36,7 +39,6 @@
 #endif

        smp_mb();

        switch (size) {
 #if __LINUX_ARM_ARCH__ >= 6
@@ -50,6 +52,23 @@
                        : "r" (x), "r" (ptr)
                        : "memory", "cc");
                break;
+               /* 
+                * halfword exclusive exchange
+                * This is new implementation as qspinlock
+                * wants 16 bit atomic CAS.
+                */
+       case 2:
+               asm volatile("@ __xchg2\n"
+               "1:     ldrexh  %0, [%3]\n"
+               "       strexh  %1, %2, [%3]\n"
+               "       teq     %1, #0\n"
+               "       bne     1b"
+                       : "=&r" (ret), "=&r" (tmp)
+                       : "r" (x), "r" (ptr)
+                       : "memory", "cc");
+               break;
        case 4: 
                asm volatile("@ __xchg4\n"
                "1:     ldrex   %0, [%3]\n"
@@ -94,6 +113,10 @@ 
                break;
 #endif 
        default:
                __bad_xchg(ptr, size), ret = 0;
                break;
        }


Regards,
Sarbojitÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-06-05 12:36 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-19 11:20 [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
2015-05-19 11:42 ` Arnd Bergmann
2015-05-19 12:13 ` Russell King - ARM Linux
2015-05-19 12:43 ` Peter Zijlstra
  -- strict thread matches above, loose matches on Subject: below --
2015-06-05  1:17 Re: " Sarbojit Ganguly
2015-06-05 12:33 ` Arnd Bergmann
2015-06-02  6:21 Sarbojit Ganguly
2015-06-02 10:49 ` Arnd Bergmann
2015-06-02  5:49 Sarbojit Ganguly
2015-06-02  6:11 ` Raghavendra K T
2015-05-20  5:09 Sarbojit Ganguly
2015-05-20  6:51 ` Arnd Bergmann
2015-05-19  9:39 Sarbojit Ganguly
2015-05-19  9:51 ` Arnd Bergmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).