linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Re: [RFC] arm: Add for atomic half word exchange
@ 2015-05-20  5:09 Sarbojit Ganguly
  2015-05-20  6:51 ` Arnd Bergmann
  2015-05-20  6:57 ` Peter Zijlstra
  0 siblings, 2 replies; 7+ messages in thread
From: Sarbojit Ganguly @ 2015-05-20  5:09 UTC (permalink / raw)
  To: Peter Zijlstra, Sarbojit Ganguly
  Cc: Arnd Bergmann, linux-arm-kernel, tglx, mingo, hpa, Waiman.Long,
	raghavendra.kt, oleg, linux-kernel, SHARAN ALLUR, torvalds,
	VIKRAM MUPPARTHI, SUNEEL KUMAR SURIMANI

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=windows-1252, Size: 1779 bytes --]

Yes, the main advantage of Qspinlock code can be observed in NUMA but when I tested in an embedded system, a slight advantage was observed.


------- Original Message -------
Sender : Peter Zijlstra<peterz@infradead.org>
Date : May 19, 2015 21:43 (GMT+09:00)
Title : Re: [RFC] arm: Add for atomic half word exchange

On Tue, May 19, 2015 at 11:20:13AM +0000, Sarbojit Ganguly wrote:
> On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> > Since 16 bit half word exchange was not there and MCS based
> > qspinlock by Waiman's xchg_tail() requires an atomic exchange on a
> > half word, here is a small modification to __xchg() code.

Can you actually see a performance improvement with the qspinlock code
on ARM ?

The real improvements on x86 were on NUMA systems; although there were
real improvements on light loads as well.


Note that ARM (or any load-store arch) could get rid of all the cmpxchg
loops in that code. Although I suppose we replaced the most common ones
with these unconditional atomics already -- like that xchg16 -- so
implementing those with ll/sc, as you did, should be near optimal.




?????
???   ??   ?? ??
----------------------------------------------------------------------+
The Tao lies beyond Yin and Yang. It is silent and still as a pool of water.      |
It does not seek fame, therefore nobody knows its presence.                       |
It does not seek fortune, for it is complete within itself.                       |
It exists beyond space and time.                                                  |
----------------------------------------------------------------------+ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] arm: Add for atomic half word exchange
  2015-05-20  5:09 Re: [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
@ 2015-05-20  6:51 ` Arnd Bergmann
  2015-05-20  6:57 ` Peter Zijlstra
  1 sibling, 0 replies; 7+ messages in thread
From: Arnd Bergmann @ 2015-05-20  6:51 UTC (permalink / raw)
  To: ganguly.s
  Cc: Peter Zijlstra, linux-arm-kernel, tglx, mingo, hpa, Waiman.Long,
	raghavendra.kt, oleg, linux-kernel, SHARAN ALLUR, torvalds,
	VIKRAM MUPPARTHI, SUNEEL KUMAR SURIMANI

On Wednesday 20 May 2015 05:09:35 Sarbojit Ganguly wrote:

> > ------- Original Message -------
> > Sender : Peter Zijlstra<peterz@infradead.org>
> > Date : May 19, 2015 21:43 (GMT+09:00)
> > Title : Re: [RFC] arm: Add for atomic half word exchange
> > 
> > On Tue, May 19, 2015 at 11:20:13AM +0000, Sarbojit Ganguly wrote:
> > > On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> > > > Since 16 bit half word exchange was not there and MCS based
> > > > qspinlock by Waiman's xchg_tail() requires an atomic exchange on a
> > > > half word, here is a small modification to __xchg() code.
> > 
> > Can you actually see a performance improvement with the qspinlock code
> > on ARM ?
> > 
> > The real improvements on x86 were on NUMA systems; although there were
> > real improvements on light loads as well.
> > 
> > 
> > Note that ARM (or any load-store arch) could get rid of all the cmpxchg
> > loops in that code. Although I suppose we replaced the most common ones
> > with these unconditional atomics already -- like that xchg16 -- so
> > implementing those with ll/sc, as you did, should be near optimal.
>
> Yes, the main advantage of Qspinlock code can be observed in NUMA but
> when I tested in an embedded system, a slight advantage was observed.

Is this a multi-cluster SMP system? Those can behave like NUMA
machines in some ways.

We could easily limit the use of 16-bit xchg() to ARMv7 machines
by using

	select ARCH_USE_QUEUED_SPINLOCKS if !SMP_ON_UP

or

	select ARCH_USE_QUEUED_SPINLOCKS if !CPU_V6

when enabling the qspinlock implementation.

	Arnd

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [RFC] arm: Add for atomic half word exchange
  2015-05-20  5:09 Re: [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
  2015-05-20  6:51 ` Arnd Bergmann
@ 2015-05-20  6:57 ` Peter Zijlstra
  1 sibling, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2015-05-20  6:57 UTC (permalink / raw)
  To: Sarbojit Ganguly
  Cc: Arnd Bergmann, linux-arm-kernel, tglx, mingo, hpa, Waiman.Long,
	raghavendra.kt, oleg, linux-kernel, SHARAN ALLUR, torvalds,
	VIKRAM MUPPARTHI, SUNEEL KUMAR SURIMANI

On Wed, May 20, 2015 at 05:09:34AM +0000, Sarbojit Ganguly wrote:
> Yes, the main advantage of Qspinlock code can be observed in NUMA but
> when I tested in an embedded system, a slight advantage was observed.

OK, great!

This is the first !x86 port I'm aware of and a load-store arch at that,
so its good to hear our efforts at making it generic actually paid off.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [RFC] arm: Add for atomic half word exchange
@ 2015-07-03 14:35 Sarbojit Ganguly
  0 siblings, 0 replies; 7+ messages in thread
From: Sarbojit Ganguly @ 2015-07-03 14:35 UTC (permalink / raw)
  To: Sarbojit Ganguly, Arnd Bergmann
  Cc: Raghavendra K T, linux-arm-kernel, SUNEEL KUMAR SURIMANI,
	VIKRAM MUPPARTHI, tglx, mingo, hpa, peterz, Waiman.Long, oleg,
	linux-kernel, SHARAN ALLUR, torvalds

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=utf-8, Size: 2241 bytes --]

Sorry about that. I dug a bit deeper and found that the code was not getting executed and rectified that. The compilation indeed fails for the lack of ldrexh instruction support on ARMv6. 

Hence, my patch needs to be guarded with !CONFIG_CPU_V6 so as to allow the code run on  >=6k and above.

I will post the v2 soon.

------- Original Message -------
Sender : Arnd Bergmann<arnd@arndb.de>
Date : Jun 05, 2015 18:03 (GMT+05:30)
Title : Re: [RFC] arm: Add for atomic half word exchange

On Friday 05 June 2015 01:17:13 Sarbojit Ganguly wrote:
> Since the compilation is also a success for CONFIG_CPU_V6 with the patch,
> I think we're good to go.

I'm not following your logic. Did you create a new patch that addresses
compilation with CONFIG_CPU_V6?

The original patch should cause a compile error on v6, if it does not,
you either have a broken assembler or you forgot to enable the code
that uses these instructions. Neither of them would be a reason to
merge the patch. Please explain what is going on.

Arnd


감사합니다
사보짓   선임   삼성 전자
----------------------------------------------------------------------+
The Tao lies beyond Yin and Yang. It is silent and still as a pool of water.      |
It does not seek fame, therefore nobody knows its presence.                       |
It does not seek fortune, for it is complete within itself.                       |
It exists beyond space and time.                                                  |
----------------------------------------------------------------------+


감사합니다
사보짓   선임   삼성 전자
----------------------------------------------------------------------+
The Tao lies beyond Yin and Yang. It is silent and still as a pool of water.      |
It does not seek fame, therefore nobody knows its presence.                       |
It does not seek fortune, for it is complete within itself.                       |
It exists beyond space and time.                                                  |
----------------------------------------------------------------------+ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [RFC] arm: Add for atomic half word exchange
@ 2015-06-02 11:11 Sarbojit Ganguly
  0 siblings, 0 replies; 7+ messages in thread
From: Sarbojit Ganguly @ 2015-06-02 11:11 UTC (permalink / raw)
  To: Arnd Bergmann, Sarbojit Ganguly
  Cc: Raghavendra K T, linux-arm-kernel, SUNEEL KUMAR SURIMANI,
	VIKRAM MUPPARTHI, tglx, mingo, hpa, peterz, Waiman.Long, oleg,
	linux-kernel, SHARAN ALLUR, torvalds

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=windows-1252, Size: 1945 bytes --]

Yes,  I have enabled :

#
# Processor Type
#
CONFIG_CPU_V6=y
CONFIG_CPU_V6K=y
CONFIG_CPU_32v6=y
CONFIG_CPU_32v6K=y

and

# TI OMAP/AM/DM/DRA Family
#
CONFIG_ARCH_OMAP2=y
CONFIG_ARCH_OMAP2PLUS=y

and

#
# MX31 platforms:
#
CONFIG_MACH_MX31ADS=y
CONFIG_MACH_MX31LILLY=y
CONFIG_MACH_MX31LITE=y
CONFIG_MACH_PCM037=y
CONFIG_MACH_PCM037_EET=y
CONFIG_MACH_MX31_3DS=y
CONFIG_MACH_MX31MOBOARD=y
CONFIG_MACH_QONG=y
CONFIG_MACH_ARMADILLO5X0=y
CONFIG_MACH_KZM_ARM11_01=y
CONFIG_MACH_BUG=y
CONFIG_MACH_IMX31_DT=y

The compilation is successful.


Regards,
Sarbojit





------- Original Message -------
Sender : Arnd Bergmann<arnd@arndb.de>
Date : Jun 02, 2015 19:49 (GMT+09:00)
Title : Re: [RFC] arm: Add for atomic half word exchange

On Tuesday 02 June 2015 06:21:43 Sarbojit Ganguly wrote:
> Hello Raghavendra,
> 
> That is exactly I had done albeit from menuconfig. Basically the whole point was to make sure my patch compiles against ARM11 architectures as well. Hence I ensured the .config contains the relevant flags on. 
> 
> 

Most ARM11 implementations are ARMv6k, which has support for the instruction,
but the older ARMv6 (without k) does not. From your description, I assume
you only tested with ARMv6k, not ARMv6. Please enable the OMAP2 and IMX31
platforms to test ARMv6.

Arnd


?????
???   ??   ?? ??
----------------------------------------------------------------------+
The Tao lies beyond Yin and Yang. It is silent and still as a pool of water.      |
It does not seek fame, therefore nobody knows its presence.                       |
It does not seek fortune, for it is complete within itself.                       |
It exists beyond space and time.                                                  |
----------------------------------------------------------------------+ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [RFC] arm: Add for atomic half word exchange
@ 2015-06-02  6:21 Sarbojit Ganguly
  0 siblings, 0 replies; 7+ messages in thread
From: Sarbojit Ganguly @ 2015-06-02  6:21 UTC (permalink / raw)
  To: Raghavendra K T, Sarbojit Ganguly
  Cc: Arnd Bergmann, linux-arm-kernel, SUNEEL KUMAR SURIMANI,
	VIKRAM MUPPARTHI, tglx, mingo, hpa, peterz, Waiman.Long, oleg,
	linux-kernel, SHARAN ALLUR, torvalds

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=windows-1252, Size: 3491 bytes --]

Hello Raghavendra,

That is exactly I had done albeit from menuconfig. Basically the whole point was to make sure my patch compiles against ARM11 architectures as well. Hence I ensured the .config contains the relevant flags on. 

Regards,
Sarbojit

------- Original Message -------
Sender : Raghavendra K T<raghavendra.kt@linux.vnet.ibm.com>
Date : Jun 02, 2015 15:11 (GMT+09:00)
Title : Re: [RFC] arm: Add for atomic half word exchange

On 06/02/2015 11:19 AM, Sarbojit Ganguly wrote:
> I made the CONFIG_ARCH_MULTI_V6=y and
> CONFIG_CPU_V6K=y
> CONFIG_CPU_32v6=y
> CONFIG_CPU_32v6K=y
>
> and compiled 4.0.4 with the patch. Result is a compilation success.
>
> Regards,
> Sarbojit
>

Hi Sarbojit,

I am not familiar about the implication of setting those options
unconditionally, But from Kconfig point of view
Arnd was expecting something like below IIUC
with the patch: (a quick example)

---8<---
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9f1f09a..194fc13 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -812,6 +812,8 @@ config ARCH_MULTI_V6
         bool "ARMv6 based platforms (ARM11)"
         select ARCH_MULTI_V6_V7
         select CPU_V6K
+       select CPU_32v6
+       select CPU_32v6k

  config ARCH_MULTI_V7
         bool "ARMv7 based platforms (Cortex-A, PJ4, Scorpion, Krait)"



> ------- Original Message -------
> Sender : Arnd Bergmann
> Date : May 19, 2015 18:51 (GMT+09:00)
> Title : Re: [RFC] arm: Add for atomic half word exchange
>
> On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
>> Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
>> here is a small modification to __xchg() code.
>
> We have discussed a similar patch before, see
> https://lkml.org/lkml/2015/2/25/390
>
>>   #if __LINUX_ARM_ARCH__ >= 6
>> @@ -50,6 +52,23 @@
>>                          : "r" (x), "r" (ptr)
>>                          : "memory", "cc");
>>                  break;
>> +               /*
>> +                * halfword exclusive exchange
>> +                * This is new implementation as qspinlock
>> +                * wants 16 bit atomic CAS.
>> +                */
>> +       case 2:
>> +               asm volatile("@ __xchg2\n"
>> +               "1:     ldrexh  %0, [%3]\n"
>> +               "       strexh  %1, %2, [%3]\n"
>> +               "       teq     %1, #0\n"
>> +               "       bne     1b"
>> +                       : "=&r" (ret), "=&r" (tmp)
>> +                       : "r" (x), "r" (ptr)
>> +                       : "memory", "cc");
>> +               break;
>>          case 4:
>>                  asm volatile("@ __xchg4\n"
>>                  "1:     ldrex   %0, [%3]\n"
>
> Please try to find a way to make this compile when CONFIG_CPU_V6
> is set.
>
> Arnd
>



?????
???   ??   ?? ??
----------------------------------------------------------------------+
The Tao lies beyond Yin and Yang. It is silent and still as a pool of water.      |
It does not seek fame, therefore nobody knows its presence.                       |
It does not seek fortune, for it is complete within itself.                       |
It exists beyond space and time.                                                  |
----------------------------------------------------------------------+ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Re: [RFC] arm: Add for atomic half word exchange
@ 2015-06-02  5:49 Sarbojit Ganguly
  0 siblings, 0 replies; 7+ messages in thread
From: Sarbojit Ganguly @ 2015-06-02  5:49 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel, Sarbojit Ganguly,
	SUNEEL KUMAR SURIMANI, vikram.m
  Cc: tglx, mingo, hpa, peterz, Waiman.Long, raghavendra.kt, oleg,
	linux-kernel, SHARAN ALLUR, torvalds

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=windows-1252, Size: 1808 bytes --]

I made the CONFIG_ARCH_MULTI_V6=y and 
CONFIG_CPU_V6K=y
CONFIG_CPU_32v6=y
CONFIG_CPU_32v6K=y

and compiled 4.0.4 with the patch. Result is a compilation success. 

Regards,
Sarbojit

------- Original Message -------
Sender : Arnd Bergmann<arnd@arndb.de>
Date : May 19, 2015 18:51 (GMT+09:00)
Title : Re: [RFC] arm: Add for atomic half word exchange

On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> Since 16 bit half word exchange was not there and MCS based qspinlock by Waiman's xchg_tail() requires an atomic exchange on a half word,
> here is a small modification to __xchg() code.

We have discussed a similar patch before, see
https://lkml.org/lkml/2015/2/25/390

>  #if __LINUX_ARM_ARCH__ >= 6
> @@ -50,6 +52,23 @@
>                         : "r" (x), "r" (ptr)
>                         : "memory", "cc");
>                 break;
> +               /* 
> +                * halfword exclusive exchange
> +                * This is new implementation as qspinlock
> +                * wants 16 bit atomic CAS.
> +                */
> +       case 2:
> +               asm volatile("@ __xchg2\n"
> +               "1:     ldrexh  %0, [%3]\n"
> +               "       strexh  %1, %2, [%3]\n"
> +               "       teq     %1, #0\n"
> +               "       bne     1b"
> +                       : "=&r" (ret), "=&r" (tmp)
> +                       : "r" (x), "r" (ptr)
> +                       : "memory", "cc");
> +               break;
>         case 4: 
>                 asm volatile("@ __xchg4\n"
>                 "1:     ldrex   %0, [%3]\n"

Please try to find a way to make this compile when CONFIG_CPU_V6
is set.

Arndÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-07-03 14:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-20  5:09 Re: [RFC] arm: Add for atomic half word exchange Sarbojit Ganguly
2015-05-20  6:51 ` Arnd Bergmann
2015-05-20  6:57 ` Peter Zijlstra
2015-06-02  5:49 Sarbojit Ganguly
2015-06-02  6:21 Sarbojit Ganguly
2015-06-02 11:11 Sarbojit Ganguly
2015-07-03 14:35 Sarbojit Ganguly

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).