linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
@ 2019-10-07 21:44 Sebastian Andrzej Siewior
  2019-10-07 21:44 ` [PATCH 1/3] ARM: Use qrwlock implementation Sebastian Andrzej Siewior
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-07 21:44 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Arnd Bergmann, Peter Zijlstra, Russell King, Ingo Molnar,
	Waiman Long, Will Deacon


I added support for queued RW-locks and queued spinlocks on ARM. I wanted
to remove the current implementation, but this does not work: the CPU_V6
kernel config has no support for xchg() on a 2-byte memory location,
which the q-lock slowpath requires. It is possible to build a
multi-platform kernel (v6+v7+SMP), which would then lack that function.
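
To illustrate what a missing 2-byte xchg() means in practice, here is a minimal userspace sketch of one common workaround: emulating the 16-bit exchange with a compare-and-swap loop on the containing 32-bit word. It uses C11 atomics rather than the kernel's primitives, and the helper name xchg16_emulated() is hypothetical; this is not what the patches in this series do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: exchange one 16-bit half of a
 * naturally aligned 32-bit word using only a 32-bit compare-and-swap.
 * half == 0 selects bits 0..15, half == 1 selects bits 16..31.
 * Returns the previous value of the selected half.
 */
static uint16_t xchg16_emulated(_Atomic uint32_t *word, unsigned int half,
                                uint16_t newval)
{
    assert(half < 2);

    const unsigned int shift = half * 16;
    const uint32_t mask = (uint32_t)0xffff << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t replacement;

    do {
        /* Keep the other half, replace only the selected 16 bits. */
        replacement = (old & ~mask) | ((uint32_t)newval << shift);
        /* On failure, 'old' is refreshed with the current word value. */
    } while (!atomic_compare_exchange_weak_explicit(word, &old, replacement,
                                                    memory_order_acq_rel,
                                                    memory_order_relaxed));

    return (uint16_t)((old & mask) >> shift);
}
```

The loop costs more than a native halfword exchange and can retry under contention, which is one reason such an emulation is not an obvious drop-in for a lock slowpath.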

I tested the q-lock implementation with
	hackbench -g40 -s 500 -l 500

The numbers in the tables below are the average runtime of 10
invocations. I tested with HZ_100 and HZ_250 and the different
preemption levels on an IMX6q board (quad Cortex-A9) and an AM572x board
(dual Cortex-A15).
"Ticket" is the current implementation in v5.4-rc1, "Q-locks" is the
switch to queued RW-locks and spinlocks, and "Q-locksI" additionally
inlines the locking functions.

IMX6q
~~~~~
HZ_100   | PREEMPT_NONE | PREEMPT_VOLUNTARY | PREEMPT
Ticket   | 52.103       | 52.284            | 60.5681
Q-locks  | 54.1804      | 53.267            | 56.1914
Q-locksI | 52.2985      | 49.398            | 56.7441

HZ_250   | PREEMPT_NONE | PREEMPT_VOLUNTARY | PREEMPT
Ticket   | 54.3888      | 52.7896           | 58.4837
Q-locks  | 52.1027      | 52.2302           | 57.26
Q-locksI | 51.6185      | 51.5856           | 55.327

AM572x
~~~~~~
HZ_100   | PREEMPT_NONE | PREEMPT_VOLUNTARY | PREEMPT
Ticket   | 42.3819      | 42.4821           | 43.2671
Q-locks  | 40.9141      | 40.0269           | 42.65
Q-locksI | 40.0763      | 39.9101           | 40.7811

HZ_250   | PREEMPT_NONE | PREEMPT_VOLUNTARY | PREEMPT
Ticket   | 41.6399      | 42.9386           | 44.5865
Q-locks  | 41.4476      | 43.0836           | 43.1937
Q-locksI | 39.6897      | 41.1746           | 43.1962

Based on these numbers, the Q-lock based implementation performs a
little better than the current ticket spinlock implementation. On IMX6q
this additionally requires inlining the locks, while inlining makes
hardly any difference on AM572x.
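
For readers who want to put a figure on "a little better", the relative change between two table cells can be computed as in the sketch below. The helper name percent_change() is made up for illustration; the inputs in the usage note are taken from the IMX6q HZ_100 table above.

```c
#include <assert.h>

/*
 * Hypothetical helper: relative change of 'value' against 'baseline',
 * in percent. Negative results mean a shorter (better) runtime.
 */
static double percent_change(double baseline, double value)
{
    assert(baseline > 0.0);
    return (value - baseline) / baseline * 100.0;
}
```

For example, percent_change(52.284, 49.398) compares Ticket against Q-locksI for IMX6q, HZ_100, PREEMPT_VOLUNTARY and comes out at roughly -5.5 percent, while percent_change(52.103, 54.1804) for plain Q-locks under PREEMPT_NONE is roughly +4 percent, i.e. slower without inlining.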

Here are the `size' numbers for the different vmlinux binaries:

   text	   data	    bss	    dec	 dec KiB  variant
8096124	2604932	 198648	10899704 10644.24 5.4-rc1 CONFIG_HZ_100 CONFIG_PREEMPT_NONE
8031639	2605060	 198656	10835355 10581.40 qlocks  CONFIG_HZ_100 CONFIG_PREEMPT_NONE
8319233	2605072	 198656	11122961 10862.27 qlocksI CONFIG_HZ_100 CONFIG_PREEMPT_NONE

8098548	2604932	 198648	10902128 10646.61 5.4-rc1 CONFIG_HZ_100 CONFIG_PREEMPT_VOLUNTARY
8034103	2605060	 198656	10837819 10583.81 qlocks  CONFIG_HZ_100 CONFIG_PREEMPT_VOLUNTARY
8321769	2605072	 198656	11125497 10864.74 qlocksI CONFIG_HZ_100 CONFIG_PREEMPT_VOLUNTARY

8082969	2605468	 198712	10887149 10631.98 5.4-rc1 CONFIG_HZ_100 CONFIG_PREEMPT
8083732	2609692	 198720	10892144 10636.86 qlocks  CONFIG_HZ_100 CONFIG_PREEMPT
8725070	2609704	 198720	11533494 11263.18 qlocksI CONFIG_HZ_100 CONFIG_PREEMPT

8096784	2605188	 198648	10900620 10645.14 5.4-rc1 CONFIG_HZ_250 CONFIG_PREEMPT_NONE
8032307	2605316	 198656	10836279 10582.30 qlocks  CONFIG_HZ_250 CONFIG_PREEMPT_NONE
8319901	2605328	 198656	11123885 10863.17 qlocksI CONFIG_HZ_250 CONFIG_PREEMPT_NONE

8099208	2605188	 198648	10903044 10647.50 5.4-rc1 CONFIG_HZ_250 CONFIG_PREEMPT_VOLUNTARY
8034739	2605316	 198656	10838711 10584.68 qlocks  CONFIG_HZ_250 CONFIG_PREEMPT_VOLUNTARY
8322405	2605328	 198656	11126389 10865.61 qlocksI CONFIG_HZ_250 CONFIG_PREEMPT_VOLUNTARY

8083645	2605724	 198712	10888081 10632.89 5.4-rc1 CONFIG_HZ_250 CONFIG_PREEMPT
8084376	2609948	 198720	10893044 10637.74 qlocks  CONFIG_HZ_250 CONFIG_PREEMPT
8725762	2609960	 198720	11534442 11264.10 qlocksI CONFIG_HZ_250 CONFIG_PREEMPT

Without PREEMPT, the q-locksI variant is approx. 218KiB larger than the
current implementation. With PREEMPT the size increases by approx.
631KiB, which is probably not worth it.
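
The size differences quoted above can be rechecked from the `dec' column of the table. A minimal sketch, with the hypothetical helper delta_kib() and the byte counts copied from the CONFIG_HZ_100 rows:

```c
#include <assert.h>

/*
 * Hypothetical helper: size difference between two vmlinux builds in KiB,
 * given the 'dec' totals in bytes. Positive means the variant is larger.
 */
static double delta_kib(long long base_bytes, long long variant_bytes)
{
    assert(base_bytes > 0 && variant_bytes > 0);
    return (double)(variant_bytes - base_bytes) / 1024.0;
}
```

With the table values, delta_kib(10899704, 11122961) gives about 218 KiB for qlocksI under PREEMPT_NONE, delta_kib(10887149, 11533494) gives about 631 KiB under PREEMPT, and delta_kib(10899704, 10835355) gives about -63 KiB, i.e. the non-inlined qlocks build is slightly smaller than 5.4-rc1.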

Sebastian




Thread overview: 22+ messages
2019-10-07 21:44 [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Sebastian Andrzej Siewior
2019-10-07 21:44 ` [PATCH 1/3] ARM: Use qrwlock implementation Sebastian Andrzej Siewior
2019-10-07 21:44 ` [PATCH 2/3] ARM: Use qspinlock implementation Sebastian Andrzej Siewior
2019-10-07 21:44 ` [PATCH 3/3] ARM: Inline locking functions for !PREEMPTION Sebastian Andrzej Siewior
2019-10-08 11:42 ` [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Arnd Bergmann
2019-10-08 13:36   ` Waiman Long
2019-10-08 14:32     ` Arnd Bergmann
2019-10-08 19:47       ` Sebastian Andrzej Siewior
2019-10-08 21:47         ` Arnd Bergmann
2019-10-08 22:02           ` Sebastian Andrzej Siewior
2019-10-09  8:15             ` Arnd Bergmann
2019-10-09  8:46       ` Peter Zijlstra
2019-10-09  8:57         ` Arnd Bergmann
2019-10-09  9:31           ` Peter Zijlstra
2019-10-09 10:31             ` Arnd Bergmann
2019-10-09 10:56               ` Peter Zijlstra
2019-10-09 12:00                 ` Arnd Bergmann
2019-10-09 12:06                   ` Peter Zijlstra
2019-10-09 12:52                     ` Will Deacon
2019-10-09 13:50                     ` Arnd Bergmann
2019-10-09 21:42                       ` Sebastian Andrzej Siewior
2019-10-08 19:32   ` Sebastian Andrzej Siewior
