linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
@ 2019-10-07 21:44 Sebastian Andrzej Siewior
  2019-10-07 21:44 ` [PATCH 1/3] ARM: Use qrwlock implementation Sebastian Andrzej Siewior
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-07 21:44 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Arnd Bergmann, Peter Zijlstra, Russell King, Ingo Molnar,
	Waiman Long, Will Deacon


I added support for queued RW-locks and spinlocks for ARM. I wanted to
remove the current (ticket based) implementation entirely, but that does
not work: the CPU_V6 kernel configuration has no support for xchg() on a
2-byte memory location, which the qspinlock slowpath requires. It is
possible to build a multi-platform kernel (v6+v7+SMP) which would then
lack that function.
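
For context, the qspinlock slowpath publishes the waiter queue by
exchanging only the upper 16-bit "tail" half of the 32-bit lock word. A
rough sketch of the layout (paraphrased from the generic qspinlock
headers for little-endian and NR_CPUS < 16K; exact field names may
differ between kernel versions):

	struct qspinlock {
		union {
			atomic_t val;			/* whole 32-bit lock word */
			struct {
				u8 locked;		/* locked byte */
				u8 pending;		/* pending bit */
			};
			struct {
				u16 locked_pending;
				u16 tail;		/* queue tail: CPU + MCS node index */
			};
		};
	};

The slowpath's xchg_tail() does a relaxed 16-bit xchg() on the tail
field, which is exactly the operation a CPU_V6 build cannot provide.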

I tested the q-lock implementation with
	hackbench -g40 -s 500 -l 500

The numbers in the tables below are the average runtime of 10
invocations. I tested with HZ_100 and HZ_250 and the different
preemption levels on an IMX6q board (quad Cortex-A9) and an AM572x
board (dual Cortex-A15).
"Ticket" is the current implementation on v5.4-rc1, "Q-locks" is the
switch to queued RW-locks and spinlocks, and "Q-locksI" additionally
inlines the locking functions.

IMX6q
~~~~~
HZ_100   | PREEMPT_NONE | PREEMPT_VOLUNTARY | PREEMPT
Ticket   | 52.103       | 52.284            | 60.5681
Q-locks  | 54.1804      | 53.267            | 56.1914
Q-locksI | 52.2985      | 49.398            | 56.7441

HZ_250   | PREEMPT_NONE | PREEMPT_VOLUNTARY | PREEMPT
Ticket   | 54.3888      | 52.7896           | 58.4837
Q-locks  | 52.1027      | 52.2302           | 57.26
Q-locksI | 51.6185      | 51.5856           | 55.327

AM572x
~~~~~~
HZ_100   | PREEMPT_NONE | PREEMPT_VOLUNTARY | PREEMPT
Ticket   | 42.3819      | 42.4821           | 43.2671
Q-locks  | 40.9141      | 40.0269           | 42.65
Q-locksI | 40.0763      | 39.9101           | 40.7811

HZ_250   | PREEMPT_NONE | PREEMPT_VOLUNTARY | PREEMPT
Ticket   | 41.6399      | 42.9386           | 44.5865
Q-locks  | 41.4476      | 43.0836           | 43.1937
Q-locksI | 39.6897      | 41.1746           | 43.1962

Based on these numbers, the q-lock based implementation performs a
little better than the current ticket spinlock implementation. On IMX6q
the improvement only shows once the locking functions are also inlined,
while on AM572x inlining makes hardly any difference.

Here are `size' numbers for the different vmlinux binaries:

   text	   data	    bss	    dec	 dec KiB  variant
8096124	2604932	 198648	10899704 10644.24 5.4-rc1 CONFIG_HZ_100 CONFIG_PREEMPT_NONE
8031639	2605060	 198656	10835355 10581.40 qlocks  CONFIG_HZ_100 CONFIG_PREEMPT_NONE
8319233	2605072	 198656	11122961 10862.27 qlocksI CONFIG_HZ_100 CONFIG_PREEMPT_NONE

8098548	2604932	 198648	10902128 10646.61 5.4-rc1 CONFIG_HZ_100 CONFIG_PREEMPT_VOLUNTARY
8034103	2605060	 198656	10837819 10583.81 qlocks  CONFIG_HZ_100 CONFIG_PREEMPT_VOLUNTARY
8321769	2605072	 198656	11125497 10864.74 qlocksI CONFIG_HZ_100 CONFIG_PREEMPT_VOLUNTARY

8082969	2605468	 198712	10887149 10631.98 5.4-rc1 CONFIG_HZ_100 CONFIG_PREEMPT
8083732	2609692	 198720	10892144 10636.86 qlocks  CONFIG_HZ_100 CONFIG_PREEMPT
8725070	2609704	 198720	11533494 11263.18 qlocksI CONFIG_HZ_100 CONFIG_PREEMPT

8096784	2605188	 198648	10900620 10645.14 5.4-rc1 CONFIG_HZ_250 CONFIG_PREEMPT_NONE
8032307	2605316	 198656	10836279 10582.30 qlocks  CONFIG_HZ_250 CONFIG_PREEMPT_NONE
8319901	2605328	 198656	11123885 10863.17 qlocksI CONFIG_HZ_250 CONFIG_PREEMPT_NONE

8099208	2605188	 198648	10903044 10647.50 5.4-rc1 CONFIG_HZ_250 CONFIG_PREEMPT_VOLUNTARY
8034739	2605316	 198656	10838711 10584.68 qlocks  CONFIG_HZ_250 CONFIG_PREEMPT_VOLUNTARY
8322405	2605328	 198656	11126389 10865.61 qlocksI CONFIG_HZ_250 CONFIG_PREEMPT_VOLUNTARY

8083645	2605724	 198712	10888081 10632.89 5.4-rc1 CONFIG_HZ_250 CONFIG_PREEMPT
8084376	2609948	 198720	10893044 10637.74 qlocks  CONFIG_HZ_250 CONFIG_PREEMPT
8725762	2609960	 198720	11534442 11264.10 qlocksI CONFIG_HZ_250 CONFIG_PREEMPT

On average the q-locksI variant is approx. 200 KiB larger than the
current implementation. With the PREEMPT configurations the size
increases by approx. 600 KiB, which is probably not worth it.

Sebastian



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3] ARM: Use qrwlock implementation
  2019-10-07 21:44 [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Sebastian Andrzej Siewior
@ 2019-10-07 21:44 ` Sebastian Andrzej Siewior
  2019-10-07 21:44 ` [PATCH 2/3] ARM: Use qspinlock implementation Sebastian Andrzej Siewior
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-07 21:44 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Arnd Bergmann, Peter Zijlstra, Russell King,
	Sebastian Andrzej Siewior, Ingo Molnar, Waiman Long, Will Deacon

Use the generic qrwlock implementation for the RW-locks. The WFE
mechanism is still used as part of the spinlock implementation, which
the qrwlock slowpath relies on.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/arm/Kconfig                      |  1 +
 arch/arm/include/asm/Kbuild           |  1 +
 arch/arm/include/asm/spinlock.h       |  8 ++++++++
 arch/arm/include/asm/spinlock_types.h | 10 +++++++++-
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 8a50efb559f35..2f704531d883a 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -34,6 +34,7 @@ config ARM
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
+	select ARCH_USE_QUEUED_RWLOCKS if !CPU_V6
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select BINFMT_FLAT_ARGVP_ENVP_ON_STACK
diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 68ca86f85eb73..5327be7572cd2 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -18,6 +18,7 @@ generic-y += preempt.h
 generic-y += seccomp.h
 generic-y += serial.h
 generic-y += trace_clock.h
+generic-y += qrwlock.h
 
 generated-y += mach-types.h
 generated-y += unistd-nr.h
diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h
index 8f009e788ad40..e23f71b2cd43d 100644
--- a/arch/arm/include/asm/spinlock.h
+++ b/arch/arm/include/asm/spinlock.h
@@ -128,6 +128,7 @@ static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 }
 #define arch_spin_is_contended	arch_spin_is_contended
 
+#ifdef CONFIG_CPU_V6
 /*
  * RWLOCKS
  *
@@ -270,4 +271,11 @@ static inline int arch_read_trylock(arch_rwlock_t *rw)
 	}
 }
 
+#else
+
+#include <asm/qrwlock.h>
+#define smp_mb__after_spinlock()       smp_mb()
+
+#endif
+
 #endif /* __ASM_SPINLOCK_H */
diff --git a/arch/arm/include/asm/spinlock_types.h b/arch/arm/include/asm/spinlock_types.h
index 5976958647fe1..c942a75250bb2 100644
--- a/arch/arm/include/asm/spinlock_types.h
+++ b/arch/arm/include/asm/spinlock_types.h
@@ -29,6 +29,14 @@ typedef struct {
 	u32 lock;
 } arch_rwlock_t;
 
-#define __ARCH_RW_LOCK_UNLOCKED		{ 0 }
+#ifdef CONFIG_CPU_V6
+
+#define __ARCH_RW_LOCK_UNLOCKED                { 0 }
+
+#else
+
+#include <asm-generic/qrwlock_types.h>
+
+#endif
 
 #endif
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/3] ARM: Use qspinlock implementation
  2019-10-07 21:44 [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Sebastian Andrzej Siewior
  2019-10-07 21:44 ` [PATCH 1/3] ARM: Use qrwlock implementation Sebastian Andrzej Siewior
@ 2019-10-07 21:44 ` Sebastian Andrzej Siewior
  2019-10-07 21:44 ` [PATCH 3/3] ARM: Inline locking functions for !PREEMPTION Sebastian Andrzej Siewior
  2019-10-08 11:42 ` [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Arnd Bergmann
  3 siblings, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-07 21:44 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Arnd Bergmann, Peter Zijlstra, Russell King,
	Sebastian Andrzej Siewior, Ingo Molnar, Waiman Long, Will Deacon

Use the generic queued spinlock implementation for the spinlocks. The
WFE mechanism is used as part of arch_mcs_spin_lock_contended().

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/arm/Kconfig                      | 1 +
 arch/arm/include/asm/Kbuild           | 1 +
 arch/arm/include/asm/spinlock.h       | 5 +++--
 arch/arm/include/asm/spinlock_types.h | 4 ++--
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 2f704531d883a..7eba89bb45755 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -35,6 +35,7 @@ config ARM
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select ARCH_USE_QUEUED_RWLOCKS if !CPU_V6
+	select ARCH_USE_QUEUED_SPINLOCKS if !CPU_V6
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select BINFMT_FLAT_ARGVP_ENVP_ON_STACK
diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 5327be7572cd2..f98bcfd9612b6 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -18,6 +18,7 @@ generic-y += preempt.h
 generic-y += seccomp.h
 generic-y += serial.h
 generic-y += trace_clock.h
+generic-y += qspinlock.h
 generic-y += qrwlock.h
 
 generated-y += mach-types.h
diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h
index e23f71b2cd43d..7f3477679eb58 100644
--- a/arch/arm/include/asm/spinlock.h
+++ b/arch/arm/include/asm/spinlock.h
@@ -45,6 +45,7 @@ static inline void dsb_sev(void)
 	__asm__(SEV);
 }
 
+#ifdef CONFIG_CPU_V6
 /*
  * ARMv6 ticket-based spin-locking.
  *
@@ -128,7 +129,6 @@ static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 }
 #define arch_spin_is_contended	arch_spin_is_contended
 
-#ifdef CONFIG_CPU_V6
 /*
  * RWLOCKS
  *
@@ -274,7 +274,8 @@ static inline int arch_read_trylock(arch_rwlock_t *rw)
 #else
 
 #include <asm/qrwlock.h>
-#define smp_mb__after_spinlock()       smp_mb()
+#include <asm/qspinlock.h>
+#define smp_mb__after_spinlock()	smp_mb()
 
 #endif
 
diff --git a/arch/arm/include/asm/spinlock_types.h b/arch/arm/include/asm/spinlock_types.h
index c942a75250bb2..7d5200da6a5f8 100644
--- a/arch/arm/include/asm/spinlock_types.h
+++ b/arch/arm/include/asm/spinlock_types.h
@@ -6,6 +6,7 @@
 # error "please don't include this file directly"
 #endif
 
+#ifdef CONFIG_CPU_V6
 #define TICKET_SHIFT	16
 
 typedef struct {
@@ -29,12 +30,11 @@ typedef struct {
 	u32 lock;
 } arch_rwlock_t;
 
-#ifdef CONFIG_CPU_V6
-
 #define __ARCH_RW_LOCK_UNLOCKED                { 0 }
 
 #else
 
+#include <asm-generic/qspinlock_types.h>
 #include <asm-generic/qrwlock_types.h>
 
 #endif
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/3] ARM: Inline locking functions for !PREEMPTION
  2019-10-07 21:44 [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Sebastian Andrzej Siewior
  2019-10-07 21:44 ` [PATCH 1/3] ARM: Use qrwlock implementation Sebastian Andrzej Siewior
  2019-10-07 21:44 ` [PATCH 2/3] ARM: Use qspinlock implementation Sebastian Andrzej Siewior
@ 2019-10-07 21:44 ` Sebastian Andrzej Siewior
  2019-10-08 11:42 ` [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Arnd Bergmann
  3 siblings, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-07 21:44 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Arnd Bergmann, Peter Zijlstra, Russell King,
	Sebastian Andrzej Siewior, Ingo Molnar, Waiman Long, Will Deacon

On non-preemptive kernels the locking functions compile to less than 64
bytes each, so it makes sense to inline them. With PREEMPTION the kernel
grows considerably if the locks are inlined.
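
To illustrate the mechanism, a rough sketch of how the generic locking
API reacts to these selects (paraphrased from kernel/Kconfig.locks,
include/linux/spinlock_api_smp.h and kernel/locking/spinlock.c; details
vary between kernel versions):

	/* Without ARCH_INLINE_SPIN_LOCK the call goes out of line: */
	#ifndef CONFIG_INLINE_SPIN_LOCK
	void __lockfunc _raw_spin_lock(raw_spinlock_t *lock)
	{
		__raw_spin_lock(lock);
	}
	EXPORT_SYMBOL(_raw_spin_lock);
	#endif

	/* With it, spin_lock() expands directly to the fast path: */
	#ifdef CONFIG_INLINE_SPIN_LOCK
	#define _raw_spin_lock(lock) __raw_spin_lock(lock)
	#endif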

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/arm/Kconfig | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 7eba89bb45755..c1ee04c209e6e 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -26,6 +26,32 @@ config ARM
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAVE_CUSTOM_GPIO_H
 	select ARCH_HAS_GCOV_PROFILE_ALL
+	select ARCH_INLINE_READ_LOCK if !PREEMPTION
+	select ARCH_INLINE_READ_LOCK_BH if !PREEMPTION
+	select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPTION
+	select ARCH_INLINE_READ_UNLOCK if !PREEMPTION
+	select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPTION
+	select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPTION
+	select ARCH_INLINE_WRITE_LOCK if !PREEMPTION
+	select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPTION
+	select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPTION
+	select ARCH_INLINE_WRITE_UNLOCK if !PREEMPTION
+	select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPTION
+	select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPTION
+	select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPTION
+	select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPTION
+	select ARCH_INLINE_SPIN_LOCK if !PREEMPTION
+	select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPTION
+	select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPTION
+	select ARCH_INLINE_SPIN_UNLOCK if !PREEMPTION
+	select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPTION
+	select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION
 	select ARCH_KEEP_MEMBLOCK if HAVE_ARCH_PFN_VALID || KEXEC
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_NO_SG_CHAIN if !ARM_HAS_SG_CHAIN
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-07 21:44 [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Sebastian Andrzej Siewior
                   ` (2 preceding siblings ...)
  2019-10-07 21:44 ` [PATCH 3/3] ARM: Inline locking functions for !PREEMPTION Sebastian Andrzej Siewior
@ 2019-10-08 11:42 ` Arnd Bergmann
  2019-10-08 13:36   ` Waiman Long
  2019-10-08 19:32   ` Sebastian Andrzej Siewior
  3 siblings, 2 replies; 22+ messages in thread
From: Arnd Bergmann @ 2019-10-08 11:42 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, Russell King, Ingo Molnar, Waiman Long,
	Will Deacon, Linux ARM

On Mon, Oct 7, 2019 at 11:45 PM Sebastian Andrzej Siewior
<sebastian@breakpoint.cc> wrote:
>
> I added support for queued-RW and -spinlocks for ARM. I wanted to
> remove the current implementation but this does not work. The CPU_V6
> kernel config does not have support for xchg() on 2 byte memory address.
> This is required by q-lock' slowpath. It is possible to create a
> multi-kernel (with v6+v7+SMP) which then lack the function.
>
> I tested the q-lock implementation with
>         hackbench -g40 -s 500 -l 500
>
> The numbers in the table below represent the average runtime of 10
> invocations. I tested with HZ_100,HZ_250 and the different preemption
> levels on a IMX6q-board (quad Cortex-A9) and an AM572x board (dual
> Cortex-A15).
> "Ticket" means the current implementation on v5.4-rc1, Q-Locks is the
> switch to queued RW and spinlocks and in Q-locksI the locking
> instruction is additionally inlined.

This looks nice, and I don't see anything wrong with the implementation,
but I am slightly worried about switching everything over to a generic
spinlock while keeping the custom ARM version for an exceptionally
rare corner case:

The ARM spinlock is now only used when you build an SMP-enabled
kernel for an ARM1136r0 that is used in OMAP2, i.MX3 and some
of the least common Integrator/Realview variants. I'm not aware
of any binary distros with ARMv6+ kernels, so these would run custom
kernels that are almost always non-SMP as well as no longer getting
kernel upgrades (almost all have been discontinued years ago, the i.MX35
chip itself was the last to get EOLd in 2018).
Raspbian builds an ARMv6K SMP kernel that is not affected by this.

I wonder if we can do something better here and make the
asm-generic/qspinlock.h implementation always degrade into an
equivalent of include/linux/spinlock_up.h when running on uniprocessor
systems, avoiding both the atomic cmpxchg and the slowpath.

That way, an ARMv6+SMP kernel on UP could share the qspinlock
implementation but never actually get into the invalid 16-bit xchg() or
sev()/wfe(). It already shouldn't ever get into the slowpath on a
non-SMP system if I understand it correctly, but avoiding the cmpxchg()
entirely would be an added benefit.
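
For reference, the UP variant this would degrade to is essentially just
a compiler barrier; a rough sketch along the lines of
include/linux/spinlock_up.h (ignoring the CONFIG_DEBUG_SPINLOCK
variant):

	/* Only one CPU: no atomics needed, just keep the compiler honest. */
	static inline void arch_spin_lock(arch_spinlock_t *lock)
	{
		barrier();
	}

	static inline void arch_spin_unlock(arch_spinlock_t *lock)
	{
		barrier();
	}

	static inline int arch_spin_trylock(arch_spinlock_t *lock)
	{
		barrier();
		return 1;	/* always succeeds on UP */
	}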

       Arnd


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-08 11:42 ` [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Arnd Bergmann
@ 2019-10-08 13:36   ` Waiman Long
  2019-10-08 14:32     ` Arnd Bergmann
  2019-10-08 19:32   ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 22+ messages in thread
From: Waiman Long @ 2019-10-08 13:36 UTC (permalink / raw)
  To: Arnd Bergmann, Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Russell King, Linux ARM

On 10/8/19 7:42 AM, Arnd Bergmann wrote:
> On Mon, Oct 7, 2019 at 11:45 PM Sebastian Andrzej Siewior
> <sebastian@breakpoint.cc> wrote:
>> I added support for queued-RW and -spinlocks for ARM. I wanted to
>> remove the current implementation but this does not work. The CPU_V6
>> kernel config does not have support for xchg() on 2 byte memory address.
>> This is required by q-lock' slowpath. It is possible to create a
>> multi-kernel (with v6+v7+SMP) which then lack the function.
>>
>> I tested the q-lock implementation with
>>         hackbench -g40 -s 500 -l 500
>>
>> The numbers in the table below represent the average runtime of 10
>> invocations. I tested with HZ_100,HZ_250 and the different preemption
>> levels on a IMX6q-board (quad Cortex-A9) and an AM572x board (dual
>> Cortex-A15).
>> "Ticket" means the current implementation on v5.4-rc1, Q-Locks is the
>> switch to queued RW and spinlocks and in Q-locksI the locking
>> instruction is additionally inlined.
> This looks nice, and I don't see anything wrong with the implementation,
> but I am slightly worried about switching everything over to a generic
> spinlock while keeping the custom ARM version for an exceptionally
> rare corner case:
>
> The ARM spinlock is now only used when you build an SMP-enabled
> kernel for an ARM1136r0 that is used in OMAP2, i.MX3 and some
> of the least common Integrator/Realview variants. I'm not aware
> of any binary distros with ARMv6+ kernels, so these would run custom
> kernels that are almost always non-SMP as well as no longer getting
> kernel upgrades (almost all have been discontinued years ago, the i.MX35
> chip itself was the last to get EOLd in 2018).
> Raspbian builds an ARMv6K SMP kernel that is not affected by this.
>
> I wonder if we can do something better here and make the
> asm-generic/qspinlock.h implementation always degrade into an
> equivalent of include/linux/spinlock_up.h when running on uniprocessor
> systems, avoiding both the atomic cmpxchg and the slowpath.

In x86, the lock instruction prefix is patched out when running on a UP
system. This downgrades the atomic cmpxchg to a non-atomic one. We may
be able to do something similar in other architectures.
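
Roughly, the x86 mechanism looks like this (paraphrased from
arch/x86/include/asm/alternative.h; details vary by kernel version):

	/*
	 * Every "lock" prefix records its own location in the .smp_locks
	 * section, so the alternatives code can swap the prefix bytes
	 * with NOPs depending on how many CPUs are online.
	 */
	#define LOCK_PREFIX_HERE \
			".pushsection .smp_locks,\"a\"\n"	\
			".balign 4\n"				\
			".long 671f - .\n"	/* offset */	\
			".popsection\n"				\
			"671:"

	#define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock; "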

Cheers,
Longman



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-08 13:36   ` Waiman Long
@ 2019-10-08 14:32     ` Arnd Bergmann
  2019-10-08 19:47       ` Sebastian Andrzej Siewior
  2019-10-09  8:46       ` Peter Zijlstra
  0 siblings, 2 replies; 22+ messages in thread
From: Arnd Bergmann @ 2019-10-08 14:32 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Russell King, Sebastian Andrzej Siewior,
	Ingo Molnar, Will Deacon, Linux ARM

On Tue, Oct 8, 2019 at 3:36 PM Waiman Long <longman@redhat.com> wrote:
> On 10/8/19 7:42 AM, Arnd Bergmann wrote:
> > On Mon, Oct 7, 2019 at 11:45 PM Sebastian Andrzej Siewior <sebastian@breakpoint.cc> wrote:
> > I wonder if we can do something better here and make the
> > asm-generic/qspinlock.h implementation always degrade into an
> > equivalent of include/linux/spinlock_up.h when running on uniprocessor
> > systems, avoiding both the atomic cmpxchg and the slowpath.

I looked at the qspinlock implementation some more, and I think we can
get away with simply defining a special-case smp_xchg16_relaxed()
that is only used for the odd  configuration:

diff --git a/arch/arm/include/asm/cmpxchg.h b/arch/arm/include/asm/cmpxchg.h
index 8b701f8e175c..6bf4964c105c 100644
--- a/arch/arm/include/asm/cmpxchg.h
+++ b/arch/arm/include/asm/cmpxchg.h
@@ -114,6 +114,24 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
        return ret;
 }

+#ifdef CONFIG_CPU_V6
+static inline unsigned short smp_xchg16_relaxed(volatile unsigned short *ptr, unsigned short x)
+{
+       unsigned short ret, tmp;
+       asm volatile("@ smp_xchg16_relaxed\n"
+       ".arch armv6k\n"
+       "1:     ldrexh  %0, [%3]\n"
+       "       strexh  %1, %2, [%3]\n"
+       "       teq     %1, #0\n"
+       "       bne     1b"
+               : "=&r" (ret), "=&r" (tmp)
+               : "r" (x), "r" (ptr)
+               : "memory", "cc");
+       return ret;
+}
+#define smp_xchg16_relaxed smp_xchg16_relaxed
+#endif
+
 #define xchg_relaxed(ptr, x) ({					\
        (__typeof__(*(ptr)))__xchg((unsigned long)(x), (ptr),           \
                                   sizeof(*(ptr)));                     \
diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 4c0d009a46f0..4612e27e3330 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -71,6 +71,10 @@
        __ret;                                                          \
 })

+#ifndef smp_xchg16_relaxed
+#define smp_xchg16_relaxed(p, x) xchg_relaxed(p, x)
+#endif
+
 #include <linux/atomic-fallback.h>

 #include <asm-generic/atomic-long.h>
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 2473f10c6956..af1473528edf 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -178,8 +178,8 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
         * We can use relaxed semantics since the caller ensures that the
         * MCS node is properly initialized before updating the tail.
         */
-       return (u32)xchg_relaxed(&lock->tail,
-                                tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
+       return (u32)smp_xchg16_relaxed(&lock->tail,
+                                      tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
 }

 #else /* _Q_PENDING_BITS == 8 */

> In x86, the lock instruction prefix is patched out when running on UP
> system. This downgrades the atomic cmpxchg to non-atomic one. We may do
> something similar in other architectures.

Unfortunately, the atomic macros cannot trivially be made cheaper
on non-SMP systems based on load-locked/store-conditional
based architectures, as there may be an interrupt in-between,
and disabling interrupts would likely be more expensive.

However, there might be a way to take a shortcut out of
queued_spin_lock() using asm-goto combined with the ARM
__ALT_SMP_ASM() macro.

      Arnd


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-08 11:42 ` [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Arnd Bergmann
  2019-10-08 13:36   ` Waiman Long
@ 2019-10-08 19:32   ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-08 19:32 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Peter Zijlstra, Russell King, Ingo Molnar, Waiman Long,
	Will Deacon, Linux ARM

On 2019-10-08 13:42:43 [+0200], Arnd Bergmann wrote:
> This looks nice, and I don't see anything wrong with the implementation,
> but I am slightly worried about switching everything over to a generic
> spinlock while keeping the custom ARM version for an exceptionally
> rare corner case:

I do not want ARM to have two spinlock implementations.

> The ARM spinlock is now only used when you build an SMP-enabled
> kernel for an ARM1136r0 that is used in OMAP2, i.MX3 and some
> of the least common Integrator/Realview variants. I'm not aware
> of any binary distros with ARMv6+ kernels, so these would run custom
> kernels that are almost always non-SMP as well as no longer getting
> kernel upgrades (almost all have been discontinued years ago, the i.MX35
> chip itself was the last to get EOLd in 2018).
> Raspbian builds an ARMv6K SMP kernel that is not affected by this.

I just looked at the Debian configs. The armel kernels look UP-only and
armhf is V7. I am not sure there is an SMP-capable CPU that is plain V6;
I've been looking at a wiki [0] and the first SMP-capable core seems to
be ARMv6K.

> I wonder if we can do something better here and make the
> asm-generic/qspinlock.h implementation always degrade into an
> equivalent of include/linux/spinlock_up.h when running on uniprocessor
> systems, avoiding both the atomic cmpxchg and the slowpath.
> 
> That way, an ARMv6+SMP kernel on UP could share the qspinlock
> implementation but never actually get into the invalid 16-bit xchg() or
> sev()/wfe(). It already shouldn't ever get into the slowpath on a
> non-SMP system if I understand it correctly, but avoiding the cmpxchg()
> entirely would be an added benefit.

The lock should never be contended, so yes.

[0] https://en.wikipedia.org/wiki/List_of_ARM_microarchitectures

>        Arnd

Sebastian


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-08 14:32     ` Arnd Bergmann
@ 2019-10-08 19:47       ` Sebastian Andrzej Siewior
  2019-10-08 21:47         ` Arnd Bergmann
  2019-10-09  8:46       ` Peter Zijlstra
  1 sibling, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-08 19:47 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Peter Zijlstra, Russell King, Ingo Molnar, Waiman Long,
	Will Deacon, Linux ARM

On 2019-10-08 16:32:27 [+0200], Arnd Bergmann wrote:
> On Tue, Oct 8, 2019 at 3:36 PM Waiman Long <longman@redhat.com> wrote:
> > In x86, the lock instruction prefix is patched out when running on UP
> > system. This downgrades the atomic cmpxchg to non-atomic one. We may do
> > something similar in other architectures.
> 
> Unfortunately, the atomic macros cannot trivially be made cheaper
> on non-SMP systems based on load-locked/store-conditional
> based architectures, as there may be an interrupt in-between,
> and disabling interrupts would likely be more expensive.
> 
> However, there might be a way to take a shortcut out of
> queued_spin_lock() using asm-goto combined with the ARM
> __ALT_SMP_ASM() macro.

The smp_xchg16_relaxed() snippet above looked good. I would buy it :)
Where are you heading with __ALT_SMP_ASM()?

>       Arnd

Sebastian


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-08 19:47       ` Sebastian Andrzej Siewior
@ 2019-10-08 21:47         ` Arnd Bergmann
  2019-10-08 22:02           ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2019-10-08 21:47 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, Russell King, Ingo Molnar, Waiman Long,
	Will Deacon, Linux ARM

On Tue, Oct 8, 2019 at 9:47 PM Sebastian Andrzej Siewior
<sebastian@breakpoint.cc> wrote:
>
> On 2019-10-08 16:32:27 [+0200], Arnd Bergmann wrote:
> > On Tue, Oct 8, 2019 at 3:36 PM Waiman Long <longman@redhat.com> wrote:
> > > In x86, the lock instruction prefix is patched out when running on UP
> > > system. This downgrades the atomic cmpxchg to non-atomic one. We may do
> > > something similar in other architectures.
> >
> > Unfortunately, the atomic macros cannot trivially be made cheaper
> > on non-SMP systems based on load-locked/store-conditional
> > based architectures, as there may be an interrupt in-between,
> > and disabling interrupts would likely be more expensive.
> >
> > However, there might be a way to take a shortcut out of
> > queued_spin_lock() using asm-goto combined with the ARM
> > __ALT_SMP_ASM() macro.
>
> The smp_xchg16_relaxed() snippet above looked good. I would buy it :)
> Where are you heading with __ALT_SMP_ASM()?

I was thinking of something along the lines of this:

diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h
index 2c595446cd73..1aa321b45f63 100644
--- a/arch/arm/include/asm/spinlock.h
+++ b/arch/arm/include/asm/spinlock.h
@@ -45,6 +45,19 @@ static inline void dsb_sev(void)
        __asm__(SEV);
 }

+static __always_inline bool smp_enabled(void)
+{
+       if (IS_ENABLED(CONFIG_SMP))
+               asm_volatile_goto(__ALT_SMP_ASM(WASM(b) " %l[smp_on_up]", WASM(nop))
+                                 :::: smp_on_up);
+
+       return false;
+
+smp_on_up:
+       return true;
+}
+#define smp_enabled smp_enabled
+
 #include <asm/qrwlock.h>
 #include <asm/qspinlock.h>
 #define smp_mb__after_spinlock()       smp_mb()
diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index fde943d180e0..3c456ad1661b 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -12,6 +12,10 @@

 #include <asm-generic/qspinlock_types.h>

+#ifndef smp_enabled
+#define smp_enabled (true)
+#endif
+
 /**
  * queued_spin_is_locked - is the spinlock locked?
  * @lock: Pointer to queued spinlock structure
@@ -75,6 +79,11 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock)
 {
        u32 val = 0;

+       if (!smp_enabled()) {
+               atomic_set(&lock->val, _Q_LOCKED_VAL);
+               return;
+       }
+
        if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
                return;

The above is likely incorrect, non-idiomatic or inefficient, but this is
a way to avoid both a runtime check and the cmpxchg() in each spinlock.

      Arnd


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-08 21:47         ` Arnd Bergmann
@ 2019-10-08 22:02           ` Sebastian Andrzej Siewior
  2019-10-09  8:15             ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-08 22:02 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Peter Zijlstra, Russell King, Ingo Molnar, Waiman Long,
	Will Deacon, Linux ARM

On 2019-10-08 23:47:31 [+0200], Arnd Bergmann wrote:
…
> diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
> index fde943d180e0..3c456ad1661b 100644
> --- a/include/asm-generic/qspinlock.h
> +++ b/include/asm-generic/qspinlock.h
> @@ -75,6 +79,11 @@ static __always_inline void queued_spin_lock(struct
> qspinlock *lock)
>  {
>         u32 val = 0;
> 
> +       if (!smp_enabled()) {
> +               atomic_set(&lock->val, _Q_LOCKED_VAL);
> +               return;
> +       }
> +
>         if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
>                 return;
> 
> The above is likely incorrect, non-idiomatic or inefficient, but this
> is a way to
> avoid both a runtime check and the cmpxchg() in each spinlock.

You would have to put this in arch_spin_trylock() as well, but I get the
idea. The current implementation does cmpxchg() in the try-lock case, so
by switching to q-locks we are not getting worse in the UP case.
Therefore I think this is more of an optimisation for those that run SMP
kernels on UP machines.
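
For reference, the generic trylock fast path is also a single cmpxchg; a
rough sketch along the lines of include/asm-generic/qspinlock.h (not a
verbatim copy, details differ between kernel versions):

	static __always_inline int queued_spin_trylock(struct qspinlock *lock)
	{
		u32 val = atomic_read(&lock->val);

		if (unlikely(val))
			return 0;	/* already locked or queued */

		return likely(atomic_try_cmpxchg_acquire(&lock->val, &val,
							 _Q_LOCKED_VAL));
	}

so any UP shortcut would have to cover this path too.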

>       Arnd

Sebastian


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-08 22:02           ` Sebastian Andrzej Siewior
@ 2019-10-09  8:15             ` Arnd Bergmann
  0 siblings, 0 replies; 22+ messages in thread
From: Arnd Bergmann @ 2019-10-09  8:15 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, Russell King, Ingo Molnar, Waiman Long,
	Will Deacon, Linux ARM

On Wed, Oct 9, 2019 at 12:02 AM Sebastian Andrzej Siewior
<sebastian@breakpoint.cc> wrote:
>
> On 2019-10-08 23:47:31 [+0200], Arnd Bergmann wrote:
> …
> > diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
> > index fde943d180e0..3c456ad1661b 100644
> > --- a/include/asm-generic/qspinlock.h
> > +++ b/include/asm-generic/qspinlock.h
> …
> > @@ -75,6 +79,11 @@ static __always_inline void queued_spin_lock(struct
> > qspinlock *lock)
> >  {
> >         u32 val = 0;
> >
> > +       if (!smp_enabled()) {
> > +               atomic_set(&lock->val, _Q_LOCKED_VAL);
> > +               return;
> > +       }
> > +
> >         if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
> >                 return;
> >
> > The above is likely incorrect, non-idiomatic or inefficient, but this
> > is a way to
> > avoid both a runtime check and the cmpxchg() in each spinlock.
>
> You would have to put this in arch_spin_trylock() but I get the idea.
> The current implementation does cmpxchg() in the try-lock case so by
> switching to q-locks are not getting worse in the UP case.

Ah right, I dismissed the trylock case because there are few callers of
spin_trylock(), but I failed to notice the __raw_spin_lock() for the
CONFIG_LOCK_STAT case.

> Therefore I this is more of an optimisation for those that run SMP
> kernels on UP machines.

Yes, that was the point. Originally I wasn't sure if there was a case
in which that configuration could still end up in the spinlock slowpath,
but as that won't happen it's just a way to avoid the atomic operation
on UP. It might not actually make much of a difference depending
on the actual overhead of ldrex/strex while running on a single core.

       Arnd


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-08 14:32     ` Arnd Bergmann
  2019-10-08 19:47       ` Sebastian Andrzej Siewior
@ 2019-10-09  8:46       ` Peter Zijlstra
  2019-10-09  8:57         ` Arnd Bergmann
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2019-10-09  8:46 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Russell King, Sebastian Andrzej Siewior, Ingo Molnar,
	Waiman Long, Will Deacon, Linux ARM

On Tue, Oct 08, 2019 at 04:32:27PM +0200, Arnd Bergmann wrote:
> diff --git a/arch/arm/include/asm/cmpxchg.h b/arch/arm/include/asm/cmpxchg.h
> index 8b701f8e175c..6bf4964c105c 100644
> --- a/arch/arm/include/asm/cmpxchg.h
> +++ b/arch/arm/include/asm/cmpxchg.h
> @@ -114,6 +114,24 @@ static inline unsigned long __xchg(unsigned long
> x, volatile void *ptr, int size
>         return ret;
>  }
> 
> +#ifdef CONFIG_CPU_V6
> +static inline unsigned short smp_xchg16_relaxed(volatile unsigned
> short *ptr, unsigned short x)
> +{
> +       unsigned short ret, tmp;
> +       asm volatile("@ smp_xchg16_relaxed\n"
> +       ".arch armv6k\n"
> +       "1:     ldrexh  %0, [%3]\n"
> +       "       strexh  %1, %2, [%3]\n"
> +       "       teq     %1, #0\n"
> +       "       bne     1b"
> +               : "=&r" (ret), "=&r" (tmp)
> +               : "r" (x), "r" (ptr)
> +               : "memory", "cc");
> +       return ret;
> +}
> +#define smp_xchg16_relaxed smp_xchg16_relaxed
> +#endif

Why is this not in __xchg() as a variant for case 2 ?


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09  8:46       ` Peter Zijlstra
@ 2019-10-09  8:57         ` Arnd Bergmann
  2019-10-09  9:31           ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2019-10-09  8:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Russell King, Sebastian Andrzej Siewior, Ingo Molnar,
	Waiman Long, Will Deacon, Linux ARM

On Wed, Oct 9, 2019 at 10:46 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Oct 08, 2019 at 04:32:27PM +0200, Arnd Bergmann wrote:
> > diff --git a/arch/arm/include/asm/cmpxchg.h b/arch/arm/include/asm/cmpxchg.h
> > index 8b701f8e175c..6bf4964c105c 100644
> > --- a/arch/arm/include/asm/cmpxchg.h
> > +++ b/arch/arm/include/asm/cmpxchg.h
> > @@ -114,6 +114,24 @@ static inline unsigned long __xchg(unsigned long
> > x, volatile void *ptr, int size
> >         return ret;
> >  }
> >
> > +#ifdef CONFIG_CPU_V6
> > +static inline unsigned short smp_xchg16_relaxed(volatile unsigned
> > short *ptr, unsigned short x)
> > +{
> > +       unsigned short ret, tmp;
> > +       asm volatile("@ smp_xchg16_relaxed\n"
> > +       ".arch armv6k\n"
> > +       "1:     ldrexh  %0, [%3]\n"
> > +       "       strexh  %1, %2, [%3]\n"
> > +       "       teq     %1, #0\n"
> > +       "       bne     1b"
> > +               : "=&r" (ret), "=&r" (tmp)
> > +               : "r" (x), "r" (ptr)
> > +               : "memory", "cc");
> > +       return ret;
> > +}
> > +#define smp_xchg16_relaxed smp_xchg16_relaxed
> > +#endif
>
> Why is this not in __xchg() as a variant for case 2 ?

ldrexh/strexh are instructions that are only available on SMP-capable
architecture revisions (ARMv6K or higher). When building a kernel
that runs both on pre-K ARMv6 uniprocessor systems and on later
SMP systems, __xchg() can only do 32-bit  ldrex/strex.

The trick of smp_xchg16_relaxed() is to allow the 16-bit atomics in code
that is only ever executed on SMP systems, and never on the uniprocessor
OMAP2 and i.MX3 chips, which would trap on those instructions.

      Arnd


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09  8:57         ` Arnd Bergmann
@ 2019-10-09  9:31           ` Peter Zijlstra
  2019-10-09 10:31             ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2019-10-09  9:31 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Russell King, Sebastian Andrzej Siewior, Ingo Molnar,
	Waiman Long, Will Deacon, Linux ARM

On Wed, Oct 09, 2019 at 10:57:25AM +0200, Arnd Bergmann wrote:
> On Wed, Oct 9, 2019 at 10:46 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Tue, Oct 08, 2019 at 04:32:27PM +0200, Arnd Bergmann wrote:
> > > diff --git a/arch/arm/include/asm/cmpxchg.h b/arch/arm/include/asm/cmpxchg.h
> > > index 8b701f8e175c..6bf4964c105c 100644
> > > --- a/arch/arm/include/asm/cmpxchg.h
> > > +++ b/arch/arm/include/asm/cmpxchg.h
> > > @@ -114,6 +114,24 @@ static inline unsigned long __xchg(unsigned long
> > > x, volatile void *ptr, int size
> > >         return ret;
> > >  }
> > >
> > > +#ifdef CONFIG_CPU_V6
> > > +static inline unsigned short smp_xchg16_relaxed(volatile unsigned
> > > short *ptr, unsigned short x)
> > > +{
> > > +       unsigned short ret, tmp;
> > > +       asm volatile("@ smp_xchg16_relaxed\n"
> > > +       ".arch armv6k\n"
> > > +       "1:     ldrexh  %0, [%3]\n"
> > > +       "       strexh  %1, %2, [%3]\n"
> > > +       "       teq     %1, #0\n"
> > > +       "       bne     1b"
> > > +               : "=&r" (ret), "=&r" (tmp)
> > > +               : "r" (x), "r" (ptr)
> > > +               : "memory", "cc");
> > > +       return ret;
> > > +}
> > > +#define smp_xchg16_relaxed smp_xchg16_relaxed
> > > +#endif
> >
> > Why is this not in __xchg() as a variant for case 2 ?
> 
> ldrexh/strexh are instructions that are only available on SMP-capable
> architecture revisions (ARMv6K or higher). When building a kernel
> that runs both on pre-K ARMv6 uniprocessor systems and on later
> SMP systems, __xchg() can only do 32-bit  ldrex/strex.

You can do u16 xchg using a u32 ll/sc, see openrisc's xchg_small().


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09  9:31           ` Peter Zijlstra
@ 2019-10-09 10:31             ` Arnd Bergmann
  2019-10-09 10:56               ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2019-10-09 10:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Russell King, Sebastian Andrzej Siewior, Ingo Molnar,
	Waiman Long, Will Deacon, Linux ARM

On Wed, Oct 9, 2019 at 11:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Oct 09, 2019 at 10:57:25AM +0200, Arnd Bergmann wrote:
> > On Wed, Oct 9, 2019 at 10:46 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > Why is this not in __xchg() as a variant for case 2 ?
> >
> > ldrexh/strexh are instructions that are only available on SMP-capable
> > architecture revisions (ARMv6K or higher). When building a kernel
> > that runs both on pre-K ARMv6 uniprocessor systems and on later
> > SMP systems, __xchg() can only do 32-bit  ldrex/strex.
>
> You can do u16 xchg using a u32 ll/sc, see openrisc's xchg_small().

Ah, right. That would be much nicer than my smp_xchg16_relaxed()
hack to get the corner case working, as it avoids the ugly special
case in qspinlock.h.

Would this still have comparable performance characteristics?
I assume the 16-bit xchg_relaxed() in qspinlock.c was meant as
an optimization for x86 and other cmpxchg based architectures but
doesn't actually help on ll/sc based architectures that get the
reservation on the whole cache line anyway?

      Arnd


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09 10:31             ` Arnd Bergmann
@ 2019-10-09 10:56               ` Peter Zijlstra
  2019-10-09 12:00                 ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2019-10-09 10:56 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Russell King, Sebastian Andrzej Siewior, Ingo Molnar,
	Waiman Long, Will Deacon, Linux ARM

On Wed, Oct 09, 2019 at 12:31:24PM +0200, Arnd Bergmann wrote:
> On Wed, Oct 9, 2019 at 11:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > On Wed, Oct 09, 2019 at 10:57:25AM +0200, Arnd Bergmann wrote:
> > > On Wed, Oct 9, 2019 at 10:46 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > Why is this not in __xchg() as a variant for case 2 ?
> > >
> > > ldrexh/strexh are instructions that are only available on SMP-capable
> > > architecture revisions (ARMv6K or higher). When building a kernel
> > > that runs both on pre-K ARMv6 uniprocessor systems and on later
> > > SMP systems, __xchg() can only do 32-bit  ldrex/strex.
> >
> > You can do u16 xchg using a u32 ll/sc, see openrisc's xchg_small().
> 
> Ah, right. That would be much nicer than my smp_xchg16_relaxed()
> hack to get the corner case working, as it avoids the ugly special
> case in qspinlock.h.
> 
> Would this still have comparable performance characteristics?

I suppose so..

> I assume the 16-bit xchg_relaxed() in qspinlock.c was meant as
> an optimization for x86 and other cmpxchg based architectures but
> doesn't actually help on ll/sc based architectures that get the
> reservation on the whole cache line anyway?

It does actually help here too, because it allows other operations to be
regular load/stores.

Look at the #if _Q_PENDING_BITS==8 in qspinlock.c, as opposed to the
#else where they're all atomic_*().
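
Roughly, paraphrased from kernel/locking/qspinlock.c (not a verbatim
copy): with the byte/short layout, clearing the pending bit and taking
the lock is a plain store, while the generic fallback needs an atomic
RMW on the whole word:

	#if _Q_PENDING_BITS == 8
	static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
	{
		/* plain 16-bit store over the locked + pending bytes */
		WRITE_ONCE(lock->locked_pending, _Q_LOCKED_VAL);
	}
	#else
	static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
	{
		/* atomic RMW on the whole lock word */
		atomic_add(-_Q_PENDING_VAL + _Q_LOCKED_VAL, &lock->val);
	}
	#endif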



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09 10:56               ` Peter Zijlstra
@ 2019-10-09 12:00                 ` Arnd Bergmann
  2019-10-09 12:06                   ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2019-10-09 12:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Russell King, Sebastian Andrzej Siewior, Ingo Molnar,
	Waiman Long, Will Deacon, Linux ARM

On Wed, Oct 9, 2019 at 12:57 PM Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Oct 09, 2019 at 12:31:24PM +0200, Arnd Bergmann wrote:
> > On Wed, Oct 9, 2019 at 11:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > I assume the 16-bit xchg_relaxed() in qspinlock.c was meant as
> > an optimization for x86 and other cmpxchg based architectures but
> > doesn't actually help on ll/sc based architectures that get the
> > reservation on the whole cache line anyway?
>
> It does actually help here too, because it allows other operations to be
> regular load/stores.
>
> Look at the #if _Q_PENDING_BITS==8 in qspinlock.c, as opposed to the
> #else where they're all atomic_*().

Oh, is that safe with an xchg() implementation that operates on the whole
32-bit word when a concurrent thread can do a simple store to one half of it?

The ARM architecture reference says "It is UNPREDICTABLE whether the
transition from Exclusive Access to Open Access state occurs when the
Store or StoreExcl is from another observer.", which sounds to me
like the xchg_small() trick would not work with the qspinlock
implementation on ARM. [I see that mips, openrisc and xtensa do this,
but did not try to find out whether they have ll/sc semantics that make
it safe when another thread does a plain store to the reservation]

OTOH, I suppose we could just set _Q_PENDING_BITS to 1
regardless of NR_CPUS on any architecture without a 16-bit
xchg() primitive.

       Arnd


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09 12:00                 ` Arnd Bergmann
@ 2019-10-09 12:06                   ` Peter Zijlstra
  2019-10-09 12:52                     ` Will Deacon
  2019-10-09 13:50                     ` Arnd Bergmann
  0 siblings, 2 replies; 22+ messages in thread
From: Peter Zijlstra @ 2019-10-09 12:06 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Russell King, Sebastian Andrzej Siewior, Ingo Molnar,
	Waiman Long, Will Deacon, Linux ARM

On Wed, Oct 09, 2019 at 02:00:05PM +0200, Arnd Bergmann wrote:
> On Wed, Oct 9, 2019 at 12:57 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > On Wed, Oct 09, 2019 at 12:31:24PM +0200, Arnd Bergmann wrote:
> > > On Wed, Oct 9, 2019 at 11:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > I assume the 16-bit xchg_relaxed() in qspinlock.c was meant as
> > > an optimization for x86 and other cmpxchg based architectures but
> > > doesn't actually help on ll/sc based architectures that get the
> > > reservation on the whole cache line anyway?
> >
> > It does actually help here too, because it allows other operations to be
> > regular load/stores.
> >
> > Look at the #if _Q_PENDING_BITS==8 in qspinlock.c, as opposed to the
> > #else where they're all atomic_*().
> 
> Oh, is that safe with an xchg() implementation that operates on the whole
> 32 bit when a concurrent thread can do a simple store to one half of it?

It had better be, otherwise LL/SC'd be broken. SC _must_ fail when there
is a contending store.

> The ARM architecture reference says "It is UNPREDICTABLE whether the
> transition from Exclusive Access to Open Access state occurs when the
> Store or StoreExcl is from another observer.", which sounds to me
> me like the xchg_small() trick would not work with the qspinlock
> implementation on ARM. [I see that mips, openrisc and xtensa do this,
> but did not try to find out whether they have ll/sc semantics that make
> it safe when another thread does a plain store to the reservation]

Will?


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09 12:06                   ` Peter Zijlstra
@ 2019-10-09 12:52                     ` Will Deacon
  2019-10-09 13:50                     ` Arnd Bergmann
  1 sibling, 0 replies; 22+ messages in thread
From: Will Deacon @ 2019-10-09 12:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Arnd Bergmann, Russell King, Sebastian Andrzej Siewior,
	Ingo Molnar, Waiman Long, Linux ARM

On Wed, Oct 09, 2019 at 02:06:39PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 09, 2019 at 02:00:05PM +0200, Arnd Bergmann wrote:
> > On Wed, Oct 9, 2019 at 12:57 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Wed, Oct 09, 2019 at 12:31:24PM +0200, Arnd Bergmann wrote:
> > > > On Wed, Oct 9, 2019 at 11:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > I assume the 16-bit xchg_relaxed() in qspinlock.c was meant as
> > > > an optimization for x86 and other cmpxchg based architectures but
> > > > doesn't actually help on ll/sc based architectures that get the
> > > > reservation on the whole cache line anyway?
> > >
> > > It does actually help here too, because it allows other operations to be
> > > regular load/stores.
> > >
> > > Look at the #if _Q_PENDING_BITS==8 in qspinlock.c, as opposed to the
> > > #else where they're all atomic_*().
> > 
> > Oh, is that safe with an xchg() implementation that operates on the whole
> > 32 bit when a concurrent thread can do a simple store to one half of it?
> 
> It had better be, otherwise LL/SC'd be broken. SC _must_ fail when there
> is a contending store.
> 
> > The ARM architecture reference says "It is UNPREDICTABLE whether the
> > transition from Exclusive Access to Open Access state occurs when the
> > Store or StoreExcl is from another observer.", which sounds to me
> > me like the xchg_small() trick would not work with the qspinlock
> > implementation on ARM. [I see that mips, openrisc and xtensa do this,
> > but did not try to find out whether they have ll/sc semantics that make
> > it safe when another thread does a plain store to the reservation]
> 
> Will?

I think this is the documentation being unhelpful -- there is a section
about the "local monitor", which this text applies to, but then there's
also a section about the "global monitor", which has a state machine
diagram in the section called "Clear global monitor event". This diagram
shows how the global monitor transitions from "Exclusive access" to "Open
access" in response to a store from a different observer to the "marked
address".

Will


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09 12:06                   ` Peter Zijlstra
  2019-10-09 12:52                     ` Will Deacon
@ 2019-10-09 13:50                     ` Arnd Bergmann
  2019-10-09 21:42                       ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2019-10-09 13:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Russell King, Sebastian Andrzej Siewior, Ingo Molnar,
	Waiman Long, Will Deacon, Linux ARM

On Wed, Oct 9, 2019 at 2:06 PM Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Oct 09, 2019 at 02:00:05PM +0200, Arnd Bergmann wrote:
> > On Wed, Oct 9, 2019 at 12:57 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Wed, Oct 09, 2019 at 12:31:24PM +0200, Arnd Bergmann wrote:
> > > > On Wed, Oct 9, 2019 at 11:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > I assume the 16-bit xchg_relaxed() in qspinlock.c was meant as
> > > > an optimization for x86 and other cmpxchg based architectures but
> > > > doesn't actually help on ll/sc based architectures that get the
> > > > reservation on the whole cache line anyway?
> > >
> > > It does actually help here too, because it allows other operations to be
> > > regular load/stores.
> > >
> > > Look at the #if _Q_PENDING_BITS==8 in qspinlock.c, as opposed to the
> > > #else where they're all atomic_*().
> >
> > Oh, is that safe with an xchg() implementation that operates on the whole
> > 32 bit when a concurrent thread can do a simple store to one half of it?
>
> It had better be, otherwise LL/SC'd be broken. SC _must_ fail when there
> is a contending store.

Ok. I looked a bit at the other implementations that do xchg16() through
cmpxchg32(), and it seems it would be easiest to reuse the SuperH version,
which is fully portable: move arch/sh/include/asm/cmpxchg-xchg.h into
include/asm-generic/. The same change would allow a number of other
architectures to use the generic qspinlock instead of their own locks.
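
The technique in that header is roughly the following: a sketch of a
small xchg() built from a 32-bit cmpxchg(), paraphrased rather than
copied verbatim:

	/*
	 * Exchange a 1- or 2-byte value using only a 32-bit cmpxchg() on
	 * the aligned word that contains it.
	 */
	static inline u32 __xchg_cmpxchg(volatile void *ptr, u32 x, int size)
	{
		int off = (unsigned long)ptr % sizeof(u32);
		volatile u32 *p = ptr - off;
	#ifdef __BIG_ENDIAN
		int bitoff = (sizeof(u32) - size - off) * BITS_PER_BYTE;
	#else
		int bitoff = off * BITS_PER_BYTE;
	#endif
		u32 bitmask = ((0x1 << size * BITS_PER_BYTE) - 1) << bitoff;
		u32 oldv, newv, ret;

		do {
			oldv = READ_ONCE(*p);
			ret = (oldv & bitmask) >> bitoff;
			newv = (oldv & ~bitmask) | (x << bitoff);
		} while (cmpxchg(p, oldv, newv) != oldv);

		return ret;
	}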

Sebastian, do you want to try doing it that way?

     Arnd


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM
  2019-10-09 13:50                     ` Arnd Bergmann
@ 2019-10-09 21:42                       ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-09 21:42 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Peter Zijlstra, Russell King, Ingo Molnar, Waiman Long,
	Will Deacon, Linux ARM

On 2019-10-09 15:50:24 [+0200], Arnd Bergmann wrote:
> Ok. I looked a bit at the other implementations that do xchg16() through
> cmpxchg32(), and it seems it would be easiest to reuse the superh version,
> which is fully portable by moving arch/sh/include/asm/cmpxchg-xchg.h
> into include/asm-generic/, the same thing would allow us to change a
> number of other architectures to use the generic qspinlock instead of
> their own locks.
> 
> Sebastian, do you want to try doing it that way?

sounds good, I'm on it.

>      Arnd

Sebastian


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-10-09 21:42 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
2019-10-07 21:44 [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Sebastian Andrzej Siewior
2019-10-07 21:44 ` [PATCH 1/3] ARM: Use qrwlock implementation Sebastian Andrzej Siewior
2019-10-07 21:44 ` [PATCH 2/3] ARM: Use qspinlock implementation Sebastian Andrzej Siewior
2019-10-07 21:44 ` [PATCH 3/3] ARM: Inline locking functions for !PREEMPTION Sebastian Andrzej Siewior
2019-10-08 11:42 ` [RFC PATCH 0/3] Queued spinlocks/RW-locks for ARM Arnd Bergmann
2019-10-08 13:36   ` Waiman Long
2019-10-08 14:32     ` Arnd Bergmann
2019-10-08 19:47       ` Sebastian Andrzej Siewior
2019-10-08 21:47         ` Arnd Bergmann
2019-10-08 22:02           ` Sebastian Andrzej Siewior
2019-10-09  8:15             ` Arnd Bergmann
2019-10-09  8:46       ` Peter Zijlstra
2019-10-09  8:57         ` Arnd Bergmann
2019-10-09  9:31           ` Peter Zijlstra
2019-10-09 10:31             ` Arnd Bergmann
2019-10-09 10:56               ` Peter Zijlstra
2019-10-09 12:00                 ` Arnd Bergmann
2019-10-09 12:06                   ` Peter Zijlstra
2019-10-09 12:52                     ` Will Deacon
2019-10-09 13:50                     ` Arnd Bergmann
2019-10-09 21:42                       ` Sebastian Andrzej Siewior
2019-10-08 19:32   ` Sebastian Andrzej Siewior
