kvm.vger.kernel.org archive mirror
* [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM
@ 2021-11-18 18:46 Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README Alex Bennée
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée

Hi,

It's been a long time since I last posted these but I'd like to
incorporate some MTTCG tests into QEMU's upstream acceptance tests and
a first step is getting these up-streamed. Most of the changes are
fixing up the numerous checkpatch failures (although isaac remains
unchanged and some warnings make no sense for kvm-unit-tests).

I dropped an additional test which attempts to test for data flush
behaviour but it still needs some work:

  https://github.com/stsquad/kvm-unit-tests/commit/712eb3a287df24cdeff00ef966d68aef6ff2b8eb

Alex Bennée (10):
  docs: mention checkpatch in the README
  arm/flat.lds: don't drop debug during link
  Makefile: add GNU global tags support
  run_tests.sh: add --config option for alt test set
  lib: add isaac prng library from CCAN
  arm/tlbflush-code: TLB flush during code execution
  arm/locking-tests: add comprehensive locking test
  arm/barrier-litmus-tests: add simple mp and sal litmus tests
  arm/run: use separate --accel form
  arm/tcg-test: some basic TCG exercising tests

 arm/run                   |   4 +-
 run_tests.sh              |  11 +-
 Makefile                  |   5 +-
 arm/Makefile.arm          |   2 +
 arm/Makefile.arm64        |   2 +
 arm/Makefile.common       |   6 +-
 lib/arm/asm/barrier.h     |  61 ++++++
 lib/arm64/asm/barrier.h   |  50 +++++
 lib/prng.h                |  82 +++++++
 lib/prng.c                | 162 ++++++++++++++
 arm/flat.lds              |   1 -
 arm/tcg-test-asm.S        | 171 +++++++++++++++
 arm/tcg-test-asm64.S      | 170 ++++++++++++++
 arm/barrier-litmus-test.c | 450 ++++++++++++++++++++++++++++++++++++++
 arm/locking-test.c        | 322 +++++++++++++++++++++++++++
 arm/spinlock-test.c       |  87 --------
 arm/tcg-test.c            | 338 ++++++++++++++++++++++++++++
 arm/tlbflush-code.c       | 209 ++++++++++++++++++
 arm/mttcgtests.cfg        | 176 +++++++++++++++
 README.md                 |   2 +
 20 files changed, 2216 insertions(+), 95 deletions(-)
 create mode 100644 lib/prng.h
 create mode 100644 lib/prng.c
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/barrier-litmus-test.c
 create mode 100644 arm/locking-test.c
 delete mode 100644 arm/spinlock-test.c
 create mode 100644 arm/tcg-test.c
 create mode 100644 arm/tlbflush-code.c
 create mode 100644 arm/mttcgtests.cfg

-- 
2.30.2



* [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-24 11:06   ` Andrew Jones
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 02/10] arm/flat.lds: don't drop debug during link Alex Bennée
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index b498aaf..5db48e5 100644
--- a/README.md
+++ b/README.md
@@ -182,3 +182,5 @@ the code files.  We also start with common code and finish with unit test
 code. git-diff's orderFile feature allows us to specify the order in a
 file.  The orderFile we use is `scripts/git.difforder`; adding the config
 with `git config diff.orderFile scripts/git.difforder` enables it.
+
+Please run the kernel's ./scripts/checkpatch.pl on new patches
-- 
2.30.2



* [kvm-unit-tests PATCH v8 02/10] arm/flat.lds: don't drop debug during link
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 03/10] Makefile: add GNU global tags support Alex Bennée
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée

It is useful to keep the debug information in the .elf file so we can
debug with it; it doesn't get copied across to the final .flat file
anyway. Of course we still need to apply the offset when we load the
symbols, based on where QEMU decided to load the kernel.

  (gdb) symbol-file ./builds/arm64/arm/tlbflush-data.elf -o 0x40080000

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 arm/flat.lds | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arm/flat.lds b/arm/flat.lds
index 6fb459e..47fcb64 100644
--- a/arm/flat.lds
+++ b/arm/flat.lds
@@ -62,7 +62,6 @@ SECTIONS
     /DISCARD/ : {
         *(.note*)
         *(.interp)
-        *(.debug*)
         *(.comment)
         *(.dynamic)
     }
-- 
2.30.2



* [kvm-unit-tests PATCH v8 03/10] Makefile: add GNU global tags support
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 02/10] arm/flat.lds: don't drop debug during link Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set Alex Bennée
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée

If you have ctags you might as well offer gtags as a target.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 Makefile | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index b80c31f..0b7c03a 100644
--- a/Makefile
+++ b/Makefile
@@ -122,6 +122,9 @@ cscope:
 		-name '*.[chsS]' -exec realpath --relative-base=$(CURDIR) {} \; | sort -u > ./cscope.files
 	cscope -bk
 
-.PHONY: tags
+.PHONY: tags gtags
 tags:
 	ctags -R
+
+gtags:
+	gtags
-- 
2.30.2



* [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
                   ` (2 preceding siblings ...)
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 03/10] Makefile: add GNU global tags support Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-24 16:48   ` Andrew Jones
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 05/10] lib: add isaac prng library from CCAN Alex Bennée
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée

The upcoming MTTCG tests don't need to be run for normal KVM unit
tests so let's add the facility to have a custom set of tests.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 run_tests.sh | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/run_tests.sh b/run_tests.sh
index 9f233c5..b1088d2 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -15,7 +15,7 @@ function usage()
 {
 cat <<EOF
 
-Usage: $0 [-h] [-v] [-a] [-g group] [-j NUM-TASKS] [-t]
+Usage: $0 [-h] [-v] [-a] [-g group] [-j NUM-TASKS] [-t] [-c CONFIG]
 
     -h, --help      Output this help text
     -v, --verbose   Enables verbose mode
@@ -24,6 +24,7 @@ Usage: $0 [-h] [-v] [-a] [-g group] [-j NUM-TASKS] [-t]
     -g, --group     Only execute tests in the given group
     -j, --parallel  Execute tests in parallel
     -t, --tap13     Output test results in TAP format
+    -c, --config    Override default unittests.cfg
 
 Set the environment variable QEMU=/path/to/qemu-system-ARCH to
 specify the appropriate qemu binary for ARCH-run.
@@ -42,7 +43,7 @@ if [ $? -ne 4 ]; then
 fi
 
 only_tests=""
-args=$(getopt -u -o ag:htj:v -l all,group:,help,tap13,parallel:,verbose -- $*)
+args=$(getopt -u -o ag:htj:vc: -l all,group:,help,tap13,parallel:,verbose,config: -- $*)
 [ $? -ne 0 ] && exit 2;
 set -- $args;
 while [ $# -gt 0 ]; do
@@ -73,6 +74,10 @@ while [ $# -gt 0 ]; do
         -t | --tap13)
             tap_output="yes"
             ;;
+        -c | --config)
+            shift
+            config=$1
+            ;;
         --)
             ;;
         *)
@@ -152,7 +157,7 @@ function run_task()
 
 : ${unittest_log_dir:=logs}
 : ${unittest_run_queues:=1}
-config=$TEST_DIR/unittests.cfg
+: ${config:=$TEST_DIR/unittests.cfg}
 
 rm -rf $unittest_log_dir.old
 [ -d $unittest_log_dir ] && mv $unittest_log_dir $unittest_log_dir.old
-- 
2.30.2



* [kvm-unit-tests PATCH v8 05/10] lib: add isaac prng library from CCAN
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
                   ` (3 preceding siblings ...)
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 06/10] arm/tlbflush-code: TLB flush during code execution Alex Bennée
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée, Timothy B . Terriberry,
	Andrew Jones

It's often useful to introduce some sort of random variation when
testing several racing CPU conditions. Instead of each test implementing
some half-arsed PRNG, bring in a decent one which has good statistical
randomness. Obviously it is deterministic for a given seed value which
is likely the behaviour you want.

I've pulled in the ISAAC library from CCAN:

    http://ccodearchive.net/info/isaac.html

I shaved off the float related stuff which is less useful for unit
testing and re-indented to fit the style. The original license was
CC0 (Public Domain) which is compatible with the LGPL v2 of
kvm-unit-tests.
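
As a rough illustration (not part of this patch), the bounded draw that
isaac_next_uint() performs can be sketched in standalone C. The names
demo_rng, demo_next_uint32 and demo_next_uint are hypothetical, and the
generator is a plain xorshift32 stand-in rather than ISAAC, chosen only so
the sketch builds without lib/prng.c. The rejection test is the part that
mirrors the library.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in 32-bit generator (xorshift32), NOT ISAAC: an assumption made
 * only so this sketch compiles on its own. The state must be non-zero.
 */
struct demo_rng {
	uint32_t state;
};

static uint32_t demo_next_uint32(struct demo_rng *rng)
{
	uint32_t x = rng->state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	rng->state = x;
	return x;
}

/*
 * Uniform value in [0, n), using the same rejection test as
 * isaac_next_uint(): a draw is discarded when it falls in the final,
 * incomplete block of n values just below 2**32, which is detected by
 * d + n - 1 wrapping around.
 */
static uint32_t demo_next_uint(struct demo_rng *rng, uint32_t n)
{
	uint32_t r, v, d;

	assert(n > 0);
	do {
		r = demo_next_uint32(rng);
		v = r % n;
		d = r - v;	/* start of the block of n values holding r */
	} while ((uint32_t)(d + n - 1) < d);
	return v;
}
```

A test would typically seed one context per CPU and call the bounded
variant to pick a delay or an array index; the same seed always yields the
same sequence, which is what makes failures reproducible.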

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
CC: Timothy B. Terriberry <tterribe@xiph.org>
Acked-by: Andrew Jones <drjones@redhat.com>
---
 arm/Makefile.common |   1 +
 lib/prng.h          |  82 ++++++++++++++++++++++
 lib/prng.c          | 162 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 245 insertions(+)
 create mode 100644 lib/prng.h
 create mode 100644 lib/prng.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 38385e0..99bcf3f 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -44,6 +44,7 @@ cflatobjs += lib/pci-testdev.o
 cflatobjs += lib/virtio.o
 cflatobjs += lib/virtio-mmio.o
 cflatobjs += lib/chr-testdev.o
+cflatobjs += lib/prng.o
 cflatobjs += lib/arm/io.o
 cflatobjs += lib/arm/setup.o
 cflatobjs += lib/arm/mmu.o
diff --git a/lib/prng.h b/lib/prng.h
new file mode 100644
index 0000000..bf5776d
--- /dev/null
+++ b/lib/prng.h
@@ -0,0 +1,82 @@
+/*
+ * PRNG Header
+ */
+#ifndef __PRNG_H__
+#define __PRNG_H__
+
+# include <stdint.h>
+
+
+
+typedef struct isaac_ctx isaac_ctx;
+
+
+
+/*This value may be lowered to reduce memory usage on embedded platforms, at
+  the cost of reducing security and increasing bias.
+  Quoting Bob Jenkins: "The current best guess is that bias is detectable after
+  2**37 values for [ISAAC_SZ_LOG]=3, 2**45 for 4, 2**53 for 5, 2**61 for 6,
+  2**69 for 7, and 2**77 values for [ISAAC_SZ_LOG]=8."*/
+#define ISAAC_SZ_LOG      (8)
+#define ISAAC_SZ          (1<<ISAAC_SZ_LOG)
+#define ISAAC_SEED_SZ_MAX (ISAAC_SZ<<2)
+
+
+
+/*ISAAC is the most advanced of a series of pseudo-random number generators
+  designed by Robert J. Jenkins Jr. in 1996.
+  http://www.burtleburtle.net/bob/rand/isaac.html
+  To quote:
+  No efficient method is known for deducing their internal states.
+  ISAAC requires an amortized 18.75 instructions to produce a 32-bit value.
+  There are no cycles in ISAAC shorter than 2**40 values.
+  The expected cycle length is 2**8295 values.*/
+struct isaac_ctx{
+	unsigned n;
+	uint32_t r[ISAAC_SZ];
+	uint32_t m[ISAAC_SZ];
+	uint32_t a;
+	uint32_t b;
+	uint32_t c;
+};
+
+
+/**
+ * isaac_init - Initialize an instance of the ISAAC random number generator.
+ * @_ctx:   The instance to initialize.
+ * @_seed:  The specified seed bytes.
+ *          This may be NULL if _nseed is less than or equal to zero.
+ * @_nseed: The number of bytes to use for the seed.
+ *          If this is greater than ISAAC_SEED_SZ_MAX, the extra bytes are
+ *           ignored.
+ */
+void isaac_init(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed);
+
+/**
+ * isaac_reseed - Mix a new batch of entropy into the current state.
+ * To reset ISAAC to a known state, call isaac_init() again instead.
+ * @_ctx:   The instance to reseed.
+ * @_seed:  The specified seed bytes.
+ *          This may be NULL if _nseed is zero.
+ * @_nseed: The number of bytes to use for the seed.
+ *          If this is greater than ISAAC_SEED_SZ_MAX, the extra bytes are
+ *           ignored.
+ */
+void isaac_reseed(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed);
+/**
+ * isaac_next_uint32 - Return the next random 32-bit value.
+ * @_ctx: The ISAAC instance to generate the value with.
+ */
+uint32_t isaac_next_uint32(isaac_ctx *_ctx);
+/**
+ * isaac_next_uint - Uniform random integer less than the given value.
+ * @_ctx: The ISAAC instance to generate the value with.
+ * @_n:   The upper bound on the range of numbers returned (not inclusive).
+ *        This must be greater than zero and less than 2**32.
+ *        To return integers in the full range 0...2**32-1, use
+ *         isaac_next_uint32() instead.
+ * Return: An integer uniformly distributed between 0 and _n-1 (inclusive).
+ */
+uint32_t isaac_next_uint(isaac_ctx *_ctx,uint32_t _n);
+
+#endif
diff --git a/lib/prng.c b/lib/prng.c
new file mode 100644
index 0000000..ebd6df7
--- /dev/null
+++ b/lib/prng.c
@@ -0,0 +1,162 @@
+/*
+ * Pseudo Random Number Generator
+ *
+ * Lifted from ccan modules ilog/isaac under CC0
+ *   - http://ccodearchive.net/info/isaac.html
+ *   - http://ccodearchive.net/info/ilog.html
+ *
+ * And lightly hacked to compile under the KVM unit test environment.
+ * This provides a handy RNG for torture tests that want to vary
+ * delays and the like.
+ *
+ */
+
+/*Written by Timothy B. Terriberry (tterribe@xiph.org) 1999-2009.
+  CC0 (Public domain) - see LICENSE file for details
+  Based on the public domain implementation by Robert J. Jenkins Jr.*/
+
+#include "libcflat.h"
+
+#include <string.h>
+#include "prng.h"
+
+#define ISAAC_MASK        (0xFFFFFFFFU)
+
+/* Extract ISAAC_SZ_LOG bits (starting at bit 2). */
+static inline uint32_t lower_bits(uint32_t x)
+{
+	return (x & ((ISAAC_SZ-1) << 2)) >> 2;
+}
+
+/* Extract next ISAAC_SZ_LOG bits (starting at bit ISAAC_SZ_LOG+2). */
+static inline uint32_t upper_bits(uint32_t y)
+{
+	return (y >> (ISAAC_SZ_LOG+2)) & (ISAAC_SZ-1);
+}
+
+static void isaac_update(isaac_ctx *_ctx){
+	uint32_t *m;
+	uint32_t *r;
+	uint32_t  a;
+	uint32_t  b;
+	uint32_t  x;
+	uint32_t  y;
+	int       i;
+	m=_ctx->m;
+	r=_ctx->r;
+	a=_ctx->a;
+	b=_ctx->b+(++_ctx->c);
+	for(i=0;i<ISAAC_SZ/2;i++){
+		x=m[i];
+		a=(a^a<<13)+m[i+ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a>>6)+m[i+ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a<<2)+m[i+ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a>>16)+m[i+ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+	}
+	for(i=ISAAC_SZ/2;i<ISAAC_SZ;i++){
+		x=m[i];
+		a=(a^a<<13)+m[i-ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a>>6)+m[i-ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a<<2)+m[i-ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a>>16)+m[i-ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+	}
+	_ctx->b=b;
+	_ctx->a=a;
+	_ctx->n=ISAAC_SZ;
+}
+
+static void isaac_mix(uint32_t _x[8]){
+	static const unsigned char SHIFT[8]={11,2,8,16,10,4,8,9};
+	int i;
+	for(i=0;i<8;i++){
+		_x[i]^=_x[(i+1)&7]<<SHIFT[i];
+		_x[(i+3)&7]+=_x[i];
+		_x[(i+1)&7]+=_x[(i+2)&7];
+		i++;
+		_x[i]^=_x[(i+1)&7]>>SHIFT[i];
+		_x[(i+3)&7]+=_x[i];
+		_x[(i+1)&7]+=_x[(i+2)&7];
+	}
+}
+
+
+void isaac_init(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed){
+	_ctx->a=_ctx->b=_ctx->c=0;
+	memset(_ctx->r,0,sizeof(_ctx->r));
+	isaac_reseed(_ctx,_seed,_nseed);
+}
+
+void isaac_reseed(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed){
+	uint32_t *m;
+	uint32_t *r;
+	uint32_t  x[8];
+	int       i;
+	int       j;
+	m=_ctx->m;
+	r=_ctx->r;
+	if(_nseed>ISAAC_SEED_SZ_MAX)_nseed=ISAAC_SEED_SZ_MAX;
+	for(i=0;i<_nseed>>2;i++){
+		r[i]^=(uint32_t)_seed[i<<2|3]<<24|(uint32_t)_seed[i<<2|2]<<16|
+			(uint32_t)_seed[i<<2|1]<<8|_seed[i<<2];
+	}
+	_nseed-=i<<2;
+	if(_nseed>0){
+		uint32_t ri;
+		ri=_seed[i<<2];
+		for(j=1;j<_nseed;j++)ri|=(uint32_t)_seed[i<<2|j]<<(j<<3);
+		r[i++]^=ri;
+	}
+	x[0]=x[1]=x[2]=x[3]=x[4]=x[5]=x[6]=x[7]=0x9E3779B9U;
+	for(i=0;i<4;i++)isaac_mix(x);
+	for(i=0;i<ISAAC_SZ;i+=8){
+		for(j=0;j<8;j++)x[j]+=r[i+j];
+		isaac_mix(x);
+		memcpy(m+i,x,sizeof(x));
+	}
+	for(i=0;i<ISAAC_SZ;i+=8){
+		for(j=0;j<8;j++)x[j]+=m[i+j];
+		isaac_mix(x);
+		memcpy(m+i,x,sizeof(x));
+	}
+	isaac_update(_ctx);
+}
+
+uint32_t isaac_next_uint32(isaac_ctx *_ctx){
+	if(!_ctx->n)isaac_update(_ctx);
+	return _ctx->r[--_ctx->n];
+}
+
+uint32_t isaac_next_uint(isaac_ctx *_ctx,uint32_t _n){
+	uint32_t r;
+	uint32_t v;
+	uint32_t d;
+	do{
+		r=isaac_next_uint32(_ctx);
+		v=r%_n;
+		d=r-v;
+	}
+	while(((d+_n-1)&ISAAC_MASK)<d);
+	return v;
+}
-- 
2.30.2



* [kvm-unit-tests PATCH v8 06/10] arm/tlbflush-code: TLB flush during code execution
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
                   ` (4 preceding siblings ...)
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 05/10] lib: add isaac prng library from CCAN Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 07/10] arm/locking-tests: add comprehensive locking test Alex Bennée
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée, Mark Rutland

This adds a fairly brain dead torture test for TLB flushes intended
for stressing the MTTCG QEMU build. It takes the usual -smp option for
multiple CPUs.

By default CPU0 will do a TLBIALL flush after each cycle. You can
pass options via -append to control additional aspects of the test:

  - "page" flush each page in turn (one per function)
  - "self" do the flush after each computation cycle
  - "verbose" report progress on each computation cycle
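
For orientation, here is a minimal host-side sketch of how these options
map onto the report prefix (and hence onto names such as page_self in the
test configuration). The struct flush_opts and the helpers
parse_flush_opts() and flush_prefix() are hypothetical names used only for
this illustration; the test itself keeps the flags in static bools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three flags described above. */
struct flush_opts {
	bool page;	/* flush one function's page per call */
	bool self;	/* each CPU flushes after its own computation */
	bool verbose;	/* report progress periodically */
};

/* Scan the words passed via -append; unknown words are ignored. */
static struct flush_opts parse_flush_opts(int argc, char **argv)
{
	struct flush_opts opts = { false, false, false };

	for (int i = 0; i < argc; i++) {
		if (strcmp(argv[i], "page") == 0)
			opts.page = true;
		else if (strcmp(argv[i], "self") == 0)
			opts.self = true;
		else if (strcmp(argv[i], "verbose") == 0)
			opts.verbose = true;
	}
	return opts;
}

/* Build the report prefix, e.g. "tlbflush_all_other" with no options. */
static void flush_prefix(const struct flush_opts *opts, char *buf, size_t len)
{
	assert(len > 0);
	snprintf(buf, len, "tlbflush_%s_%s",
		 opts->page ? "page" : "all",
		 opts->self ? "self" : "other");
}
```

With no options the prefix is tlbflush_all_other; "verbose" only affects
logging and does not change the prefix.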

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
CC: Mark Rutland <mark.rutland@arm.com>

---
v2
  - rename to tlbflush-test
  - made makefile changes cleaner
  - added self/other flush mode
  - create specific prefix
  - whitespace fixes
v3
  - using new SMP framework for test running
v4
  - merge in the unittests.cfg
v5
  - max out at -smp 4
  - printf fmtfix
v7
  - rename to tlbflush-code
  - int -> bool flags
v8
  - kernel style fixes
  - move to separate mttcgtests.cfg
---
 arm/Makefile.common |   1 +
 arm/tlbflush-code.c | 209 ++++++++++++++++++++++++++++++++++++++++++++
 arm/mttcgtests.cfg  |  30 +++++++
 3 files changed, 240 insertions(+)
 create mode 100644 arm/tlbflush-code.c
 create mode 100644 arm/mttcgtests.cfg

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 99bcf3f..e3f04f2 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -12,6 +12,7 @@ tests-common += $(TEST_DIR)/gic.flat
 tests-common += $(TEST_DIR)/psci.flat
 tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
+tests-common += $(TEST_DIR)/tlbflush-code.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/tlbflush-code.c b/arm/tlbflush-code.c
new file mode 100644
index 0000000..ca98c82
--- /dev/null
+++ b/arm/tlbflush-code.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * TLB Flush Race Tests
+ *
+ * These tests are designed to test for incorrect TLB flush semantics
+ * under emulation. The initial CPU will set all the others working on a
+ * computation task and will then trigger TLB flushes across the
+ * system. It doesn't actually need to re-map anything but the flushes
+ * themselves will trigger QEMU's TCG self-modifying code detection
+ * which will invalidate any generated code causing re-translation.
+ * Eventually the code buffer will fill and a general tb_flush() will
+ * be triggered.
+ *
+ * Copyright (C) 2016-2021, Linaro, Alex Bennée <alex.bennee@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+
+#include <libcflat.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+
+#define SEQ_LENGTH 10
+#define SEQ_HASH 0x7cd707fe
+
+static cpumask_t smp_test_complete;
+static int flush_count = 1000000;
+static bool flush_self;
+static bool flush_page;
+static bool flush_verbose;
+
+/*
+ * Work functions
+ *
+ * These work functions need to be:
+ *
+ *  - page aligned, so we can flush one function at a time
+ *  - have branches, so QEMU TCG generates multiple basic blocks
+ *  - call across pages, so we exercise the TCG basic block slow path
+ */
+
+/* Adler32 */
+__attribute__((aligned(PAGE_SIZE))) static
+uint32_t hash_array(const void *buf, size_t buflen)
+{
+	const uint8_t *data = (uint8_t *) buf;
+	uint32_t s1 = 1;
+	uint32_t s2 = 0;
+
+	for (size_t n = 0; n < buflen; n++) {
+		s1 = (s1 + data[n]) % 65521;
+		s2 = (s2 + s1) % 65521;
+	}
+	return (s2 << 16) | s1;
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+void create_fib_sequence(int length, unsigned int *array)
+{
+	int i;
+
+	/* first two values */
+	array[0] = 0;
+	array[1] = 1;
+	for (i = 2; i < length; i++)
+		array[i] = array[i-2] + array[i-1];
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+unsigned long long factorial(unsigned int n)
+{
+	unsigned int i;
+	unsigned long long fac = 1;
+
+	for (i = 1; i <= n; i++)
+		fac = fac * i;
+	return fac;
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+void factorial_array(unsigned int n, unsigned int *input,
+		     unsigned long long *output)
+{
+	unsigned int i;
+
+	for (i = 0; i < n; i++)
+		output[i] = factorial(input[i]);
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+unsigned int do_computation(void)
+{
+	unsigned int fib_array[SEQ_LENGTH];
+	unsigned long long facfib_array[SEQ_LENGTH];
+	uint32_t fib_hash, facfib_hash;
+
+	create_fib_sequence(SEQ_LENGTH, &fib_array[0]);
+	fib_hash = hash_array(&fib_array[0], sizeof(fib_array));
+	factorial_array(SEQ_LENGTH, &fib_array[0], &facfib_array[0]);
+	facfib_hash = hash_array(&facfib_array[0], sizeof(facfib_array));
+
+	return (fib_hash ^ facfib_hash);
+}
+
+/* This provides a table of the work functions so we can flush each
+ * page individually
+ */
+static void *pages[] = {&hash_array, &create_fib_sequence, &factorial,
+			&factorial_array, &do_computation};
+
+static void do_flush(int i)
+{
+	if (flush_page)
+		flush_tlb_page((unsigned long)pages[i % ARRAY_SIZE(pages)]);
+	else
+		flush_tlb_all();
+}
+
+
+static void just_compute(void)
+{
+	int i, errors = 0;
+	int cpu = smp_processor_id();
+
+	uint32_t result;
+
+	printf("CPU%d online\n", cpu);
+
+	for (i = 0 ; i < flush_count; i++) {
+		result = do_computation();
+
+		if (result != SEQ_HASH) {
+			errors++;
+			printf("CPU%d: seq%d 0x%"PRIx32"!=0x%x\n",
+				cpu, i, result, SEQ_HASH);
+		}
+
+		if (flush_verbose && (i % 1000) == 0)
+			printf("CPU%d: seq%d\n", cpu, i);
+
+		if (flush_self)
+			do_flush(i);
+	}
+
+	report(errors == 0, "CPU%d: Done - Errors: %d\n", cpu, errors);
+
+	cpumask_set_cpu(cpu, &smp_test_complete);
+	if (cpu != 0)
+		halt();
+}
+
+static void just_flush(void)
+{
+	int cpu = smp_processor_id();
+	int i = 0;
+
+	/*
+	 * Set our CPU as done, keep flushing until everyone else
+	 * finished
+	 */
+	cpumask_set_cpu(cpu, &smp_test_complete);
+
+	while (!cpumask_full(&smp_test_complete))
+		do_flush(i++);
+
+	report_info("CPU%d: Done - Triggered %d flushes\n", cpu, i);
+}
+
+int main(int argc, char **argv)
+{
+	int cpu, i;
+	char prefix[100];
+
+	for (i = 0; i < argc; i++) {
+		char *arg = argv[i];
+
+		if (strcmp(arg, "page") == 0)
+			flush_page = true;
+
+		if (strcmp(arg, "self") == 0)
+			flush_self = true;
+
+		if (strcmp(arg, "verbose") == 0)
+			flush_verbose = true;
+	}
+
+	snprintf(prefix, sizeof(prefix), "tlbflush_%s_%s",
+		 flush_page?"page":"all",
+		 flush_self?"self":"other");
+	report_prefix_push(prefix);
+
+	for_each_present_cpu(cpu) {
+		if (cpu == 0)
+			continue;
+		smp_boot_secondary(cpu, just_compute);
+	}
+
+	if (flush_self)
+		just_compute();
+	else
+		just_flush();
+
+	while (!cpumask_full(&smp_test_complete))
+		cpu_relax();
+
+	return report_summary();
+}
diff --git a/arm/mttcgtests.cfg b/arm/mttcgtests.cfg
new file mode 100644
index 0000000..d3ff102
--- /dev/null
+++ b/arm/mttcgtests.cfg
@@ -0,0 +1,30 @@
+##############################################################################
+# MTTCG unit tests configuration
+#
+# These are torture tests for QEMU's Multi-threaded TCG (MTTCG) which
+# aim to trigger various races in its emulation code. You can run them
+# on a real system if you like but they shouldn't fail.
+#
+# See unittests.cfg for the file format
+##############################################################################
+
+# TLB Torture Tests
+[tlbflush-code::all_other]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+
+[tlbflush-code::page_other]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'page'
+
+[tlbflush-code::all_self]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'self'
+
+[tlbflush-code::page_self]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'page self'
+
-- 
2.30.2



* [kvm-unit-tests PATCH v8 07/10] arm/locking-tests: add comprehensive locking test
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
                   ` (5 preceding siblings ...)
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 06/10] arm/tlbflush-code: TLB flush during code execution Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 08/10] arm/barrier-litmus-tests: add simple mp and sal litmus tests Alex Bennée
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée

This test has been written mainly to stress multi-threaded TCG behaviour
but will demonstrate failure by default on real hardware. The test takes
the following parameters:

  - "lock" use GCC's locking semantics
  - "atomic" use GCC's __atomic primitives
  - "wfelock" use WaitForEvent sleep
  - "excl" use load/store exclusive semantics

Also two more options allow the test to be tweaked

  - "noshuffle" disables the memory shuffling
  - "count=%ld" set your own per-CPU increment count
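
To make the difference between the "lock" and "atomic" modes concrete,
here is a small standalone sketch, assuming a GCC-compatible compiler for
the __sync and __atomic builtins. The demo_* names are hypothetical
stand-ins for the test's shared_value and global_lock, and the sketch runs
on a single thread, so it only shows the shape of each increment rather
than any contention.

```c
#include <assert.h>

static unsigned int demo_shared;	/* stand-in for shared_value */
static int demo_lock;			/* stand-in for global_lock */

/* "lock": spin on a test-and-set lock around a plain increment. */
static void demo_inc_with_lock(void)
{
	while (__sync_lock_test_and_set(&demo_lock, 1))
		;	/* spin until the previous holder releases */
	demo_shared++;
	__sync_lock_release(&demo_lock);
}

/* "atomic": a single atomic read-modify-write, no lock required. */
static void demo_inc_with_atomic(void)
{
	__atomic_add_fetch(&demo_shared, 1, __ATOMIC_SEQ_CST);
}

/* Apply one increment function count times, as each CPU would. */
static unsigned int demo_run(void (*inc)(void), int count)
{
	assert(count >= 0);
	for (int i = 0; i < count; i++)
		inc();
	return demo_shared;
}
```

In the real test every CPU runs the chosen increment the same number of
times and the final shared value is compared against the number of CPUs
times the count; the plain, unlocked increment is the variant expected to
fall short.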

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v2
  - Don't use thumb style strexeq stuff
  - Add atomic and wfelock tests
  - Add count/noshuffle test controls
  - Move barrier tests to separate test file
v4
  - fix up unittests.cfg to use correct test name
  - move into "locking" group, remove barrier tests
  - use a table to add tests, mark which are expected to work
  - correctly report XFAIL
v5
  - max out at -smp 4 in unittests.cfg
v7
  - make test control flags bools
  - default the count to 100000 (so it doesn't timeout)
v8
  - rm spinlock test
  - fix checkpatch errors
  - fix report usage
---
 arm/Makefile.common |   2 +-
 arm/locking-test.c  | 322 ++++++++++++++++++++++++++++++++++++++++++++
 arm/spinlock-test.c |  87 ------------
 arm/mttcgtests.cfg  |  29 ++++
 4 files changed, 352 insertions(+), 88 deletions(-)
 create mode 100644 arm/locking-test.c
 delete mode 100644 arm/spinlock-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index e3f04f2..f905971 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -5,7 +5,6 @@
 #
 
 tests-common  = $(TEST_DIR)/selftest.flat
-tests-common += $(TEST_DIR)/spinlock-test.flat
 tests-common += $(TEST_DIR)/pci-test.flat
 tests-common += $(TEST_DIR)/pmu.flat
 tests-common += $(TEST_DIR)/gic.flat
@@ -13,6 +12,7 @@ tests-common += $(TEST_DIR)/psci.flat
 tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
+tests-common += $(TEST_DIR)/locking-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/locking-test.c b/arm/locking-test.c
new file mode 100644
index 0000000..eab9497
--- /dev/null
+++ b/arm/locking-test.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Locking Test
+ *
+ * This test allows us to stress the various atomic primitives of a VM
+ * guest. A number of methods are available that use various patterns
+ * to implement a lock.
+ *
+ * Copyright (C) 2017 Linaro
+ * Author: Alex Bennée <alex.bennee@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <libcflat.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+
+#include <prng.h>
+
+#define MAX_CPUS 8
+
+/* Test definition structure
+ *
+ * A simple structure that describes the test name, expected pass and
+ * increment function.
+ */
+
+/* Function pointers for test */
+typedef void (*inc_fn)(int cpu);
+
+typedef struct {
+	const char *test_name;
+	bool  should_pass;
+	inc_fn main_fn;
+} test_descr_t;
+
+/* How many increments to do */
+static int increment_count = 1000000;
+static bool do_shuffle = true;
+
+/* Shared value all the tests attempt to safely increment using
+ * various forms of atomic locking and exclusive behaviour.
+ */
+static unsigned int shared_value;
+
+/* PAGE_SIZE * uint32_t means we span several pages */
+__attribute__((aligned(PAGE_SIZE))) static uint32_t memory_array[PAGE_SIZE];
+
+/* We use the alignment of the following to ensure accesses to locking
+ * and synchronisation primitives don't interfere with the page of the
+ * shared value
+ */
+__attribute__((aligned(PAGE_SIZE))) static unsigned int per_cpu_value[MAX_CPUS];
+__attribute__((aligned(PAGE_SIZE))) static cpumask_t smp_test_complete;
+__attribute__((aligned(PAGE_SIZE))) struct isaac_ctx prng_context[MAX_CPUS];
+
+/* Some of the approaches use a global lock to prevent contention. */
+static int global_lock;
+
+/* In any SMP setting this *should* fail due to cores stepping on
+ * each other updating the shared variable
+ */
+static void increment_shared(int cpu)
+{
+	(void)cpu;
+
+	shared_value++;
+}
+
+/* GCC __sync primitives are deprecated in favour of __atomic */
+static void increment_shared_with_lock(int cpu)
+{
+	(void)cpu;
+
+	while (__sync_lock_test_and_set(&global_lock, 1));
+
+	shared_value++;
+
+	__sync_lock_release(&global_lock);
+}
+
+/*
+ * In practice even __ATOMIC_RELAXED uses ARM's ldxr/stxr exclusive
+ * semantics
+ */
+static void increment_shared_with_atomic(int cpu)
+{
+	(void)cpu;
+
+	__atomic_add_fetch(&shared_value, 1, __ATOMIC_SEQ_CST);
+}
+
+
+/*
+ * Load/store exclusive with WFE (wait-for-event)
+ *
+ * See ARMv8 ARM examples:
+ *   Use of Wait For Event (WFE) and Send Event (SEV) with locks
+ */
+
+static void increment_shared_with_wfelock(int cpu)
+{
+	(void)cpu;
+
+#if defined(__aarch64__)
+	asm volatile(
+	"	mov     w1, #1\n"
+	"       sevl\n"
+	"       prfm PSTL1KEEP, [%[lock]]\n"
+	"1:     wfe\n"
+	"	ldaxr	w0, [%[lock]]\n"
+	"	cbnz    w0, 1b\n"
+	"	stxr    w0, w1, [%[lock]]\n"
+	"	cbnz	w0, 1b\n"
+	/* lock held */
+	"	ldr	w0, [%[sptr]]\n"
+	"	add	w0, w0, #0x1\n"
+	"	str	w0, [%[sptr]]\n"
+	/* now release */
+	"	stlr	wzr, [%[lock]]\n"
+	: /* out */
+	: [lock] "r" (&global_lock), [sptr] "r" (&shared_value) /* in */
+	: "w0", "w1", "cc");
+#else
+	asm volatile(
+	"	mov     r1, #1\n"
+	"1:	ldrex	r0, [%[lock]]\n"
+	"	cmp     r0, #0\n"
+	"	wfene\n"
+	"	strexeq r0, r1, [%[lock]]\n"
+	"	cmpeq	r0, #0\n"
+	"	bne	1b\n"
+	"	dmb\n"
+	/* lock held */
+	"	ldr	r0, [%[sptr]]\n"
+	"	add	r0, r0, #0x1\n"
+	"	str	r0, [%[sptr]]\n"
+	/* now release */
+	"	mov	r0, #0\n"
+	"	dmb\n"
+	"	str	r0, [%[lock]]\n"
+	"	dsb\n"
+	"	sev\n"
+	: /* out */
+	: [lock] "r" (&global_lock), [sptr] "r" (&shared_value) /* in */
+	: "r0", "r1", "cc");
+#endif
+}
+
+
+/*
+ * Hand-written version of the load/store exclusive
+ */
+static void increment_shared_with_excl(int cpu)
+{
+	(void)cpu;
+
+#if defined(__aarch64__)
+	asm volatile(
+	"1:	ldxr	w0, [%[sptr]]\n"
+	"	add     w0, w0, #0x1\n"
+	"	stxr	w1, w0, [%[sptr]]\n"
+	"	cbnz	w1, 1b\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "w0", "w1", "cc");
+#else
+	asm volatile(
+	"1:	ldrex	r0, [%[sptr]]\n"
+	"	add     r0, r0, #0x1\n"
+	"	strex	r1, r0, [%[sptr]]\n"
+	"	cmp	r1, #0\n"
+	"	bne	1b\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "r0", "r1", "cc");
+#endif
+}
+
+/* Test array */
+static test_descr_t tests[] = {
+	{ "none", false, increment_shared },
+	{ "lock", true, increment_shared_with_lock },
+	{ "atomic", true, increment_shared_with_atomic },
+	{ "wfelock", true, increment_shared_with_wfelock },
+	{ "excl", true, increment_shared_with_excl }
+};
+
+/* The idea of this is just to generate some random load/store
+ * activity which may or may not race with an un-barriered increment
+ * of the shared counter
+ */
+static void shuffle_memory(int cpu)
+{
+	int i;
+	uint32_t lspat = isaac_next_uint32(&prng_context[cpu]);
+	uint32_t seq = isaac_next_uint32(&prng_context[cpu]);
+	int count = seq & 0x1f;
+	uint32_t val = 0;
+
+	seq >>= 5;
+
+	for (i = 0; i < count; i++) {
+		int index = seq & ~PAGE_MASK;
+
+		if (lspat & 1)
+			val ^= memory_array[index];
+		else
+			memory_array[index] = val;
+
+		seq >>= PAGE_SHIFT;
+		seq ^= lspat;
+		lspat >>= 1;
+	}
+
+}
+
+static inc_fn increment_function;
+
+static void do_increment(void)
+{
+	int i;
+	int cpu = smp_processor_id();
+
+	printf("CPU%d: online and ++ing\n", cpu);
+
+	for (i = 0; i < increment_count; i++) {
+		per_cpu_value[cpu]++;
+		increment_function(cpu);
+
+		if (do_shuffle)
+			shuffle_memory(cpu);
+	}
+
+	printf("CPU%d: Done, %d incs\n", cpu, per_cpu_value[cpu]);
+
+	cpumask_set_cpu(cpu, &smp_test_complete);
+	if (cpu != 0)
+		halt();
+}
+
+static void setup_and_run_test(test_descr_t *test)
+{
+	unsigned int i, sum = 0;
+	int cpu, cpu_cnt = 0;
+
+	increment_function = test->main_fn;
+
+	/* fill our random page */
+	for (i = 0; i < PAGE_SIZE; i++)
+		memory_array[i] = isaac_next_uint32(&prng_context[0]);
+
+	for_each_present_cpu(cpu) {
+		uint32_t seed2 = isaac_next_uint32(&prng_context[0]);
+
+		cpu_cnt++;
+		if (cpu == 0)
+			continue;
+
+		isaac_init(&prng_context[cpu], (unsigned char *) &seed2, sizeof(seed2));
+		smp_boot_secondary(cpu, do_increment);
+	}
+
+	do_increment();
+
+	while (!cpumask_full(&smp_test_complete))
+		cpu_relax();
+
+	/* All CPUs done, do the sums add up? */
+	for_each_present_cpu(cpu) {
+		sum += per_cpu_value[cpu];
+	}
+
+	if (test->should_pass)
+		report(sum == shared_value, "total incs %d", shared_value);
+	else
+		report_xfail(true, sum == shared_value, "total incs %d", shared_value);
+}
+
+int main(int argc, char **argv)
+{
+	static const unsigned char seed[] = "myseed";
+	test_descr_t *test = &tests[0];
+	int i;
+	unsigned int j;
+
+	isaac_init(&prng_context[0], &seed[0], sizeof(seed));
+
+	for (i = 0; i < argc; i++) {
+		char *arg = argv[i];
+
+		/* Check for test name */
+		for (j = 0; j < ARRAY_SIZE(tests); j++) {
+			if (strcmp(arg, tests[j].test_name) == 0)
+				test = &tests[j];
+		}
+
+		/* Test modifiers */
+		if (strcmp(arg, "noshuffle") == 0) {
+			do_shuffle = false;
+			report_prefix_push("noshuffle");
+		} else if (strstr(arg, "count=") != NULL) {
+			char *p = strstr(arg, "=");
+
+			increment_count = atol(p+1);
+		} else {
+			isaac_reseed(&prng_context[0], (unsigned char *) arg, strlen(arg));
+		}
+	}
+
+	if (test)
+		setup_and_run_test(test);
+	else
+		report(false, "Unknown test");
+
+	return report_summary();
+}
diff --git a/arm/spinlock-test.c b/arm/spinlock-test.c
deleted file mode 100644
index 73aea76..0000000
--- a/arm/spinlock-test.c
+++ /dev/null
@@ -1,87 +0,0 @@
-/*
- * Spinlock test
- *
- * This code is based on code from the tcg_baremetal_tests.
- *
- * Copyright (C) 2015 Virtual Open Systems SAS
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <libcflat.h>
-#include <asm/smp.h>
-#include <asm/barrier.h>
-
-#define LOOP_SIZE 10000000
-
-struct lock_ops {
-	void (*lock)(int *v);
-	void (*unlock)(int *v);
-};
-static struct lock_ops lock_ops;
-
-static void gcc_builtin_lock(int *lock_var)
-{
-	while (__sync_lock_test_and_set(lock_var, 1));
-}
-static void gcc_builtin_unlock(int *lock_var)
-{
-	__sync_lock_release(lock_var);
-}
-static void none_lock(int *lock_var)
-{
-	while (*(volatile int *)lock_var != 0);
-	*(volatile int *)lock_var = 1;
-}
-static void none_unlock(int *lock_var)
-{
-	*(volatile int *)lock_var = 0;
-}
-
-static int global_a, global_b;
-static int global_lock;
-
-static void test_spinlock(void *data __unused)
-{
-	int i, errors = 0;
-	int cpu = smp_processor_id();
-
-	printf("CPU%d online\n", cpu);
-
-	for (i = 0; i < LOOP_SIZE; i++) {
-
-		lock_ops.lock(&global_lock);
-
-		if (global_a == (cpu + 1) % 2) {
-			global_a = 1;
-			global_b = 0;
-		} else {
-			global_a = 0;
-			global_b = 1;
-		}
-
-		if (global_a == global_b)
-			errors++;
-
-		lock_ops.unlock(&global_lock);
-	}
-	report(errors == 0, "CPU%d: Done - Errors: %d", cpu, errors);
-}
-
-int main(int argc, char **argv)
-{
-	report_prefix_push("spinlock");
-	if (argc > 1 && strcmp(argv[1], "bad") != 0) {
-		lock_ops.lock = gcc_builtin_lock;
-		lock_ops.unlock = gcc_builtin_unlock;
-	} else {
-		lock_ops.lock = none_lock;
-		lock_ops.unlock = none_unlock;
-	}
-
-	on_cpus(test_spinlock, NULL);
-
-	return report_summary();
-}
diff --git a/arm/mttcgtests.cfg b/arm/mttcgtests.cfg
index d3ff102..46fcb57 100644
--- a/arm/mttcgtests.cfg
+++ b/arm/mttcgtests.cfg
@@ -28,3 +28,32 @@ file = tlbflush-code.flat
 smp = $(($MAX_SMP>4?4:$MAX_SMP))
 extra_params = -append 'page self'
 
+# Locking tests
+[locking::none]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+groups = locking
+
+[locking::lock]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'lock'
+groups = locking
+
+[locking::atomic]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'atomic'
+groups = locking
+
+[locking::wfelock]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'wfelock'
+groups = locking
+
+[locking::excl]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'excl'
+groups = locking
-- 
2.30.2



* [kvm-unit-tests PATCH v8 08/10] arm/barrier-litmus-tests: add simple mp and sal litmus tests
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
                   ` (6 preceding siblings ...)
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 07/10] arm/locking-tests: add comprehensive locking test Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-24 16:14   ` Andrew Jones
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 09/10] arm/run: use separate --accel form Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 10/10] arm/tcg-test: some basic TCG exercising tests Alex Bennée
  9 siblings, 1 reply; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée, Will Deacon

This adds a framework for adding simple barrier litmus tests against
ARM. The litmus tests aren't as comprehensive as the academic exercises
which will attempt to do all sorts of things to keep racing CPUs synced
up. These tests do honour the "sync" parameter to do a poor man's
equivalent.
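
As an illustration of the coarse synchronisation mentioned above, here is
a minimal sketch (not the code from this patch) of spinning until a
free-running counter crosses into the next time window, so that racing
CPUs start at roughly the same moment. The names wait_for_next_gate,
read_counter_fn and fake_counter are hypothetical, and the fake counter
only stands in for a real timer read:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type: returns the current counter value. */
typedef uint64_t (*read_counter_fn)(void);

/*
 * Spin until the counter moves into a new window of 2^gate_bits ticks
 * and return the counter value observed at that point.
 */
static uint64_t wait_for_next_gate(read_counter_fn read_counter,
				   unsigned int gate_bits)
{
	uint64_t gate_mask, gate, now;

	assert(gate_bits < 64);
	gate_mask = ~((UINT64_C(1) << gate_bits) - 1);

	gate = read_counter() & gate_mask;
	do {
		now = read_counter();
	} while ((now & gate_mask) == gate);

	return now;
}

/* Fake counter for demonstration only: advances 100 ticks per read. */
static uint64_t fake_counter_value;

static uint64_t fake_counter(void)
{
	fake_counter_value += 100;
	return fake_counter_value;
}
```

Every CPU that calls such a helper leaves the loop shortly after the same
window boundary, which is cheap but only approximate.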

The two litmus tests are:
  - message passing
  - store-after-load

They both have a case that should fail (although it won't on
single-threaded TCG setups). If barriers aren't working properly the
store-after-load test will fail even on an x86 backend, as x86 allows a
load to be re-ordered ahead of an earlier store to a different address.
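
To make the store-after-load pattern concrete, the following is a rough
sketch using portable C11 atomics rather than the barrier macros from this
series. The names sal_entry, sal_thread1, sal_thread2 and
sal_count_neither are hypothetical. With the fence enabled in both
threads, the outcome where neither thread sees the other's store should
not occur:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-iteration record, loosely modelled on the test array. */
struct sal_entry {
	atomic_uint x;
	atomic_uint y;
	unsigned int r[2];	/* value each thread loaded */
};

/* Thread 1: store to x, then load y. */
static void sal_thread1(struct sal_entry *e, int use_fence)
{
	atomic_store_explicit(&e->x, 1, memory_order_relaxed);
	if (use_fence)
		atomic_thread_fence(memory_order_seq_cst);
	e->r[0] = atomic_load_explicit(&e->y, memory_order_relaxed);
}

/* Thread 2: store to y, then load x. */
static void sal_thread2(struct sal_entry *e, int use_fence)
{
	atomic_store_explicit(&e->y, 1, memory_order_relaxed);
	if (use_fence)
		atomic_thread_fence(memory_order_seq_cst);
	e->r[1] = atomic_load_explicit(&e->x, memory_order_relaxed);
}

/* Count entries where neither thread saw the other's store. */
static int sal_count_neither(const struct sal_entry *entries, int count)
{
	int i, neither = 0;

	assert(count >= 0);
	for (i = 0; i < count; i++) {
		if (entries[i].r[0] == 0 && entries[i].r[1] == 0)
			neither++;
	}

	return neither;
}
```

A non-zero "neither" count with the fences in place would point at broken
barrier handling; without the fences it is an allowed outcome.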

I've imported a few more of the barrier primitives from the Linux source
tree so we consistently use macros.

The arm64 barrier primitives trip up on -Wstrict-aliasing so this is
disabled in the Makefile.
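
For readers unfamiliar with the acquire/release flavour of message
passing, here is a minimal sketch of the pattern. It assumes GCC's
__atomic builtins instead of the Linux-style primitives imported by this
patch, and the names store_release, load_acquire, mp_entry, mp_write and
mp_read are hypothetical:

```c
#include <assert.h>

/* Hypothetical stand-ins for release/acquire helpers (GCC builtins). */
#define store_release(p, v)	__atomic_store_n((p), (v), __ATOMIC_RELEASE)
#define load_acquire(p)		__atomic_load_n((p), __ATOMIC_ACQUIRE)

struct mp_entry {
	unsigned int x;	/* message data */
	unsigned int y;	/* flag: data is ready */
};

/* Writer: publish the data, then raise the flag with release semantics. */
static void mp_write(struct mp_entry *e)
{
	__atomic_store_n(&e->x, 1, __ATOMIC_RELAXED);
	store_release(&e->y, 1);
}

/*
 * Reader: returns 1 if the flag was seen but the data was stale, which
 * the acquire/release pairing should make impossible. The number of
 * times the flag was seen set is accumulated in *ready.
 */
static int mp_read(struct mp_entry *e, unsigned int *ready)
{
	unsigned int x, y;

	assert(e && ready);
	y = load_acquire(&e->y);
	x = __atomic_load_n(&e->x, __ATOMIC_RELAXED);

	*ready += y;
	return y && !x;
}
```

The release store orders the write of x before the flag, and the acquire
load orders the flag read before the read of x, so a reader that sees the
flag must also see the data.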

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
CC: Will Deacon <will@kernel.org>

---
v8
  - move to mttcgtests.cfg
  - fix checkpatch issues
  - fix report usage
v7
  - merge in store-after-load
  - clean-up sync-up code
  - use new counter api
  - fix xfail for sal test
v6
  - add a unittest.cfg
  - -fno-strict-aliasing
---
 arm/Makefile.common       |   1 +
 lib/arm/asm/barrier.h     |  61 ++++++
 lib/arm64/asm/barrier.h   |  50 +++++
 arm/barrier-litmus-test.c | 450 ++++++++++++++++++++++++++++++++++++++
 arm/mttcgtests.cfg        |  33 +++
 5 files changed, 595 insertions(+)
 create mode 100644 arm/barrier-litmus-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index f905971..861e5c7 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -13,6 +13,7 @@ tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/locking-test.flat
+tests-common += $(TEST_DIR)/barrier-litmus-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/lib/arm/asm/barrier.h b/lib/arm/asm/barrier.h
index 7f86831..2870080 100644
--- a/lib/arm/asm/barrier.h
+++ b/lib/arm/asm/barrier.h
@@ -8,6 +8,8 @@
  * This work is licensed under the terms of the GNU GPL, version 2.
  */
 
+#include <stdint.h>
+
 #define sev()		asm volatile("sev" : : : "memory")
 #define wfe()		asm volatile("wfe" : : : "memory")
 #define wfi()		asm volatile("wfi" : : : "memory")
@@ -25,4 +27,63 @@
 #define smp_rmb()	smp_mb()
 #define smp_wmb()	dmb(ishst)
 
+extern void abort(void);
+
+static inline void __write_once_size(volatile void *p, void *res, int size)
+{
+	switch (size) {
+	case 1: *(volatile uint8_t *)p = *(uint8_t *)res; break;
+	case 2: *(volatile uint16_t *)p = *(uint16_t *)res; break;
+	case 4: *(volatile uint32_t *)p = *(uint32_t *)res; break;
+	case 8: *(volatile uint64_t *)p = *(uint64_t *)res; break;
+	default:
+		/* unhandled case */
+		abort();
+	}
+}
+
+#define WRITE_ONCE(x, val) \
+({							\
+	union { typeof(x) __val; char __c[1]; } __u =	\
+		{ .__val = (typeof(x)) (val) }; \
+	__write_once_size(&(x), __u.__c, sizeof(x));	\
+	__u.__val;					\
+})
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	WRITE_ONCE(*p, v);						\
+} while (0)
+
+
+static inline
+void __read_once_size(const volatile void *p, void *res, int size)
+{
+	switch (size) {
+	case 1: *(uint8_t *)res = *(volatile uint8_t *)p; break;
+	case 2: *(uint16_t *)res = *(volatile uint16_t *)p; break;
+	case 4: *(uint32_t *)res = *(volatile uint32_t *)p; break;
+	case 8: *(uint64_t *)res = *(volatile uint64_t *)p; break;
+	default:
+		/* unhandled case */
+		abort();
+	}
+}
+
+#define READ_ONCE(x)							\
+({									\
+	union { typeof(x) __val; char __c[1]; } __u;			\
+	__read_once_size(&(x), __u.__c, sizeof(x));			\
+	__u.__val;							\
+})
+
+
+#define smp_load_acquire(p)						\
+({									\
+	typeof(*p) ___p1 = READ_ONCE(*p);				\
+	smp_mb();							\
+	___p1;								\
+})
+
 #endif /* _ASMARM_BARRIER_H_ */
diff --git a/lib/arm64/asm/barrier.h b/lib/arm64/asm/barrier.h
index 0e1904c..5e40519 100644
--- a/lib/arm64/asm/barrier.h
+++ b/lib/arm64/asm/barrier.h
@@ -24,4 +24,54 @@
 #define smp_rmb()	dmb(ishld)
 #define smp_wmb()	dmb(ishst)
 
+#define smp_store_release(p, v)						\
+do {									\
+	switch (sizeof(*p)) {						\
+	case 1:								\
+		asm volatile ("stlrb %w1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	case 2:								\
+		asm volatile ("stlrh %w1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	case 4:								\
+		asm volatile ("stlr %w1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	case 8:								\
+		asm volatile ("stlr %1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	}								\
+} while (0)
+
+#define smp_load_acquire(p)						\
+({									\
+	union { typeof(*p) __val; char __c[1]; } __u;			\
+	switch (sizeof(*p)) {						\
+	case 1:								\
+		asm volatile ("ldarb %w0, %1"				\
+			: "=r" (*(u8 *)__u.__c)				\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 2:								\
+		asm volatile ("ldarh %w0, %1"				\
+			: "=r" (*(u16 *)__u.__c)			\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 4:								\
+		asm volatile ("ldar %w0, %1"				\
+			: "=r" (*(u32 *)__u.__c)			\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 8:								\
+		asm volatile ("ldar %0, %1"				\
+			: "=r" (*(u64 *)__u.__c)			\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	}								\
+	__u.__val;							\
+})
+
 #endif /* _ASMARM64_BARRIER_H_ */
diff --git a/arm/barrier-litmus-test.c b/arm/barrier-litmus-test.c
new file mode 100644
index 0000000..e90f6dd
--- /dev/null
+++ b/arm/barrier-litmus-test.c
@@ -0,0 +1,450 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ARM Barrier Litmus Tests
+ *
+ * This test provides a framework for testing barrier conditions on
+ * the processor. It's simpler than the more involved barrier testing
+ * frameworks as we are looking for simple failures of QEMU's TCG not
+ * weird edge cases the silicon gets wrong.
+ */
+
+#include <libcflat.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+
+#define MAX_CPUS 8
+
+/* Array size and access controls */
+static int array_size = 100000;
+static int wait_if_ahead;
+
+static cpumask_t cpu_mask;
+
+/*
+ * These test_array_* structures are a contiguous array modified by two or more
+ * competing CPUs. The padding is to ensure the variables do not share
+ * cache lines.
+ *
+ * All structures start zeroed.
+ */
+
+typedef struct test_array {
+	volatile unsigned int x;
+	uint8_t dummy[64];
+	volatile unsigned int y;
+	uint8_t dummy2[64];
+	volatile unsigned int r[MAX_CPUS];
+} test_array;
+
+volatile test_array *array;
+
+/* Test definition structure
+ *
+ * The first function will always run on the primary CPU, it is
+ * usually the one that will detect any weirdness and trigger the
+ * failure of the test.
+ */
+
+typedef void (*test_fn)(void);
+
+typedef struct {
+	const char *test_name;
+	bool  should_pass;
+	test_fn main_fn;
+	test_fn secondary_fns[MAX_CPUS-1];
+} test_descr_t;
+
+/* Litmus tests */
+
+static unsigned long sync_start(void)
+{
+	const unsigned long gate_mask = ~0x3ffff;
+	unsigned long gate, now;
+
+	gate = get_cntvct() & gate_mask;
+	do {
+		now = get_cntvct();
+	} while ((now & gate_mask) == gate);
+
+	return now;
+}
+
+/* Simple Message Passing
+ *
+ * x is the message data
+ * y is the flag to indicate the data is ready
+ *
+ * Reading x == 0 when y == 1 is a failure.
+ */
+
+static void message_passing_write(void)
+{
+	int i;
+
+	sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+
+		entry->x = 1;
+		entry->y = 1;
+	}
+
+	halt();
+}
+
+static void message_passing_read(void)
+{
+	int i;
+	int errors = 0, ready = 0;
+
+	sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int x, y;
+
+		y = entry->y;
+		x = entry->x;
+
+		if (y && !x)
+			errors++;
+		ready += y;
+	}
+
+	/*
+	 * We expect this to fail but with TSO backends you often get
+	 * away with it. Fudge xfail if we did actually pass.
+	 */
+	report_xfail(errors != 0, errors == 0,
+		     "mp: %d errors, %d ready", errors, ready);
+}
+
+/* Simple Message Passing with barriers */
+static void message_passing_write_barrier(void)
+{
+	int i;
+
+	sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+
+		entry->x = 1;
+		smp_wmb();
+		entry->y = 1;
+	}
+
+	halt();
+}
+
+static void message_passing_read_barrier(void)
+{
+	int i;
+	int errors = 0, ready = 0, not_ready = 0;
+
+	sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int x, y;
+
+		y = entry->y;
+		smp_rmb();
+		x = entry->x;
+
+		if (y && !x)
+			errors++;
+
+		if (y) {
+			ready++;
+		} else {
+			not_ready++;
+
+			if (not_ready > 2) {
+				entry = &array[i+1];
+				do {
+					not_ready = 0;
+				} while (wait_if_ahead && !entry->y);
+			}
+		}
+	}
+
+	report(errors == 0, "mp barrier: %d errors, %d ready", errors, ready);
+}
+
+/* Simple Message Passing with Acquire/Release */
+static void message_passing_write_release(void)
+{
+	int i;
+
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+
+		entry->x = 1;
+		smp_store_release(&entry->y, 1);
+	}
+
+	halt();
+}
+
+static void message_passing_read_acquire(void)
+{
+	int i;
+	int errors = 0, ready = 0, not_ready = 0;
+
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int x, y;
+
+		y = smp_load_acquire(&entry->y);
+		x = entry->x;
+
+		if (y && !x)
+			errors++;
+
+		if (y) {
+			ready++;
+		} else {
+			not_ready++;
+
+			if (not_ready > 2) {
+				entry = &array[i+1];
+				do {
+					not_ready = 0;
+				} while (wait_if_ahead && !entry->y);
+			}
+		}
+	}
+
+	report(errors == 0, "mp acqrel: %d errors, %d ready", errors, ready);
+}
+
+/*
+ * Store after load
+ *
+ * T1: write 1 to x, load r from y
+ * T2: write 1 to y, load r from x
+ *
+ * Without a memory fence r[0] == 0 && r[1] == 0 is possible
+ * With memory fence both == 0 should be impossible
+ */
+
+static void check_store_and_load_results(const char *name, int thread,
+					 bool xfail, unsigned long start,
+					 unsigned long end)
+{
+	int i;
+	int neither = 0;
+	int only_first = 0;
+	int only_second = 0;
+	int both = 0;
+
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+
+		if (entry->r[0] == 0 &&
+		    entry->r[1] == 0)
+			neither++;
+		else if (entry->r[0] &&
+			entry->r[1])
+			both++;
+		else if (entry->r[0])
+			only_first++;
+		else
+			only_second++;
+	}
+
+	printf("T%d: %08lx->%08lx neither=%d only_t1=%d only_t2=%d both=%d\n", thread,
+		start, end, neither, only_first, only_second, both);
+
+	if (thread == 1)
+		report_xfail(xfail, neither == 0, "%s: errors=%d", name, neither);
+
+}
+
+/*
+ * This attempts to synchronise the start of both threads to roughly
+ * the same time. On real hardware there is a little latency as the
+ * secondary vCPUs are powered up however this effect is much more
+ * exaggerated on a TCG host.
+ *
+ * Busy waits until we pass a future point in time, returns final
+ * start time.
+ */
+
+static void store_and_load_1(void)
+{
+	int i;
+	unsigned long start, end;
+
+	start = sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int r;
+
+		entry->x = 1;
+		r = entry->y;
+		entry->r[0] = r;
+	}
+	end = get_cntvct();
+
+	smp_mb();
+
+	while (!cpumask_test_cpu(1, &cpu_mask))
+		cpu_relax();
+
+	check_store_and_load_results("sal", 1, true, start, end);
+}
+
+static void store_and_load_2(void)
+{
+	int i;
+	unsigned long start, end;
+
+	start = sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int r;
+
+		entry->y = 1;
+		r = entry->x;
+		entry->r[1] = r;
+	}
+	end = get_cntvct();
+
+	check_store_and_load_results("sal", 2, true, start, end);
+
+	cpumask_set_cpu(1, &cpu_mask);
+
+	halt();
+}
+
+static void store_and_load_barrier_1(void)
+{
+	int i;
+	unsigned long start, end;
+
+	start = sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int r;
+
+		entry->x = 1;
+		smp_mb();
+		r = entry->y;
+		entry->r[0] = r;
+	}
+	end = get_cntvct();
+
+	smp_mb();
+
+	while (!cpumask_test_cpu(1, &cpu_mask))
+		cpu_relax();
+
+	check_store_and_load_results("sal_barrier", 1, false, start, end);
+}
+
+static void store_and_load_barrier_2(void)
+{
+	int i;
+	unsigned long start, end;
+
+	start = sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int r;
+
+		entry->y = 1;
+		smp_mb();
+		r = entry->x;
+		entry->r[1] = r;
+	}
+	end = get_cntvct();
+
+	check_store_and_load_results("sal_barrier", 2, false, start, end);
+
+	cpumask_set_cpu(1, &cpu_mask);
+
+	halt();
+}
+
+
+/* Test array */
+static test_descr_t tests[] = {
+
+	{ "mp",         false,
+	  message_passing_read,
+	  { message_passing_write }
+	},
+
+	{ "mp_barrier", true,
+	  message_passing_read_barrier,
+	  { message_passing_write_barrier }
+	},
+
+	{ "mp_acqrel", true,
+	  message_passing_read_acquire,
+	  { message_passing_write_release }
+	},
+
+	{ "sal",       false,
+	  store_and_load_1,
+	  { store_and_load_2 }
+	},
+
+	{ "sal_barrier", true,
+	  store_and_load_barrier_1,
+	  { store_and_load_barrier_2 }
+	},
+};
+
+
+static void setup_and_run_litmus(test_descr_t *test)
+{
+	array = calloc(array_size, sizeof(test_array));
+
+	if (array) {
+		int i = 0;
+
+		printf("Allocated test array @ %p\n", array);
+
+		while (test->secondary_fns[i]) {
+			smp_boot_secondary(i+1, test->secondary_fns[i]);
+			i++;
+		}
+
+		test->main_fn();
+	} else
+		report(false, "%s: failed to allocate memory", test->test_name);
+}
+
+int main(int argc, char **argv)
+{
+	int i;
+	unsigned int j;
+	test_descr_t *test = NULL;
+
+	for (i = 0; i < argc; i++) {
+		char *arg = argv[i];
+
+		for (j = 0; j < ARRAY_SIZE(tests); j++) {
+			if (strcmp(arg, tests[j].test_name) == 0)
+				test = &tests[j];
+		}
+
+		/* Test modifiers */
+		if (strstr(arg, "count=") != NULL) {
+			char *p = strstr(arg, "=");
+
+			array_size = atol(p+1);
+		} else if (strcmp(arg, "wait") == 0) {
+			wait_if_ahead = 1;
+		}
+	}
+
+	if (test)
+		setup_and_run_litmus(test);
+	else
+		report(false, "Unknown test");
+
+	return report_summary();
+}
diff --git a/arm/mttcgtests.cfg b/arm/mttcgtests.cfg
index 46fcb57..2b46756 100644
--- a/arm/mttcgtests.cfg
+++ b/arm/mttcgtests.cfg
@@ -57,3 +57,36 @@ file = locking-test.flat
 smp = $(($MAX_SMP>4?4:$MAX_SMP))
 extra_params = -append 'excl'
 groups = locking
+
+# Barrier Litmus tests
+[barrier-litmus::mp]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'mp'
+groups = barrier
+
+[barrier-litmus::mp-barrier]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'mp_barrier'
+groups = barrier
+
+[barrier-litmus::mp-acqrel]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'mp_acqrel'
+groups = barrier
+
+[barrier-litmus::sal]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'sal'
+groups = barrier
+accel = tcg
+
+[barrier-litmus::sal-barrier]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'sal_barrier'
+groups = barrier
+
-- 
2.30.2



* [kvm-unit-tests PATCH v8 09/10] arm/run: use separate --accel form
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
                   ` (7 preceding siblings ...)
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 08/10] arm/barrier-litmus-tests: add simple mp and sal litmus tests Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 10/10] arm/tcg-test: some basic TCG exercising tests Alex Bennée
  9 siblings, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée

This will allow TCG tests to alter things such as tb-size.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 arm/run | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arm/run b/arm/run
index a390ca5..73c6c83 100755
--- a/arm/run
+++ b/arm/run
@@ -58,8 +58,8 @@ if $qemu $M -device '?' 2>&1 | grep pci-testdev > /dev/null; then
 	pci_testdev="-device pci-testdev"
 fi
 
-M+=",accel=$ACCEL"
-command="$qemu -nodefaults $M -cpu $processor $chr_testdev $pci_testdev"
+A="-accel $ACCEL"
+command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev"
 command+=" -display none -serial stdio -kernel"
 command="$(migration_cmd) $(timeout_cmd) $command"
 
-- 
2.30.2



* [kvm-unit-tests PATCH v8 10/10] arm/tcg-test: some basic TCG exercising tests
  2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
                   ` (8 preceding siblings ...)
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 09/10] arm/run: use separate --accel form Alex Bennée
@ 2021-11-18 18:46 ` Alex Bennée
  9 siblings, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-18 18:46 UTC (permalink / raw)
  To: kvm
  Cc: idan.horowitz, qemu-arm, linux-arm-kernel, kvmarm,
	christoffer.dall, maz, Alex Bennée

These tests are not really aimed at KVM at all but exist to stretch
QEMU's TCG code generator. In particular these exercise the ability of
the TCG to:

  * Chain TranslationBlocks together (tight)
  * Handle heavy usage of the tb_jump_cache (paged)
  * Pathological case of computed local jumps (computed)

In addition the tests can be varied by adding IPI IRQs or SMC sequences
into the mix to stress the tcg_exit and invalidation mechanisms.

To explicitly stress the tb_flush() mechanism you can use the mod/rounds
parameters to force more frequent tb invalidation, combined with setting
-tb-size 1 in QEMU to limit the code generation buffer size.
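
To show how a 32-bit seed drives the branch pattern, the following is a
small C model of the "tight" routine from the armv7 assembler in this
patch. It is only a sketch for reasoning about which blocks get visited:
the names block, ror1 and model_tight are hypothetical and the model says
nothing about how TCG actually chains the blocks:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum block { BLOCK_START, BLOCK_A, BLOCK_B, BLOCK_COUNT };

/* Rotate a 32-bit value right by one bit, like "ror r1, r1, #1". */
static uint32_t ror1(uint32_t v)
{
	return (v >> 1) | (v << 31);
}

/*
 * Walk the three blocks for the given number of iterations and count
 * how often each one is entered. Mirrors the asm: decrement, exit on
 * zero, rotate the pattern, then branch on its low bit.
 */
static void model_tight(uint32_t iterations, uint32_t pattern,
			uint32_t visits[BLOCK_COUNT])
{
	enum block cur = BLOCK_START;
	bool bit_clear;

	assert(iterations > 0);
	visits[BLOCK_START] = visits[BLOCK_A] = visits[BLOCK_B] = 0;

	for (;;) {
		visits[cur]++;
		if (--iterations == 0)	/* subs r0, r0, #1; beq tight_end */
			return;

		pattern = ror1(pattern);
		bit_clear = !(pattern & 1);	/* tst r1, #1; beq taken if clear */

		switch (cur) {
		case BLOCK_START:
			cur = bit_clear ? BLOCK_A : BLOCK_START;
			break;
		case BLOCK_A:
			cur = bit_clear ? BLOCK_B : BLOCK_START;
			break;
		default:	/* BLOCK_B */
			cur = bit_clear ? BLOCK_START : BLOCK_A;
			break;
		}
	}
}
```

An all-zero pattern cycles start, A, B in turn, while an all-ones pattern
never leaves the first block, so the seed controls how many distinct
blocks and branch targets the code generator has to deal with.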

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v5
  - added armv8 version of the tcg tests
  - max out at -smp 4 in unittests.cfg
  - add up IRQs sent and delivered for PASS/FAIL
  - take into account error count
  - add "rounds=" parameter
  - tweak smc to tb-size=1
  - printf fmt fix
v7
  - merged in IRQ numerology
  - updated to latest IRQ API
v8
  - fix report usage
  - fix checkpatch errors
---
 arm/Makefile.arm     |   2 +
 arm/Makefile.arm64   |   2 +
 arm/Makefile.common  |   1 +
 arm/tcg-test-asm.S   | 171 ++++++++++++++++++++++
 arm/tcg-test-asm64.S | 170 ++++++++++++++++++++++
 arm/tcg-test.c       | 338 +++++++++++++++++++++++++++++++++++++++++++
 arm/mttcgtests.cfg   |  84 +++++++++++
 7 files changed, 768 insertions(+)
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/tcg-test.c

diff --git a/arm/Makefile.arm b/arm/Makefile.arm
index 3a4cc6b..05e47f1 100644
--- a/arm/Makefile.arm
+++ b/arm/Makefile.arm
@@ -31,4 +31,6 @@ tests =
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o $(TEST_DIR)/tcg-test-asm.o
+
 arch_clean: arm_clean
diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index e8a38d7..ac94f8e 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -34,5 +34,7 @@ tests += $(TEST_DIR)/cache.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o $(TEST_DIR)/tcg-test-asm64.o
+
 arch_clean: arm_clean
 	$(RM) lib/arm64/.*.d
diff --git a/arm/Makefile.common b/arm/Makefile.common
index 861e5c7..abb6948 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -14,6 +14,7 @@ tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/locking-test.flat
 tests-common += $(TEST_DIR)/barrier-litmus-test.flat
+tests-common += $(TEST_DIR)/tcg-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/tcg-test-asm.S b/arm/tcg-test-asm.S
new file mode 100644
index 0000000..4ec4978
--- /dev/null
+++ b/arm/tcg-test-asm.S
@@ -0,0 +1,171 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * TCG Test assembler functions for armv7 tests.
+ *
+ * Copyright (C) 2016, Linaro Ltd, Alex Bennée <alex.bennee@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ *
+ * These helper functions are written in pure asm to control the size
+ * of the basic blocks and ensure they fit neatly into page
+ * aligned chunks. The pattern of branches they follow is determined by
+ * the 32 bit seed they are passed. It should be the same for each set.
+ *
+ * Calling convention
+ *  - r0, iterations
+ *  - r1, jump pattern
+ *  - r2-r3, scratch
+ *
+ * Returns r0
+ */
+
+.arm
+
+.section .text
+
+/* Tight - all blocks should quickly be patched and should run
+ * very fast unless irqs or smc gets in the way
+ */
+
+.global tight_start
+tight_start:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tightA
+        b       tight_start
+
+tightA:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tightB
+        b       tight_start
+
+tightB:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tight_start
+        b       tightA
+
+.global tight_end
+tight_end:
+        mov     pc, lr
+
+/*
+ * Computed jumps cannot be hardwired into the basic blocks so each one
+ * will cause an exit for the main execution loop to look up the next block.
+ *
+ * There is some caching which should ameliorate the cost a little.
+ */
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+        .global computed_start
+computed_start:
+        subs    r0, r0, #1
+        beq     computed_end
+
+        /* Jump table */
+        ror     r1, r1, #1
+        and     r2, r1, #1
+        adr     r3, computed_jump_table
+        ldr     r2, [r3, r2, lsl #2]
+        mov     pc, r2
+
+        b       computed_err
+
+computed_jump_table:
+        .word   computed_start
+        .word   computedA
+
+computedA:
+        subs    r0, r0, #1
+        beq     computed_end
+
+        /* Jump into code */
+        ror     r1, r1, #1
+        and     r2, r1, #1
+        adr     r3, 1f
+        add     r3, r3, r2, lsl #2
+        mov     pc, r3
+1:      b       computed_start
+        b       computedB
+
+        b       computed_err
+
+
+computedB:
+        subs    r0, r0, #1
+        beq     computed_end
+        ror     r1, r1, #1
+
+        /* Conditional register load */
+        adr     r3, computedA
+        tst     r1, #1
+        adreq   r3, computed_start
+        mov     pc, r3
+
+        b       computed_err
+
+computed_err:
+        mov     r0, #1
+        .global computed_end
+computed_end:
+        mov     pc, lr
+
+
+/*
+ * Page hopping
+ *
+ * Each block is in a different page, hence the blocks never get joined
+ */
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+        .global paged_start
+paged_start:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     pagedA
+        b       paged_start
+
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+pagedA:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     pagedB
+        b       paged_start
+
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+pagedB:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     paged_start
+        b       pagedA
+
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+.global paged_end
+paged_end:
+        mov     pc, lr
+
+.global test_code_end
+test_code_end:
diff --git a/arm/tcg-test-asm64.S b/arm/tcg-test-asm64.S
new file mode 100644
index 0000000..2781eeb
--- /dev/null
+++ b/arm/tcg-test-asm64.S
@@ -0,0 +1,170 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * TCG Test assembler functions for armv8 tests.
+ *
+ * Copyright (C) 2016, Linaro Ltd, Alex Bennée <alex.bennee@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ *
+ * These helper functions are written in pure asm to control the size
+ * of the basic blocks and ensure they fit neatly into page
+ * aligned chunks. The pattern of branches they follow is determined by
+ * the 32 bit seed they are passed. It should be the same for each set.
+ *
+ * Calling convention
+ *  - x0, iterations
+ *  - x1, jump pattern
+ *  - x2-x3, scratch
+ *
+ * Returns x0
+ */
+
+.section .text
+
+/* Tight - all blocks should quickly be patched and should run
+ * very fast unless irqs or smc gets in the way
+ */
+
+.global tight_start
+tight_start:
+        subs    x0, x0, #1
+        beq     tight_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     tightA
+        b       tight_start
+
+tightA:
+        subs    x0, x0, #1
+        beq     tight_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     tightB
+        b       tight_start
+
+tightB:
+        subs    x0, x0, #1
+        beq     tight_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     tight_start
+        b       tightA
+
+.global tight_end
+tight_end:
+        ret
+
+/*
+ * Computed jumps cannot be hardwired into the basic blocks so each one
+ * will cause an exit for the main execution loop to look up the next block.
+ *
+ * There is some caching which should ameliorate the cost a little.
+ */
+
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+        .global computed_start
+computed_start:
+        subs    x0, x0, #1
+        beq     computed_end
+
+        /* Jump table */
+        ror     x1, x1, #1
+        and     x2, x1, #1
+        adr     x3, computed_jump_table
+        ldr     x2, [x3, x2, lsl #3]
+        br      x2
+
+        b       computed_err
+
+computed_jump_table:
+        .quad   computed_start
+        .quad   computedA
+
+computedA:
+        subs    x0, x0, #1
+        beq     computed_end
+
+        /* Jump into code */
+        ror     x1, x1, #1
+        and     x2, x1, #1
+        adr     x3, 1f
+        add	x3, x3, x2, lsl #2
+        br      x3
+1:      b       computed_start
+        b       computedB
+
+        b       computed_err
+
+
+computedB:
+        subs    x0, x0, #1
+        beq     computed_end
+        ror     x1, x1, #1
+
+        /* Conditional register load */
+        adr     x2, computedA
+        adr     x3, computed_start
+        tst     x1, #1
+        csel    x2, x3, x2, eq
+        br      x2
+
+        b       computed_err
+
+computed_err:
+        mov     x0, #1
+        .global computed_end
+computed_end:
+        ret
+
+
+/*
+ * Page hopping
+ *
+ * Each block is in a different page, hence the blocks never get joined
+ */
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+        .global paged_start
+paged_start:
+        subs    x0, x0, #1
+        beq     paged_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     pagedA
+        b       paged_start
+
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+pagedA:
+        subs    x0, x0, #1
+        beq     paged_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     pagedB
+        b       paged_start
+
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+pagedB:
+        subs    x0, x0, #1
+        beq     paged_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     paged_start
+        b       pagedA
+
+        /* .align 13 == 1 << 13 == 8192 byte alignment (a multiple of 4096) */
+        .align 13
+.global paged_end
+paged_end:
+        ret
+
+.global test_code_end
+test_code_end:
diff --git a/arm/tcg-test.c b/arm/tcg-test.c
new file mode 100644
index 0000000..fddab7b
--- /dev/null
+++ b/arm/tcg-test.c
@@ -0,0 +1,338 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ARM TCG Tests
+ *
+ * These tests are explicitly aimed at stretching the QEMU TCG engine.
+ */
+
+#include <libcflat.h>
+#include <asm/processor.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+#include <asm/gic.h>
+
+#include <prng.h>
+
+#define MAX_CPUS 8
+
+/* These entry points are in the assembly code */
+extern int tight_start(uint32_t count, uint32_t pattern);
+extern int computed_start(uint32_t count, uint32_t pattern);
+extern int paged_start(uint32_t count, uint32_t pattern);
+extern uint32_t tight_end;
+extern uint32_t computed_end;
+extern uint32_t paged_end;
+extern unsigned long test_code_end;
+
+typedef int (*test_fn)(uint32_t count, uint32_t pattern);
+
+typedef struct {
+	const char *test_name;
+	bool       should_pass;
+	test_fn    start_fn;
+	uint32_t   *code_end;
+} test_descr_t;
+
+/* Test array */
+static test_descr_t tests[] = {
+	/*
+	 * Tight chain.
+	 *
+	 * These are a bunch of basic blocks that have fixed branches in
+	 * a page aligned space. The branches taken are decided by a
+	 * pseudo-random bitmap for each CPU.
+	 *
+	 * Once the basic blocks have been chained together by the TCG they
+	 * should run until they reach their block count. This will be the
+	 * most efficient mode in which generated code is run. The only other
+	 * exits will be caused by interrupts or TB invalidation.
+	 */
+	{ "tight", true, tight_start, &tight_end },
+	/*
+	 * Computed jumps.
+	 *
+	 * A bunch of basic blocks which only do computed jumps, so the
+	 * blocks are never chained, but they are all within a page (maybe
+	 * not required). This exercises the TB cache lookup but not the
+	 * generation of new code.
+	 */
+	{ "computed", true, computed_start, &computed_end },
+	/*
+	 * Page ping pong.
+	 *
+	 * Here the blocks are separated by PAGE_SIZE so they can never
+	 * be chained together.
+	 *
+	 */
+	{ "paged", true, paged_start, &paged_end}
+};
+
+static test_descr_t *test;
+
+static int iterations = 1000000;
+static int rounds = 1000;
+static int mod_freq = 5;
+static uint32_t pattern[MAX_CPUS];
+
+/* control flags */
+static int smc;
+static int irq;
+static int check_irq;
+
+/* IRQ accounting */
+#define MAX_IRQ_IDS 16
+static int irqv;
+static unsigned long irq_sent_ts[MAX_CPUS][MAX_CPUS][MAX_IRQ_IDS];
+
+static int irq_recv[MAX_CPUS];
+static int irq_sent[MAX_CPUS];
+static int irq_overlap[MAX_CPUS];  /* if ts > now, i.e. a race */
+static int irq_slow[MAX_CPUS];  /* if delay > threshold */
+static unsigned long irq_latency[MAX_CPUS]; /* cumulative time */
+
+static int errors[MAX_CPUS];
+
+static cpumask_t smp_test_complete;
+
+static cpumask_t ready;
+
+static void wait_on_ready(void)
+{
+	cpumask_set_cpu(smp_processor_id(), &ready);
+	while (!cpumask_full(&ready))
+		cpu_relax();
+}
+
+/* This triggers TCG's SMC detection by writing values to the executing
+ * code pages. We are not actually modifying the instructions and the
+ * underlying code will remain unchanged. However this should trigger
+ * invalidation of the Translation Blocks.
+ */
+
+static void trigger_smc_detection(uint32_t *start, uint32_t *end)
+{
+	volatile uint32_t *ptr = start;
+
+	while (ptr < end) {
+		uint32_t inst = *ptr;
+		*ptr++ = inst;
+	}
+}
+
+/* Handler for receiving IRQs */
+
+static void irq_handler(struct pt_regs *regs __unused)
+{
+	unsigned long then, now = get_cntvct();
+	int cpu = smp_processor_id();
+	u32 irqstat = gic_read_iar();
+	u32 irqnr = gic_iar_irqnr(irqstat);
+
+	if (irqnr != GICC_INT_SPURIOUS) {
+		unsigned int src_cpu = (irqstat >> 10) & 0x7;
+
+		gic_write_eoir(irqstat);
+		irq_recv[cpu]++;
+
+		then = irq_sent_ts[src_cpu][cpu][irqnr];
+
+		if (then > now) {
+			irq_overlap[cpu]++;
+		} else {
+			unsigned long latency = (now - then);
+
+			if (latency > 30000)
+				irq_slow[cpu]++;
+			else
+				irq_latency[cpu] += latency;
+		}
+	}
+}
+
+/* This triggers cross-CPU IRQs. Each IRQ should cause the basic block
+ * execution to finish and the main run-loop to be entered again.
+ */
+static int send_cross_cpu_irqs(int this_cpu, int irq)
+{
+	int cpu, sent = 0;
+	cpumask_t mask;
+
+	/* Target every present CPU except the sender */
+	cpumask_copy(&mask, &cpu_present_mask);
+	cpumask_clear_cpu(this_cpu, &mask);
+	for_each_present_cpu(cpu) {
+		if (cpu != this_cpu) {
+			irq_sent_ts[this_cpu][cpu][irq] = get_cntvct();
+			sent++;
+		}
+	}
+
+	gic_ipi_send_mask(irq, &mask);
+
+	return sent;
+}
+
+static void do_test(void)
+{
+	int cpu = smp_processor_id();
+	int i, irq_id = 0;
+
+	printf("CPU%d: online and setting up with pattern 0x%"PRIx32"\n",
+	       cpu, pattern[cpu]);
+
+	if (irq) {
+		gic_enable_defaults();
+#ifdef __arm__
+		install_exception_handler(EXCPTN_IRQ, irq_handler);
+#else
+		install_irq_handler(EL1H_IRQ, irq_handler);
+#endif
+		local_irq_enable();
+
+		wait_on_ready();
+	}
+
+	for (i = 0; i < rounds; i++) {
+		/* Enter the blocks */
+		errors[cpu] += test->start_fn(iterations, pattern[cpu]);
+
+		if ((i + cpu) % mod_freq == 0) {
+			if (smc)
+				trigger_smc_detection((uint32_t *) test->start_fn,
+						      test->code_end);
+
+			if (irq) {
+				irq_sent[cpu] += send_cross_cpu_irqs(cpu, irq_id);
+				irq_id++;
+				irq_id = irq_id % 15;
+			}
+		}
+	}
+
+	/* ensure everything complete before we finish */
+	smp_wmb();
+
+	cpumask_set_cpu(cpu, &smp_test_complete);
+	if (cpu != 0)
+		halt();
+}
+
+static void report_irq_stats(int cpu)
+{
+	int recv = irq_recv[cpu];
+	int race = irq_overlap[cpu];
+	int slow = irq_slow[cpu];
+	int timed = recv - (race + slow);  /* IRQs with a usable latency */
+	unsigned long avg_latency = timed > 0 ? irq_latency[cpu] / timed : 0;
+
+	printf("CPU%d: %d irqs (%d races, %d slow, %lu ticks avg latency)\n",
+		cpu, recv, race, slow, avg_latency);
+}
+
+
+static void setup_and_run_tcg_test(void)
+{
+	static const unsigned char seed[] = "tcg-test";
+	struct isaac_ctx prng_context;
+	int cpu;
+	int total_err = 0, total_sent = 0, total_recv = 0;
+
+	isaac_init(&prng_context, &seed[0], sizeof(seed));
+
+	/* boot other CPUs */
+	for_each_present_cpu(cpu) {
+		pattern[cpu] = isaac_next_uint32(&prng_context);
+
+		if (cpu == 0)
+			continue;
+
+		smp_boot_secondary(cpu, do_test);
+	}
+
+	do_test();
+
+	while (!cpumask_full(&smp_test_complete))
+		cpu_relax();
+
+	/* Ensure everything completes before we check the data */
+	smp_mb();
+
+	/* Now total up errors and irqs */
+	for_each_present_cpu(cpu) {
+		total_err += errors[cpu];
+		total_sent += irq_sent[cpu];
+		total_recv += irq_recv[cpu];
+
+		if (check_irq)
+			report_irq_stats(cpu);
+	}
+
+	if (check_irq)
+		report(total_sent == total_recv && total_err == 0,
+		       "%d IRQs sent, %d received, %d errors",
+		       total_sent, total_recv, total_err);
+	else
+		report(total_err == 0, "%d errors, IRQs not checked", total_err);
+}
+
+int main(int argc, char **argv)
+{
+	int i;
+	unsigned int j;
+
+	for (i = 0; i < argc; i++) {
+		char *arg = argv[i];
+
+		for (j = 0; j < ARRAY_SIZE(tests); j++) {
+			if (strcmp(arg, tests[j].test_name) == 0)
+				test = &tests[j];
+		}
+
+		/* Test modifiers */
+		if (strstr(arg, "mod=") != NULL) {
+			char *p = strstr(arg, "=");
+
+			mod_freq = atol(p+1);
+		}
+
+		if (strstr(arg, "rounds=") != NULL) {
+			char *p = strstr(arg, "=");
+
+			rounds = atol(p+1);
+		}
+
+		if (strcmp(arg, "smc") == 0) {
+			unsigned long test_start = (unsigned long) &tight_start;
+			unsigned long test_end = (unsigned long) &test_code_end;
+
+			smc = 1;
+			mmu_set_range_ptes(mmu_idmap, test_start, test_start, test_end,
+					__pgprot(PTE_WBWA));
+
+			report_prefix_push("smc");
+		}
+
+		if (strcmp(arg, "irq") == 0) {
+			irq = 1;
+			if (!gic_init())
+				report_abort("No supported gic present!");
+			irqv = gic_version();
+			report_prefix_push("irq");
+		}
+
+		if (strcmp(arg, "check_irq") == 0)
+			check_irq = 1;
+	}
+
+	if (test) {
+		/* ensure args visible to all cores */
+		smp_mb();
+		setup_and_run_tcg_test();
+	} else {
+		report(false, "Unknown test");
+	}
+
+	return report_summary();
+}
diff --git a/arm/mttcgtests.cfg b/arm/mttcgtests.cfg
index 2b46756..046a59c 100644
--- a/arm/mttcgtests.cfg
+++ b/arm/mttcgtests.cfg
@@ -90,3 +90,87 @@ smp = 2
 extra_params = -append 'sal_barrier'
 groups = barrier
 
+# TCG Tests
+[tcg::tight]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'tight'
+groups = tcg
+accel = tcg
+
+[tcg::tight-smc]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'tight smc' -accel tcg,tb-size=1
+groups = tcg
+accel = tcg
+
+[tcg::tight-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'tight irq'
+groups = tcg
+accel = tcg
+
+[tcg::tight-smc-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'tight smc irq'
+groups = tcg
+accel = tcg
+
+[tcg::computed]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'computed'
+groups = tcg
+accel = tcg
+
+[tcg::computed-smc]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'computed smc'
+groups = tcg
+accel = tcg
+
+[tcg::computed-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'computed irq'
+groups = tcg
+accel = tcg
+
+[tcg::computed-smc-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'computed smc irq'
+groups = tcg
+accel = tcg
+
+[tcg::paged]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'paged'
+groups = tcg
+accel = tcg
+
+[tcg::paged-smc]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'paged smc'
+groups = tcg
+accel = tcg
+
+[tcg::paged-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'paged irq'
+groups = tcg
+accel = tcg
+
+[tcg::paged-smc-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'paged smc irq'
+groups = tcg
+accel = tcg
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README Alex Bennée
@ 2021-11-24 11:06   ` Andrew Jones
  2021-11-24 11:08     ` Andrew Jones
  2021-11-24 11:38     ` Alex Bennée
  0 siblings, 2 replies; 19+ messages in thread
From: Andrew Jones @ 2021-11-24 11:06 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, maz, qemu-arm, idan.horowitz, kvmarm, linux-arm-kernel

On Thu, Nov 18, 2021 at 06:46:41PM +0000, Alex Bennée wrote:
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  README.md | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/README.md b/README.md
> index b498aaf..5db48e5 100644
> --- a/README.md
> +++ b/README.md
> @@ -182,3 +182,5 @@ the code files.  We also start with common code and finish with unit test
>  code. git-diff's orderFile feature allows us to specify the order in a
>  file.  The orderFile we use is `scripts/git.difforder`; adding the config
>  with `git config diff.orderFile scripts/git.difforder` enables it.
> +
> +Please run the kernel's ./scripts/checkpatch.pl on new patches

This is a bit of a problem for kvm-unit-tests code which still has a mix
of styles since it was originally written with a strange tab and space
mixed style. If somebody is patching one of those files we've usually
tried to maintain the original style rather than reformat the whole
thing (in hindsight maybe we should have just reformatted). We're also
more flexible with line length than Linux, although Linux now only warns
for anything over 80 as long as it's under 100, which is probably good
enough for us too. Anyway, let's see what Paolo and Thomas say. Personally
I wouldn't mind adding this line to the documentation, so I'll ack it.
Anyway, we can also ignore our own advice when it suits us :-)

Acked-by: Andrew Jones <drjones@redhat.com>

Thanks,
drew


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README
  2021-11-24 11:06   ` Andrew Jones
@ 2021-11-24 11:08     ` Andrew Jones
  2021-11-24 11:38     ` Alex Bennée
  1 sibling, 0 replies; 19+ messages in thread
From: Andrew Jones @ 2021-11-24 11:08 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, maz, qemu-arm, idan.horowitz, kvmarm, linux-arm-kernel,
	pbonzini, thuth

On Wed, Nov 24, 2021 at 12:07:02PM +0100, Andrew Jones wrote:
> On Thu, Nov 18, 2021 at 06:46:41PM +0000, Alex Bennée wrote:
> > Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> > ---
> >  README.md | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/README.md b/README.md
> > index b498aaf..5db48e5 100644
> > --- a/README.md
> > +++ b/README.md
> > @@ -182,3 +182,5 @@ the code files.  We also start with common code and finish with unit test
> >  code. git-diff's orderFile feature allows us to specify the order in a
> >  file.  The orderFile we use is `scripts/git.difforder`; adding the config
> >  with `git config diff.orderFile scripts/git.difforder` enables it.
> > +
> > +Please run the kernel's ./scripts/checkpatch.pl on new patches
> 
> This is a bit of a problem for kvm-unit-tests code which still has a mix
> of styles since it was originally written with a strange tab and space
> mixed style. If somebody is patching one of those files we've usually
> tried to maintain the original style rather than reformat the whole
> thing (in hindsight maybe we should have just reformatted). We're also
> more flexible with line length than Linux, although Linux now only warns
> for anything over 80 as long as it's under 100, which is probably good
> enough for us too. Anyway, let's see what Paolo and Thomas say. Personally
> I wouldn't mind adding this line to the documentation, so I'll ack it.
> Anyway, we can also ignore our own advice when it suits us :-)
> 
> Acked-by: Andrew Jones <drjones@redhat.com>
>

Forgot to CC Thomas and Paolo, am now.

Thanks,
drew


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README
  2021-11-24 11:06   ` Andrew Jones
  2021-11-24 11:08     ` Andrew Jones
@ 2021-11-24 11:38     ` Alex Bennée
  1 sibling, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-11-24 11:38 UTC (permalink / raw)
  To: Andrew Jones; +Cc: kvm, maz, qemu-arm, idan.horowitz, kvmarm, linux-arm-kernel


Andrew Jones <drjones@redhat.com> writes:

> On Thu, Nov 18, 2021 at 06:46:41PM +0000, Alex Bennée wrote:
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>>  README.md | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/README.md b/README.md
>> index b498aaf..5db48e5 100644
>> --- a/README.md
>> +++ b/README.md
>> @@ -182,3 +182,5 @@ the code files.  We also start with common code and finish with unit test
>>  code. git-diff's orderFile feature allows us to specify the order in a
>>  file.  The orderFile we use is `scripts/git.difforder`; adding the config
>>  with `git config diff.orderFile scripts/git.difforder` enables it.
>> +
>> +Please run the kernel's ./scripts/checkpatch.pl on new patches
>
> This is a bit of a problem for kvm-unit-tests code which still has a mix
> of styles since it was originally written with a strange tab and space
> mixed style. If somebody is patching one of those files we've usually
> tried to maintain the original style rather than reformat the whole
> thing (in hindsight maybe we should have just reformatted). We're also
> more flexible with line length than Linux, although Linux now only warns
> for anything over 80 as long as it's under 100, which is probably good
> enough for us too. Anyway, let's see what Paolo and Thomas say. Personally
> I wouldn't mind adding this line to the documentation, so I'll ack it.
> Anyway, we can also ignore our own advice when it suits us :-)
>
> Acked-by: Andrew Jones <drjones@redhat.com>

I can make the wording more weaselly:

 We strive to follow the Linux kernel's coding style so it's recommended
 to run the kernel's ./scripts/checkpatch.pl on new patches.

I added this reference because in older iterations of these tests
divergence from the kernel coding style was pointed out, and I've fixed
it in this iteration.

>
> Thanks,
> drew


-- 
Alex Bennée

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kvm-unit-tests PATCH v8 08/10] arm/barrier-litmus-tests: add simple mp and sal litmus tests
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 08/10] arm/barrier-litmus-tests: add simple mp and sal litmus tests Alex Bennée
@ 2021-11-24 16:14   ` Andrew Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Andrew Jones @ 2021-11-24 16:14 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, Will Deacon, maz, qemu-arm, idan.horowitz, kvmarm, linux-arm-kernel

On Thu, Nov 18, 2021 at 06:46:48PM +0000, Alex Bennée wrote:
> This adds a framework for adding simple barrier litmus tests against
> ARM. The litmus tests aren't as comprehensive as the academic exercises
> which will attempt to do all sorts of things to keep racing CPUs synced
> up. These tests do honour the "sync" parameter to do a poor man's
> equivalent.
> 
> The two litmus tests are:
>   - message passing
>   - store-after-load
> 
> They both have cases that should fail (although they won't on
> single-threaded TCG setups). If barriers aren't working properly the
> store-after-load test will fail even on an x86 backend as x86 allows
> re-ordering of non-aliased stores.
> 
> I've imported a few more of the barrier primitives from the Linux source
> tree so we consistently use macros.
> 
> The arm64 barrier primitives trip up on -Wstrict-aliasing so this is
> disabled in the Makefile.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> CC: Will Deacon <will@kernel.org>
> 
> ---
> v8
>   - move to mttcgtests.cfg
>   - fix checkpatch issues
>   - fix report usage
> v7
>   - merge in store-after-load
>   - clean-up sync-up code
>   - use new counter api
>   - fix xfail for sal test
> v6
>   - add a unittest.cfg
>   - -fno-strict-aliasing
> ---
>  arm/Makefile.common       |   1 +
>  lib/arm/asm/barrier.h     |  61 ++++++
>  lib/arm64/asm/barrier.h   |  50 +++++
>  arm/barrier-litmus-test.c | 450 ++++++++++++++++++++++++++++++++++++++
>  arm/mttcgtests.cfg        |  33 +++
>  5 files changed, 595 insertions(+)
>  create mode 100644 arm/barrier-litmus-test.c
> 
> diff --git a/arm/Makefile.common b/arm/Makefile.common
> index f905971..861e5c7 100644
> --- a/arm/Makefile.common
> +++ b/arm/Makefile.common
> @@ -13,6 +13,7 @@ tests-common += $(TEST_DIR)/sieve.flat
>  tests-common += $(TEST_DIR)/pl031.flat
>  tests-common += $(TEST_DIR)/tlbflush-code.flat
>  tests-common += $(TEST_DIR)/locking-test.flat
> +tests-common += $(TEST_DIR)/barrier-litmus-test.flat
>  
>  tests-all = $(tests-common) $(tests)
>  all: directories $(tests-all)
> diff --git a/lib/arm/asm/barrier.h b/lib/arm/asm/barrier.h
> index 7f86831..2870080 100644
> --- a/lib/arm/asm/barrier.h
> +++ b/lib/arm/asm/barrier.h
> @@ -8,6 +8,8 @@
>   * This work is licensed under the terms of the GNU GPL, version 2.
>   */
>  
> +#include <stdint.h>
> +
>  #define sev()		asm volatile("sev" : : : "memory")
>  #define wfe()		asm volatile("wfe" : : : "memory")
>  #define wfi()		asm volatile("wfi" : : : "memory")
> @@ -25,4 +27,63 @@
>  #define smp_rmb()	smp_mb()
>  #define smp_wmb()	dmb(ishst)
>  
> +extern void abort(void);
> +
> +static inline void __write_once_size(volatile void *p, void *res, int size)
> +{
> +	switch (size) {
> +	case 1: *(volatile uint8_t *)p = *(uint8_t *)res; break;
> +	case 2: *(volatile uint16_t *)p = *(uint16_t *)res; break;
> +	case 4: *(volatile uint32_t *)p = *(uint32_t *)res; break;
> +	case 8: *(volatile uint64_t *)p = *(uint64_t *)res; break;
> +	default:
> +		/* unhandled case */
> +		abort();
> +	}
> +}
> +
> +#define WRITE_ONCE(x, val) \
> +({							\
> +	union { typeof(x) __val; char __c[1]; } __u =	\
> +		{ .__val = (typeof(x)) (val) }; \
> +	__write_once_size(&(x), __u.__c, sizeof(x));	\
> +	__u.__val;					\
> +})
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	WRITE_ONCE(*p, v);						\
> +} while (0)
> +
> +
> +static inline
> +void __read_once_size(const volatile void *p, void *res, int size)
> +{
> +	switch (size) {
> +	case 1: *(uint8_t *)res = *(volatile uint8_t *)p; break;
> +	case 2: *(uint16_t *)res = *(volatile uint16_t *)p; break;
> +	case 4: *(uint32_t *)res = *(volatile uint32_t *)p; break;
> +	case 8: *(uint64_t *)res = *(volatile uint64_t *)p; break;
> +	default:
> +		/* unhandled case */
> +		abort();
> +	}
> +}
> +
> +#define READ_ONCE(x)							\
> +({									\
> +	union { typeof(x) __val; char __c[1]; } __u;			\
> +	__read_once_size(&(x), __u.__c, sizeof(x));			\
> +	__u.__val;							\
> +})


WRITE_ONCE and READ_ONCE are already defined in lib/linux/compiler.h

Thanks,
drew


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set
  2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set Alex Bennée
@ 2021-11-24 16:48   ` Andrew Jones
  2021-12-01 16:20     ` Alex Bennée
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Jones @ 2021-11-24 16:48 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, maz, qemu-arm, idan.horowitz, kvmarm, linux-arm-kernel

On Thu, Nov 18, 2021 at 06:46:44PM +0000, Alex Bennée wrote:
> The upcoming MTTCG tests don't need to be run for normal KVM unit
> tests so let's add the facility to have a custom set of tests.

I think an environment variable override would be better than this command
line override, because then we could also get mkstandalone to work with
the new unittests.cfg files. Or, it may be better to just add them to
the main unittests.cfg with lines like these

groups = nodefault mttcg
accel = tcg

That'll "dirty" the logs with SKIP ... (test marked as manual run only)
for each one, but at least we won't easily forget about running them from
time to time.
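
For illustration only, the environment-variable override could look
roughly like the sketch below. The variable name UNITTESTS_CONFIG and the
helper pick_config are hypothetical; neither exists in run_tests.sh or
mkstandalone, and the real scripts may want a different name or place to
do the lookup.

```shell
# Sketch only: UNITTESTS_CONFIG and pick_config are hypothetical names,
# not part of kvm-unit-tests.
TEST_DIR=${TEST_DIR:-arm}

pick_config() {
    # Use the caller's environment value when it is set and non-empty,
    # otherwise fall back to the default unittests.cfg for this arch.
    printf '%s\n' "${UNITTESTS_CONFIG:-$TEST_DIR/unittests.cfg}"
}

config=$(pick_config)
echo "using config: $config"
```

Because the value comes from the environment rather than from option
parsing, any script that sources the same helper would see the same
choice, which is the property the mkstandalone case needs.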

Thanks,
drew


> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  run_tests.sh | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/run_tests.sh b/run_tests.sh
> index 9f233c5..b1088d2 100755
> --- a/run_tests.sh
> +++ b/run_tests.sh
> @@ -15,7 +15,7 @@ function usage()
>  {
>  cat <<EOF
>  
> -Usage: $0 [-h] [-v] [-a] [-g group] [-j NUM-TASKS] [-t]
> +Usage: $0 [-h] [-v] [-a] [-g group] [-j NUM-TASKS] [-t] [-c CONFIG]
>  
>      -h, --help      Output this help text
>      -v, --verbose   Enables verbose mode
> @@ -24,6 +24,7 @@ Usage: $0 [-h] [-v] [-a] [-g group] [-j NUM-TASKS] [-t]
>      -g, --group     Only execute tests in the given group
>      -j, --parallel  Execute tests in parallel
>      -t, --tap13     Output test results in TAP format
> +    -c, --config    Override default unittests.cfg
>  
>  Set the environment variable QEMU=/path/to/qemu-system-ARCH to
>  specify the appropriate qemu binary for ARCH-run.
> @@ -42,7 +43,7 @@ if [ $? -ne 4 ]; then
>  fi
>  
>  only_tests=""
> -args=$(getopt -u -o ag:htj:v -l all,group:,help,tap13,parallel:,verbose -- $*)
> +args=$(getopt -u -o ag:htj:vc: -l all,group:,help,tap13,parallel:,verbose,config: -- $*)
>  [ $? -ne 0 ] && exit 2;
>  set -- $args;
>  while [ $# -gt 0 ]; do
> @@ -73,6 +74,10 @@ while [ $# -gt 0 ]; do
>          -t | --tap13)
>              tap_output="yes"
>              ;;
> +        -c | --config)
> +            shift
> +            config=$1
> +            ;;
>          --)
>              ;;
>          *)
> @@ -152,7 +157,7 @@ function run_task()
>  
>  : ${unittest_log_dir:=logs}
>  : ${unittest_run_queues:=1}
> -config=$TEST_DIR/unittests.cfg
> +: ${config:=$TEST_DIR/unittests.cfg}
>  
>  rm -rf $unittest_log_dir.old
>  [ -d $unittest_log_dir ] && mv $unittest_log_dir $unittest_log_dir.old
> -- 
> 2.30.2
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set
  2021-11-24 16:48   ` Andrew Jones
@ 2021-12-01 16:20     ` Alex Bennée
  2021-12-01 16:41       ` Andrew Jones
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Bennée @ 2021-12-01 16:20 UTC (permalink / raw)
  To: Andrew Jones; +Cc: kvm, maz, qemu-arm, idan.horowitz, kvmarm, linux-arm-kernel


Andrew Jones <drjones@redhat.com> writes:

> On Thu, Nov 18, 2021 at 06:46:44PM +0000, Alex Bennée wrote:
>> The upcoming MTTCG tests don't need to be run for normal KVM unit
>> tests so let's add the facility to have a custom set of tests.
>
> I think an environment variable override would be better than this command
> line override, because then we could also get mkstandalone to work with
> the new unittests.cfg files. Or, it may be better to just add them to
> the main unittests.cfg with lines like these
>
> groups = nodefault mttcg
> accel = tcg
>
> That'll "dirty" the logs with SKIP ... (test marked as manual run only)
> for each one, but at least we won't easily forget about running them from
> time to time.

So what is the meaning of accel here? Is it:

  - this test only runs on accel FOO

or

  - this test defaults to running on accel FOO

because while the tests are for TCG I want to run them on KVM (so I can
validate the test on real HW). If I have accel=tcg then:

  env ACCEL=kvm QEMU=$HOME/lsrc/qemu.git/builds/all/qemu-system-aarch64 ./run_tests.sh -g mttcg
  SKIP tlbflush-code::all_other (tcg only, but ACCEL=kvm)
  SKIP tlbflush-code::page_other (tcg only, but ACCEL=kvm)
  SKIP tlbflush-code::all_self (tcg only, but ACCEL=kvm)
  ...

so I can either drop the accel line and rely on nodefault to ensure it
doesn't run normally or make the env ACCEL processing less anal about
preventing me running TCG tests under KVM. What do you think?
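
To make the two readings concrete, here is a rough sketch of how each
policy would resolve a cfg accel against an ACCEL from the environment.
The function resolve_accel and its "strict"/"default" policy argument are
made up for this mail; this is not how the existing scripts are
structured, only a model of the behaviour being discussed.

```shell
# Sketch only: resolve_accel and its policy names are hypothetical.
# $1 = accel from the test's cfg entry (may be empty)
# $2 = ACCEL from the environment (may be empty)
# $3 = "strict"  (cfg accel is a requirement) or
#      "default" (cfg accel is only a default the environment may override)
resolve_accel() {
    cfg_accel=$1
    env_accel=$2
    policy=$3

    if [ -z "$env_accel" ] || [ -z "$cfg_accel" ] || [ "$env_accel" = "$cfg_accel" ]; then
        # No conflict: print whichever value is set (empty if neither is)
        echo "${env_accel:-$cfg_accel}"
    elif [ "$policy" = "strict" ]; then
        # Conflict and the cfg value is a hard requirement: skip the test
        echo "SKIP ($cfg_accel only, but ACCEL=$env_accel)"
    else
        # Conflict but the cfg value is only a default: environment wins
        echo "$env_accel"
    fi
}

resolve_accel tcg kvm strict
resolve_accel tcg kvm default
```

The "strict" reading reproduces the SKIP lines above; the "default"
reading would let ACCEL=kvm run the same tests on real hardware.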

-- 
Alex Bennée

* Re: [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set
  2021-12-01 16:20     ` Alex Bennée
@ 2021-12-01 16:41       ` Andrew Jones
  2021-12-01 17:07         ` Alex Bennée
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Jones @ 2021-12-01 16:41 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, maz, qemu-arm, idan.horowitz, kvmarm, linux-arm-kernel

On Wed, Dec 01, 2021 at 04:20:02PM +0000, Alex Bennée wrote:
> 
> Andrew Jones <drjones@redhat.com> writes:
> 
> > On Thu, Nov 18, 2021 at 06:46:44PM +0000, Alex Bennée wrote:
> >> The upcoming MTTCG tests don't need to be run for normal KVM unit
> >> tests so let's add the facility to have a custom set of tests.
> >
> > I think an environment variable override would be better than this command
> > line override, because then we could also get mkstandalone to work with
> > the new unittests.cfg files. Or, it may be better to just add them to
> > the main unittests.cfg with lines like these
> >
> > groups = nodefault mttcg
> > accel = tcg
> >
> > That'll "dirty" the logs with SKIP ... (test marked as manual run only)
> > for each one, but at least we won't easily forget about running them from
> > time to time.
> 
> So what is the meaning of accel here? Is it:
> 
>   - this test only runs on accel FOO
> 
> or
> 
>   - this test defaults to running on accel FOO
> 
> because while the tests are for TCG I want to run them on KVM (so I can
> validate the test on real HW). If I have accel=tcg then:
> 
>   env ACCEL=kvm QEMU=$HOME/lsrc/qemu.git/builds/all/qemu-system-aarch64 ./run_tests.sh -g mttcg
>   SKIP tlbflush-code::all_other (tcg only, but ACCEL=kvm)
>   SKIP tlbflush-code::page_other (tcg only, but ACCEL=kvm)
>   SKIP tlbflush-code::all_self (tcg only, but ACCEL=kvm)
>   ...
> 
> so I can either drop the accel line and rely on nodefault to ensure it
> doesn't run normally or make the env ACCEL processing less anal about
> preventing me running TCG tests under KVM. What do you think?

Just drop the 'accel = tcg' line. I only suggested it because I didn't
know you also wanted to run the MTTCG "specific" tests under KVM. Now
that I do, I wonder why we wouldn't run them all the time, i.e. no
nodefault group? Do the tests not exercise enough hypervisor code to
be worth the energy used to run them?
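
A sketch of what such an entry might look like in the main unittests.cfg
without the accel line. The test name comes from the SKIP output quoted
above; the file name is an assumption and any other per-test parameters
are left out.

```
[tlbflush-code::all_other]
# file name is assumed for illustration
file = tlbflush-code.flat
groups = nodefault mttcg
```

With that, ./run_tests.sh -g mttcg should select the tests by group, and
ACCEL=kvm should no longer cause them to be skipped.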

Thanks,
drew


* Re: [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set
  2021-12-01 16:41       ` Andrew Jones
@ 2021-12-01 17:07         ` Alex Bennée
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Bennée @ 2021-12-01 17:07 UTC (permalink / raw)
  To: Andrew Jones; +Cc: kvm, maz, qemu-arm, idan.horowitz, kvmarm, linux-arm-kernel


Andrew Jones <drjones@redhat.com> writes:

> On Wed, Dec 01, 2021 at 04:20:02PM +0000, Alex Bennée wrote:
>> 
>> Andrew Jones <drjones@redhat.com> writes:
>> 
>> > On Thu, Nov 18, 2021 at 06:46:44PM +0000, Alex Bennée wrote:
>> >> The upcoming MTTCG tests don't need to be run for normal KVM unit
>> >> tests so let's add the facility to have a custom set of tests.
>> >
>> > I think an environment variable override would be better than this command
>> > line override, because then we could also get mkstandalone to work with
>> > the new unittests.cfg files. Or, it may be better to just add them to
>> > the main unittests.cfg with lines like these
>> >
>> > groups = nodefault mttcg
>> > accel = tcg
>> >
>> > That'll "dirty" the logs with SKIP ... (test marked as manual run only)
>> > for each one, but at least we won't easily forget about running them from
>> > time to time.
>> 
>> So what is the meaning of accel here? Is it:
>> 
>>   - this test only runs on accel FOO
>> 
>> or
>> 
>>   - this test defaults to running on accel FOO
>> 
>> because while the tests are for TCG I want to run them on KVM (so I can
>> validate the test on real HW). If I have accel=tcg then:
>> 
>>   env ACCEL=kvm QEMU=$HOME/lsrc/qemu.git/builds/all/qemu-system-aarch64 ./run_tests.sh -g mttcg
>>   SKIP tlbflush-code::all_other (tcg only, but ACCEL=kvm)
>>   SKIP tlbflush-code::page_other (tcg only, but ACCEL=kvm)
>>   SKIP tlbflush-code::all_self (tcg only, but ACCEL=kvm)
>>   ...
>> 
>> so I can either drop the accel line and rely on nodefault to ensure it
>> doesn't run normally or make the env ACCEL processing less anal about
>> preventing me running TCG tests under KVM. What do you think?
>
> Just drop the 'accel = tcg' line. I only suggested it because I didn't
> know you also wanted to run the MTTCG "specific" tests under KVM. Now
> that I do, I wonder why we wouldn't run them all the time, i.e. no
> nodefault group? Do the tests not exercise enough hypervisor code to
> be worth the energy used to run them?

I think in most cases, if they fail under KVM, it wouldn't be due to the
hypervisor being broken but to the silicon not meeting its architectural
specification. I'm fine with them being nodefault for that reason.

I'm not sure how much host code the tlbflush-code test exercises. There
is a WIP tlbflush-data test which might make a case for being run more
regularly on KVM.

>
> Thanks,
> drew


-- 
Alex Bennée

Thread overview: 19+ messages
2021-11-18 18:46 [kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README Alex Bennée
2021-11-24 11:06   ` Andrew Jones
2021-11-24 11:08     ` Andrew Jones
2021-11-24 11:38     ` Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 02/10] arm/flat.lds: don't drop debug during link Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 03/10] Makefile: add GNU global tags support Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set Alex Bennée
2021-11-24 16:48   ` Andrew Jones
2021-12-01 16:20     ` Alex Bennée
2021-12-01 16:41       ` Andrew Jones
2021-12-01 17:07         ` Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 05/10] lib: add isaac prng library from CCAN Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 06/10] arm/tlbflush-code: TLB flush during code execution Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 07/10] arm/locking-tests: add comprehensive locking test Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 08/10] arm/barrier-litmus-tests: add simple mp and sal litmus tests Alex Bennée
2021-11-24 16:14   ` Andrew Jones
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 09/10] arm/run: use separate --accel form Alex Bennée
2021-11-18 18:46 ` [kvm-unit-tests PATCH v8 10/10] arm/tcg-test: some basic TCG exercising tests Alex Bennée
