kvmarm.lists.cs.columbia.edu archive mirror
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM
@ 2023-03-07 11:28 Alex Bennée
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 1/7] Makefile: add GNU global tags support Alex Bennée
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Alex Bennée @ 2023-03-07 11:28 UTC (permalink / raw)
  To: kvmarm
  Cc: kvm, qemu-devel, Thomas Huth, Andrew Jones, Paolo Bonzini,
	kvmarm, qemu-arm, Alex Bennée

I last had a go at getting these up-streamed at the end of 2021 so
its probably worth having another go. From the last iteration a
number of the groundwork patches did get merged:

  Subject: [kvm-unit-tests PATCH v9 0/9] MTTCG sanity tests for ARM
  Date: Thu,  2 Dec 2021 11:53:43 +0000
  Message-Id: <20211202115352.951548-1-alex.bennee@linaro.org>

So this leaves a minor gtags patch, adding the isaac RNG library which
would also be useful for other users, see:

  Subject: [kvm-unit-tests PATCH v3 11/27] lib: Add random number generator
  Date: Tue, 22 Nov 2022 18:11:36 +0200
  Message-Id: <20221122161152.293072-12-mlevitsk@redhat.com>

Otherwise there are a few minor checkpatch tweaks.

I would still like to enable KVM unit tests inside QEMU as things like
the x86 APIC tests are probably a better fit for unit testing TCG
emulation than booting a whole OS with various APICs enabled.

Alex Bennée (7):
  Makefile: add GNU global tags support
  add .gitpublish metadata
  lib: add isaac prng library from CCAN
  arm/tlbflush-code: TLB flush during code execution
  arm/locking-tests: add comprehensive locking test
  arm/barrier-litmus-tests: add simple mp and sal litmus tests
  arm/tcg-test: some basic TCG exercising tests

 Makefile                  |   5 +-
 arm/Makefile.arm          |   2 +
 arm/Makefile.arm64        |   2 +
 arm/Makefile.common       |   6 +-
 lib/arm/asm/barrier.h     |  19 ++
 lib/arm64/asm/barrier.h   |  50 +++++
 lib/prng.h                |  83 +++++++
 lib/prng.c                | 163 ++++++++++++++
 arm/tcg-test-asm.S        | 171 +++++++++++++++
 arm/tcg-test-asm64.S      | 170 ++++++++++++++
 arm/barrier-litmus-test.c | 450 ++++++++++++++++++++++++++++++++++++++
 arm/locking-test.c        | 321 +++++++++++++++++++++++++++
 arm/spinlock-test.c       |  87 --------
 arm/tcg-test.c            | 340 ++++++++++++++++++++++++++++
 arm/tlbflush-code.c       | 209 ++++++++++++++++++
 arm/unittests.cfg         | 170 ++++++++++++++
 .gitignore                |   3 +
 .gitpublish               |  18 ++
 18 files changed, 2180 insertions(+), 89 deletions(-)
 create mode 100644 lib/prng.h
 create mode 100644 lib/prng.c
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/barrier-litmus-test.c
 create mode 100644 arm/locking-test.c
 delete mode 100644 arm/spinlock-test.c
 create mode 100644 arm/tcg-test.c
 create mode 100644 arm/tlbflush-code.c
 create mode 100644 .gitpublish

-- 
2.39.2


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [kvm-unit-tests PATCH v10 1/7] Makefile: add GNU global tags support
  2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
@ 2023-03-07 11:28 ` Alex Bennée
  2023-03-21 14:49   ` Andrew Jones
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 2/7] add .gitpublish metadata Alex Bennée
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-03-07 11:28 UTC (permalink / raw)
  To: kvmarm
  Cc: kvm, qemu-devel, Thomas Huth, Andrew Jones, Paolo Bonzini,
	kvmarm, qemu-arm, Alex Bennée

If you have ctags you might as well offer gtags as a target.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20211118184650.661575-4-alex.bennee@linaro.org>

---
v10
  - update .gitignore
---
 Makefile   | 5 ++++-
 .gitignore | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 6ed5deac..f22179de 100644
--- a/Makefile
+++ b/Makefile
@@ -145,6 +145,9 @@ cscope:
 		-name '*.[chsS]' -exec realpath --relative-base=$(CURDIR) {} \; | sort -u > ./cscope.files
 	cscope -bk
 
-.PHONY: tags
+.PHONY: tags gtags
 tags:
 	ctags -R
+
+gtags:
+	gtags
diff --git a/.gitignore b/.gitignore
index 33529b65..4d5f460f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -12,6 +12,9 @@ tags
 patches
 .stgit-*
 cscope.*
+GPATH
+GRTAGS
+GTAGS
 *.swp
 /lib/asm
 /lib/config.h
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [kvm-unit-tests PATCH v10 2/7] add .gitpublish metadata
  2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 1/7] Makefile: add GNU global tags support Alex Bennée
@ 2023-03-07 11:28 ` Alex Bennée
  2023-03-21 14:53   ` Andrew Jones
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 3/7] lib: add isaac prng library from CCAN Alex Bennée
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-03-07 11:28 UTC (permalink / raw)
  To: kvmarm
  Cc: kvm, qemu-devel, Thomas Huth, Andrew Jones, Paolo Bonzini,
	kvmarm, qemu-arm, Alex Bennée

This allows for users of git-publish to have default routing for kvm
and kvmarm patches.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 .gitpublish | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
 create mode 100644 .gitpublish

diff --git a/.gitpublish b/.gitpublish
new file mode 100644
index 00000000..39130f93
--- /dev/null
+++ b/.gitpublish
@@ -0,0 +1,18 @@
+#
+# Common git-publish profiles that can be used to send patches to QEMU upstream.
+#
+# See https://github.com/stefanha/git-publish for more information
+#
+[gitpublishprofile "default"]
+base = master
+to = kvm@vger.kernel.org
+cc = qemu-devel@nongnu.org
+cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
+
+[gitpublishprofile "arm"]
+base = master
+to = kvmarm@lists.cs.columbia.edu
+cc = kvm@vger.kernel.org
+cc = qemu-devel@nongnu.org
+cc = qemu-arm@nongnu.org
+cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [kvm-unit-tests PATCH v10 3/7] lib: add isaac prng library from CCAN
  2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 1/7] Makefile: add GNU global tags support Alex Bennée
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 2/7] add .gitpublish metadata Alex Bennée
@ 2023-03-07 11:28 ` Alex Bennée
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution Alex Bennée
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Alex Bennée @ 2023-03-07 11:28 UTC (permalink / raw)
  To: kvmarm
  Cc: kvm, qemu-devel, Thomas Huth, Andrew Jones, Paolo Bonzini,
	kvmarm, qemu-arm, Alex Bennée, Timothy B . Terriberry,
	Andrew Jones

It's often useful to introduce some sort of random variation when
testing several racing CPU conditions. Instead of each test implementing
some half-arsed PRNG bring in a a decent one which has good statistical
randomness. Obviously it is deterministic for a given seed value which
is likely the behaviour you want.

I've pulled in the ISAAC library from CCAN:

    http://ccodearchive.net/info/isaac.html

I shaved off the float related stuff which is less useful for unit
testing and re-indented to fit the style. The original license was
CC0 (Public Domain) which is compatible with the LGPL v2 of
kvm-unit-tests.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
CC: Timothy B. Terriberry <tterribe@xiph.org>
Acked-by: Andrew Jones <drjones@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20211118184650.661575-6-alex.bennee@linaro.org>

---
v9
  - add SPDX CC0 tags
---
 arm/Makefile.common |   1 +
 lib/prng.h          |  83 ++++++++++++++++++++++
 lib/prng.c          | 163 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 247 insertions(+)
 create mode 100644 lib/prng.h
 create mode 100644 lib/prng.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 1bbec64f..16f8c6df 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -45,6 +45,7 @@ cflatobjs += lib/pci-testdev.o
 cflatobjs += lib/virtio.o
 cflatobjs += lib/virtio-mmio.o
 cflatobjs += lib/chr-testdev.o
+cflatobjs += lib/prng.o
 cflatobjs += lib/arm/io.o
 cflatobjs += lib/arm/setup.o
 cflatobjs += lib/arm/mmu.o
diff --git a/lib/prng.h b/lib/prng.h
new file mode 100644
index 00000000..4f9512f5
--- /dev/null
+++ b/lib/prng.h
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: CC0-1.0
+/*
+ * PRNG Header
+ */
+#ifndef __PRNG_H__
+#define __PRNG_H__
+
+# include <stdint.h>
+
+
+
+typedef struct isaac_ctx isaac_ctx;
+
+
+
+/*This value may be lowered to reduce memory usage on embedded platforms, at
+  the cost of reducing security and increasing bias.
+  Quoting Bob Jenkins: "The current best guess is that bias is detectable after
+  2**37 values for [ISAAC_SZ_LOG]=3, 2**45 for 4, 2**53 for 5, 2**61 for 6,
+  2**69 for 7, and 2**77 values for [ISAAC_SZ_LOG]=8."*/
+#define ISAAC_SZ_LOG      (8)
+#define ISAAC_SZ          (1<<ISAAC_SZ_LOG)
+#define ISAAC_SEED_SZ_MAX (ISAAC_SZ<<2)
+
+
+
+/*ISAAC is the most advanced of a series of pseudo-random number generators
+  designed by Robert J. Jenkins Jr. in 1996.
+  http://www.burtleburtle.net/bob/rand/isaac.html
+  To quote:
+  No efficient method is known for deducing their internal states.
+  ISAAC requires an amortized 18.75 instructions to produce a 32-bit value.
+  There are no cycles in ISAAC shorter than 2**40 values.
+  The expected cycle length is 2**8295 values.*/
+struct isaac_ctx{
+	unsigned n;
+	uint32_t r[ISAAC_SZ];
+	uint32_t m[ISAAC_SZ];
+	uint32_t a;
+	uint32_t b;
+	uint32_t c;
+};
+
+
+/**
+ * isaac_init - Initialize an instance of the ISAAC random number generator.
+ * @_ctx:   The instance to initialize.
+ * @_seed:  The specified seed bytes.
+ *          This may be NULL if _nseed is less than or equal to zero.
+ * @_nseed: The number of bytes to use for the seed.
+ *          If this is greater than ISAAC_SEED_SZ_MAX, the extra bytes are
+ *           ignored.
+ */
+void isaac_init(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed);
+
+/**
+ * isaac_reseed - Mix a new batch of entropy into the current state.
+ * To reset ISAAC to a known state, call isaac_init() again instead.
+ * @_ctx:   The instance to reseed.
+ * @_seed:  The specified seed bytes.
+ *          This may be NULL if _nseed is zero.
+ * @_nseed: The number of bytes to use for the seed.
+ *          If this is greater than ISAAC_SEED_SZ_MAX, the extra bytes are
+ *           ignored.
+ */
+void isaac_reseed(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed);
+/**
+ * isaac_next_uint32 - Return the next random 32-bit value.
+ * @_ctx: The ISAAC instance to generate the value with.
+ */
+uint32_t isaac_next_uint32(isaac_ctx *_ctx);
+/**
+ * isaac_next_uint - Uniform random integer less than the given value.
+ * @_ctx: The ISAAC instance to generate the value with.
+ * @_n:   The upper bound on the range of numbers returned (not inclusive).
+ *        This must be greater than zero and less than 2**32.
+ *        To return integers in the full range 0...2**32-1, use
+ *         isaac_next_uint32() instead.
+ * Return: An integer uniformly distributed between 0 and _n-1 (inclusive).
+ */
+uint32_t isaac_next_uint(isaac_ctx *_ctx,uint32_t _n);
+
+#endif
diff --git a/lib/prng.c b/lib/prng.c
new file mode 100644
index 00000000..cfad8c54
--- /dev/null
+++ b/lib/prng.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: CC0-1.0
+/*
+ * Pseudo Random Number Generator
+ *
+ * Lifted from ccan modules ilog/isaac under CC0
+ *   - http://ccodearchive.net/info/isaac.html
+ *   - http://ccodearchive.net/info/ilog.html
+ *
+ * And lightly hacked to compile under the KVM unit test environment.
+ * This provides a handy RNG for torture tests that want to vary
+ * delays and the like.
+ *
+ */
+
+/*Written by Timothy B. Terriberry (tterribe@xiph.org) 1999-2009.
+  CC0 (Public domain) - see LICENSE file for details
+  Based on the public domain implementation by Robert J. Jenkins Jr.*/
+
+#include "libcflat.h"
+
+#include <string.h>
+#include "prng.h"
+
+#define ISAAC_MASK        (0xFFFFFFFFU)
+
+/* Extract ISAAC_SZ_LOG bits (starting at bit 2). */
+static inline uint32_t lower_bits(uint32_t x)
+{
+	return (x & ((ISAAC_SZ-1) << 2)) >> 2;
+}
+
+/* Extract next ISAAC_SZ_LOG bits (starting at bit ISAAC_SZ_LOG+2). */
+static inline uint32_t upper_bits(uint32_t y)
+{
+	return (y >> (ISAAC_SZ_LOG+2)) & (ISAAC_SZ-1);
+}
+
+static void isaac_update(isaac_ctx *_ctx){
+	uint32_t *m;
+	uint32_t *r;
+	uint32_t  a;
+	uint32_t  b;
+	uint32_t  x;
+	uint32_t  y;
+	int       i;
+	m=_ctx->m;
+	r=_ctx->r;
+	a=_ctx->a;
+	b=_ctx->b+(++_ctx->c);
+	for(i=0;i<ISAAC_SZ/2;i++){
+		x=m[i];
+		a=(a^a<<13)+m[i+ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a>>6)+m[i+ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a<<2)+m[i+ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a>>16)+m[i+ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+	}
+	for(i=ISAAC_SZ/2;i<ISAAC_SZ;i++){
+		x=m[i];
+		a=(a^a<<13)+m[i-ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a>>6)+m[i-ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a<<2)+m[i-ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+		x=m[++i];
+		a=(a^a>>16)+m[i-ISAAC_SZ/2];
+		m[i]=y=m[lower_bits(x)]+a+b;
+		r[i]=b=m[upper_bits(y)]+x;
+	}
+	_ctx->b=b;
+	_ctx->a=a;
+	_ctx->n=ISAAC_SZ;
+}
+
+static void isaac_mix(uint32_t _x[8]){
+	static const unsigned char SHIFT[8]={11,2,8,16,10,4,8,9};
+	int i;
+	for(i=0;i<8;i++){
+		_x[i]^=_x[(i+1)&7]<<SHIFT[i];
+		_x[(i+3)&7]+=_x[i];
+		_x[(i+1)&7]+=_x[(i+2)&7];
+		i++;
+		_x[i]^=_x[(i+1)&7]>>SHIFT[i];
+		_x[(i+3)&7]+=_x[i];
+		_x[(i+1)&7]+=_x[(i+2)&7];
+	}
+}
+
+
+void isaac_init(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed){
+	_ctx->a=_ctx->b=_ctx->c=0;
+	memset(_ctx->r,0,sizeof(_ctx->r));
+	isaac_reseed(_ctx,_seed,_nseed);
+}
+
+void isaac_reseed(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed){
+	uint32_t *m;
+	uint32_t *r;
+	uint32_t  x[8];
+	int       i;
+	int       j;
+	m=_ctx->m;
+	r=_ctx->r;
+	if(_nseed>ISAAC_SEED_SZ_MAX)_nseed=ISAAC_SEED_SZ_MAX;
+	for(i=0;i<_nseed>>2;i++){
+		r[i]^=(uint32_t)_seed[i<<2|3]<<24|(uint32_t)_seed[i<<2|2]<<16|
+			(uint32_t)_seed[i<<2|1]<<8|_seed[i<<2];
+	}
+	_nseed-=i<<2;
+	if(_nseed>0){
+		uint32_t ri;
+		ri=_seed[i<<2];
+		for(j=1;j<_nseed;j++)ri|=(uint32_t)_seed[i<<2|j]<<(j<<3);
+		r[i++]^=ri;
+	}
+	x[0]=x[1]=x[2]=x[3]=x[4]=x[5]=x[6]=x[7]=0x9E3779B9U;
+	for(i=0;i<4;i++)isaac_mix(x);
+	for(i=0;i<ISAAC_SZ;i+=8){
+		for(j=0;j<8;j++)x[j]+=r[i+j];
+		isaac_mix(x);
+		memcpy(m+i,x,sizeof(x));
+	}
+	for(i=0;i<ISAAC_SZ;i+=8){
+		for(j=0;j<8;j++)x[j]+=m[i+j];
+		isaac_mix(x);
+		memcpy(m+i,x,sizeof(x));
+	}
+	isaac_update(_ctx);
+}
+
+uint32_t isaac_next_uint32(isaac_ctx *_ctx){
+	if(!_ctx->n)isaac_update(_ctx);
+	return _ctx->r[--_ctx->n];
+}
+
+uint32_t isaac_next_uint(isaac_ctx *_ctx,uint32_t _n){
+	uint32_t r;
+	uint32_t v;
+	uint32_t d;
+	do{
+		r=isaac_next_uint32(_ctx);
+		v=r%_n;
+		d=r-v;
+	}
+	while(((d+_n-1)&ISAAC_MASK)<d);
+	return v;
+}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution
  2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
                   ` (2 preceding siblings ...)
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 3/7] lib: add isaac prng library from CCAN Alex Bennée
@ 2023-03-07 11:28 ` Alex Bennée
  2023-03-21 15:02   ` Andrew Jones
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 5/7] arm/locking-tests: add comprehensive locking test Alex Bennée
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-03-07 11:28 UTC (permalink / raw)
  To: kvmarm
  Cc: kvm, qemu-devel, Thomas Huth, Andrew Jones, Paolo Bonzini,
	kvmarm, qemu-arm, Alex Bennée, Mark Rutland

This adds a fairly brain dead torture test for TLB flushes intended
for stressing the MTTCG QEMU build. It takes the usual -smp option for
multiple CPUs.

By default it CPU0 will do a TLBIALL flush after each cycle. You can
pass options via -append to control additional aspects of the test:

  - "page" flush each page in turn (one per function)
  - "self" do the flush after each computation cycle
  - "verbose" report progress on each computation cycle

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
CC: Mark Rutland <mark.rutland@arm.com>
Message-Id: <20211118184650.661575-7-alex.bennee@linaro.org>

---
v9
  - move tests back into unittests.cfg (with nodefault mttcg)
  - replace printf with report_info
  - drop accel = tcg
---
 arm/Makefile.common |   1 +
 arm/tlbflush-code.c | 209 ++++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg   |  25 ++++++
 3 files changed, 235 insertions(+)
 create mode 100644 arm/tlbflush-code.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 16f8c6df..2c4aad38 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -12,6 +12,7 @@ tests-common += $(TEST_DIR)/gic.flat
 tests-common += $(TEST_DIR)/psci.flat
 tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
+tests-common += $(TEST_DIR)/tlbflush-code.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/tlbflush-code.c b/arm/tlbflush-code.c
new file mode 100644
index 00000000..bf9eb111
--- /dev/null
+++ b/arm/tlbflush-code.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * TLB Flush Race Tests
+ *
+ * These tests are designed to test for incorrect TLB flush semantics
+ * under emulation. The initial CPU will set all the others working a
+ * compuation task and will then trigger TLB flushes across the
+ * system. It doesn't actually need to re-map anything but the flushes
+ * themselves will trigger QEMU's TCG self-modifying code detection
+ * which will invalidate any generated  code causing re-translation.
+ * Eventually the code buffer will fill and a general tb_lush() will
+ * be triggered.
+ *
+ * Copyright (C) 2016-2021, Linaro, Alex Bennée <alex.bennee@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+
+#include <libcflat.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+
+#define SEQ_LENGTH 10
+#define SEQ_HASH 0x7cd707fe
+
+static cpumask_t smp_test_complete;
+static int flush_count = 1000000;
+static bool flush_self;
+static bool flush_page;
+static bool flush_verbose;
+
+/*
+ * Work functions
+ *
+ * These work functions need to be:
+ *
+ *  - page aligned, so we can flush one function at a time
+ *  - have branches, so QEMU TCG generates multiple basic blocks
+ *  - call across pages, so we exercise the TCG basic block slow path
+ */
+
+/* Adler32 */
+__attribute__((aligned(PAGE_SIZE))) static
+uint32_t hash_array(const void *buf, size_t buflen)
+{
+	const uint8_t *data = (uint8_t *) buf;
+	uint32_t s1 = 1;
+	uint32_t s2 = 0;
+
+	for (size_t n = 0; n < buflen; n++) {
+		s1 = (s1 + data[n]) % 65521;
+		s2 = (s2 + s1) % 65521;
+	}
+	return (s2 << 16) | s1;
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+void create_fib_sequence(int length, unsigned int *array)
+{
+	int i;
+
+	/* first two values */
+	array[0] = 0;
+	array[1] = 1;
+	for (i = 2; i < length; i++)
+		array[i] = array[i-2] + array[i-1];
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+unsigned long long factorial(unsigned int n)
+{
+	unsigned int i;
+	unsigned long long fac = 1;
+
+	for (i = 1; i <= n; i++)
+		fac = fac * i;
+	return fac;
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+void factorial_array(unsigned int n, unsigned int *input,
+		     unsigned long long *output)
+{
+	unsigned int i;
+
+	for (i = 0; i < n; i++)
+		output[i] = factorial(input[i]);
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+unsigned int do_computation(void)
+{
+	unsigned int fib_array[SEQ_LENGTH];
+	unsigned long long facfib_array[SEQ_LENGTH];
+	uint32_t fib_hash, facfib_hash;
+
+	create_fib_sequence(SEQ_LENGTH, &fib_array[0]);
+	fib_hash = hash_array(&fib_array[0], sizeof(fib_array));
+	factorial_array(SEQ_LENGTH, &fib_array[0], &facfib_array[0]);
+	facfib_hash = hash_array(&facfib_array[0], sizeof(facfib_array));
+
+	return (fib_hash ^ facfib_hash);
+}
+
+/* This provides a table of the work functions so we can flush each
+ * page individually
+ */
+static void *pages[] = {&hash_array, &create_fib_sequence, &factorial,
+			&factorial_array, &do_computation};
+
+static void do_flush(int i)
+{
+	if (flush_page)
+		flush_tlb_page((unsigned long)pages[i % ARRAY_SIZE(pages)]);
+	else
+		flush_tlb_all();
+}
+
+
+static void just_compute(void)
+{
+	int i, errors = 0;
+	int cpu = smp_processor_id();
+
+	uint32_t result;
+
+	report_info("CPU%d online", cpu);
+
+	for (i = 0 ; i < flush_count; i++) {
+		result = do_computation();
+
+		if (result != SEQ_HASH) {
+			errors++;
+			report_info("CPU%d: seq%d 0x%"PRIx32"!=0x%x",
+				    cpu, i, result, SEQ_HASH);
+		}
+
+		if (flush_verbose && (i % 1000) == 0)
+			report_info("CPU%d: seq%d", cpu, i);
+
+		if (flush_self)
+			do_flush(i);
+	}
+
+	report(errors == 0, "CPU%d: Done - Errors: %d", cpu, errors);
+
+	cpumask_set_cpu(cpu, &smp_test_complete);
+	if (cpu != 0)
+		halt();
+}
+
+static void just_flush(void)
+{
+	int cpu = smp_processor_id();
+	int i = 0;
+
+	/*
+	 * Set our CPU as done, keep flushing until everyone else
+	 * finished
+	 */
+	cpumask_set_cpu(cpu, &smp_test_complete);
+
+	while (!cpumask_full(&smp_test_complete))
+		do_flush(i++);
+
+	report_info("CPU%d: Done - Triggered %d flushes", cpu, i);
+}
+
+int main(int argc, char **argv)
+{
+	int cpu, i;
+	char prefix[100];
+
+	for (i = 0; i < argc; i++) {
+		char *arg = argv[i];
+
+		if (strcmp(arg, "page") == 0)
+			flush_page = true;
+
+		if (strcmp(arg, "self") == 0)
+			flush_self = true;
+
+		if (strcmp(arg, "verbose") == 0)
+			flush_verbose = true;
+	}
+
+	snprintf(prefix, sizeof(prefix), "tlbflush_%s_%s",
+		 flush_page ? "page" : "all",
+		 flush_self ? "self" : "other");
+	report_prefix_push(prefix);
+
+	for_each_present_cpu(cpu) {
+		if (cpu == 0)
+			continue;
+		smp_boot_secondary(cpu, just_compute);
+	}
+
+	if (flush_self)
+		just_compute();
+	else
+		just_flush();
+
+	while (!cpumask_full(&smp_test_complete))
+		cpu_relax();
+
+	return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 5e67b558..ee21aef4 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -275,3 +275,28 @@ file = debug.flat
 arch = arm64
 extra_params = -append 'ss-migration'
 groups = debug migration
+
+# TLB Torture Tests
+[tlbflush-code::all_other]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+groups = nodefault mttcg
+
+[tlbflush-code::page_other]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'page'
+groups = nodefault mttcg
+
+[tlbflush-code::all_self]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'self'
+groups = nodefault mttcg
+
+[tlbflush-code::page_self]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'page self'
+groups = nodefault mttcg
+
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [kvm-unit-tests PATCH v10 5/7] arm/locking-tests: add comprehensive locking test
  2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
                   ` (3 preceding siblings ...)
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution Alex Bennée
@ 2023-03-07 11:28 ` Alex Bennée
  2023-03-21 15:05   ` Andrew Jones
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 6/7] arm/barrier-litmus-tests: add simple mp and sal litmus tests Alex Bennée
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-03-07 11:28 UTC (permalink / raw)
  To: kvmarm
  Cc: kvm, qemu-devel, Thomas Huth, Andrew Jones, Paolo Bonzini,
	kvmarm, qemu-arm, Alex Bennée

This test has been written mainly to stress multi-threaded TCG behaviour
but will demonstrate failure by default on real hardware. The test takes
the following parameters:

  - "lock" use GCC's locking semantics
  - "atomic" use GCC's __atomic primitives
  - "wfelock" use WaitForEvent sleep
  - "excl" use load/store exclusive semantics

Also two more options allow the test to be tweaked

  - "noshuffle" disables the memory shuffling
  - "count=%ld" set your own per-CPU increment count

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20211118184650.661575-8-alex.bennee@linaro.org>

---
v9
  - move back to unittests.cfg, drop accel=tcg
  - s/printf/report_info
v10
  - dropped spare extra line in shuffle_memory
---
 arm/Makefile.common |   2 +-
 arm/locking-test.c  | 321 ++++++++++++++++++++++++++++++++++++++++++++
 arm/spinlock-test.c |  87 ------------
 arm/unittests.cfg   |  30 +++++
 4 files changed, 352 insertions(+), 88 deletions(-)
 create mode 100644 arm/locking-test.c
 delete mode 100644 arm/spinlock-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 2c4aad38..3089e3bf 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -5,7 +5,6 @@
 #
 
 tests-common  = $(TEST_DIR)/selftest.flat
-tests-common += $(TEST_DIR)/spinlock-test.flat
 tests-common += $(TEST_DIR)/pci-test.flat
 tests-common += $(TEST_DIR)/pmu.flat
 tests-common += $(TEST_DIR)/gic.flat
@@ -13,6 +12,7 @@ tests-common += $(TEST_DIR)/psci.flat
 tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
+tests-common += $(TEST_DIR)/locking-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/locking-test.c b/arm/locking-test.c
new file mode 100644
index 00000000..a49c2fd1
--- /dev/null
+++ b/arm/locking-test.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Locking Test
+ *
+ * This test allows us to stress the various atomic primitives of a VM
+ * guest. A number of methods are available that use various patterns
+ * to implement a lock.
+ *
+ * Copyright (C) 2017 Linaro
+ * Author: Alex Bennée <alex.bennee@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <libcflat.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+
+#include <prng.h>
+
+#define MAX_CPUS 8
+
+/* Test definition structure
+ *
+ * A simple structure that describes the test name, expected pass and
+ * increment function.
+ */
+
+/* Function pointers for test */
+typedef void (*inc_fn)(int cpu);
+
+typedef struct {
+	const char *test_name;
+	bool  should_pass;
+	inc_fn main_fn;
+} test_descr_t;
+
+/* How many increments to do */
+static int increment_count = 1000000;
+static bool do_shuffle = true;
+
+/* Shared value all the tests attempt to safely increment using
+ * various forms of atomic locking and exclusive behaviour.
+ */
+static unsigned int shared_value;
+
+/* PAGE_SIZE * uint32_t means we span several pages */
+__attribute__((aligned(PAGE_SIZE))) static uint32_t memory_array[PAGE_SIZE];
+
+/* We use the alignment of the following to ensure accesses to locking
+ * and synchronisation primatives don't interfere with the page of the
+ * shared value
+ */
+__attribute__((aligned(PAGE_SIZE))) static unsigned int per_cpu_value[MAX_CPUS];
+__attribute__((aligned(PAGE_SIZE))) static cpumask_t smp_test_complete;
+__attribute__((aligned(PAGE_SIZE))) struct isaac_ctx prng_context[MAX_CPUS];
+
+/* Some of the approaches use a global lock to prevent contention. */
+static int global_lock;
+
+/* In any SMP setting this *should* fail due to cores stepping on
+ * each other updating the shared variable
+ */
+static void increment_shared(int cpu)
+{
+	(void)cpu;
+
+	shared_value++;
+}
+
+/* GCC __sync primitives are deprecated in favour of __atomic */
+static void increment_shared_with_lock(int cpu)
+{
+	(void)cpu;
+
+	while (__sync_lock_test_and_set(&global_lock, 1));
+
+	shared_value++;
+
+	__sync_lock_release(&global_lock);
+}
+
+/*
+ * In practice even __ATOMIC_RELAXED uses ARM's ldxr/stex exclusive
+ * semantics
+ */
+static void increment_shared_with_atomic(int cpu)
+{
+	(void)cpu;
+
+	__atomic_add_fetch(&shared_value, 1, __ATOMIC_SEQ_CST);
+}
+
+
+/*
+ * Load/store exclusive with WFE (wait-for-event)
+ *
+ * See ARMv8 ARM examples:
+ *   Use of Wait For Event (WFE) and Send Event (SEV) with locks
+ */
+
+static void increment_shared_with_wfelock(int cpu)
+{
+	(void)cpu;
+
+#if defined(__aarch64__)
+	asm volatile(
+	"	mov     w1, #1\n"
+	"       sevl\n"
+	"       prfm PSTL1KEEP, [%[lock]]\n"
+	"1:     wfe\n"
+	"	ldaxr	w0, [%[lock]]\n"
+	"	cbnz    w0, 1b\n"
+	"	stxr    w0, w1, [%[lock]]\n"
+	"	cbnz	w0, 1b\n"
+	/* lock held */
+	"	ldr	w0, [%[sptr]]\n"
+	"	add	w0, w0, #0x1\n"
+	"	str	w0, [%[sptr]]\n"
+	/* now release */
+	"	stlr	wzr, [%[lock]]\n"
+	: /* out */
+	: [lock] "r" (&global_lock), [sptr] "r" (&shared_value) /* in */
+	: "w0", "w1", "cc");
+#else
+	asm volatile(
+	"	mov     r1, #1\n"
+	"1:	ldrex	r0, [%[lock]]\n"
+	"	cmp     r0, #0\n"
+	"	wfene\n"
+	"	strexeq r0, r1, [%[lock]]\n"
+	"	cmpeq	r0, #0\n"
+	"	bne	1b\n"
+	"	dmb\n"
+	/* lock held */
+	"	ldr	r0, [%[sptr]]\n"
+	"	add	r0, r0, #0x1\n"
+	"	str	r0, [%[sptr]]\n"
+	/* now release */
+	"	mov	r0, #0\n"
+	"	dmb\n"
+	"	str	r0, [%[lock]]\n"
+	"	dsb\n"
+	"	sev\n"
+	: /* out */
+	: [lock] "r" (&global_lock), [sptr] "r" (&shared_value) /* in */
+	: "r0", "r1", "cc");
+#endif
+}
+
+
+/*
+ * Hand-written version of the load/store exclusive
+ */
+static void increment_shared_with_excl(int cpu)
+{
+	(void)cpu;
+
+#if defined(__aarch64__)
+	asm volatile(
+	"1:	ldxr	w0, [%[sptr]]\n"
+	"	add     w0, w0, #0x1\n"
+	"	stxr	w1, w0, [%[sptr]]\n"
+	"	cbnz	w1, 1b\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "w0", "w1", "cc");
+#else
+	asm volatile(
+	"1:	ldrex	r0, [%[sptr]]\n"
+	"	add     r0, r0, #0x1\n"
+	"	strex	r1, r0, [%[sptr]]\n"
+	"	cmp	r1, #0\n"
+	"	bne	1b\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "r0", "r1", "cc");
+#endif
+}
+
+/* Test array */
+static test_descr_t tests[] = {
+	{ "none", false, increment_shared },
+	{ "lock", true, increment_shared_with_lock },
+	{ "atomic", true, increment_shared_with_atomic },
+	{ "wfelock", true, increment_shared_with_wfelock },
+	{ "excl", true, increment_shared_with_excl }
+};
+
+/* The idea of this is just to generate some random load/store
+ * activity which may or may not race with an un-barried incremented
+ * of the shared counter
+ */
+static void shuffle_memory(int cpu)
+{
+	int i;
+	uint32_t lspat = isaac_next_uint32(&prng_context[cpu]);
+	uint32_t seq = isaac_next_uint32(&prng_context[cpu]);
+	int count = seq & 0x1f;
+	uint32_t val = 0;
+
+	seq >>= 5;
+
+	for (i = 0; i < count; i++) {
+		int index = seq & ~PAGE_MASK;
+
+		if (lspat & 1)
+			val ^= memory_array[index];
+		else
+			memory_array[index] = val;
+
+		seq >>= PAGE_SHIFT;
+		seq ^= lspat;
+		lspat >>= 1;
+	}
+}
+
+static inc_fn increment_function;
+
+static void do_increment(void)
+{
+	int i;
+	int cpu = smp_processor_id();
+
+	report_info("CPU%d: online and ++ing", cpu);
+
+	for (i = 0; i < increment_count; i++) {
+		per_cpu_value[cpu]++;
+		increment_function(cpu);
+
+		if (do_shuffle)
+			shuffle_memory(cpu);
+	}
+
+	report_info("CPU%d: Done, %d incs\n", cpu, per_cpu_value[cpu]);
+
+	cpumask_set_cpu(cpu, &smp_test_complete);
+	if (cpu != 0)
+		halt();
+}
+
+static void setup_and_run_test(test_descr_t *test)
+{
+	unsigned int i, sum = 0;
+	int cpu, cpu_cnt = 0;
+
+	increment_function = test->main_fn;
+
+	/* fill our random page */
+	for (i = 0; i < PAGE_SIZE; i++)
+		memory_array[i] = isaac_next_uint32(&prng_context[0]);
+
+	for_each_present_cpu(cpu) {
+		uint32_t seed2 = isaac_next_uint32(&prng_context[0]);
+
+		cpu_cnt++;
+		if (cpu == 0)
+			continue;
+
+		isaac_init(&prng_context[cpu], (unsigned char *) &seed2, sizeof(seed2));
+		smp_boot_secondary(cpu, do_increment);
+	}
+
+	do_increment();
+
+	while (!cpumask_full(&smp_test_complete))
+		cpu_relax();
+
+	/* All CPUs done, do we add up */
+	for_each_present_cpu(cpu) {
+		sum += per_cpu_value[cpu];
+	}
+
+	if (test->should_pass)
+		report(sum == shared_value, "total incs %d", shared_value);
+	else
+		report_xfail(true, sum == shared_value, "total incs %d", shared_value);
+}
+
+int main(int argc, char **argv)
+{
+	static const unsigned char seed[] = "myseed";
+	test_descr_t *test = &tests[0];
+	int i;
+	unsigned int j;
+
+	isaac_init(&prng_context[0], &seed[0], sizeof(seed));
+
+	for (i = 0; i < argc; i++) {
+		char *arg = argv[i];
+
+		/* Check for test name */
+		for (j = 0; j < ARRAY_SIZE(tests); j++) {
+			if (strcmp(arg, tests[j].test_name) == 0)
+				test = &tests[j];
+		}
+
+		/* Test modifiers */
+		if (strcmp(arg, "noshuffle") == 0) {
+			do_shuffle = false;
+			report_prefix_push("noshuffle");
+		} else if (strstr(arg, "count=") != NULL) {
+			char *p = strstr(arg, "=");
+
+			increment_count = atol(p+1);
+		} else {
+			isaac_reseed(&prng_context[0], (unsigned char *) arg, strlen(arg));
+		}
+	}
+
+	if (test)
+		setup_and_run_test(test);
+	else
+		report(false, "Unknown test");
+
+	return report_summary();
+}
diff --git a/arm/spinlock-test.c b/arm/spinlock-test.c
deleted file mode 100644
index 73aea76a..00000000
--- a/arm/spinlock-test.c
+++ /dev/null
@@ -1,87 +0,0 @@
-/*
- * Spinlock test
- *
- * This code is based on code from the tcg_baremetal_tests.
- *
- * Copyright (C) 2015 Virtual Open Systems SAS
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <libcflat.h>
-#include <asm/smp.h>
-#include <asm/barrier.h>
-
-#define LOOP_SIZE 10000000
-
-struct lock_ops {
-	void (*lock)(int *v);
-	void (*unlock)(int *v);
-};
-static struct lock_ops lock_ops;
-
-static void gcc_builtin_lock(int *lock_var)
-{
-	while (__sync_lock_test_and_set(lock_var, 1));
-}
-static void gcc_builtin_unlock(int *lock_var)
-{
-	__sync_lock_release(lock_var);
-}
-static void none_lock(int *lock_var)
-{
-	while (*(volatile int *)lock_var != 0);
-	*(volatile int *)lock_var = 1;
-}
-static void none_unlock(int *lock_var)
-{
-	*(volatile int *)lock_var = 0;
-}
-
-static int global_a, global_b;
-static int global_lock;
-
-static void test_spinlock(void *data __unused)
-{
-	int i, errors = 0;
-	int cpu = smp_processor_id();
-
-	printf("CPU%d online\n", cpu);
-
-	for (i = 0; i < LOOP_SIZE; i++) {
-
-		lock_ops.lock(&global_lock);
-
-		if (global_a == (cpu + 1) % 2) {
-			global_a = 1;
-			global_b = 0;
-		} else {
-			global_a = 0;
-			global_b = 1;
-		}
-
-		if (global_a == global_b)
-			errors++;
-
-		lock_ops.unlock(&global_lock);
-	}
-	report(errors == 0, "CPU%d: Done - Errors: %d", cpu, errors);
-}
-
-int main(int argc, char **argv)
-{
-	report_prefix_push("spinlock");
-	if (argc > 1 && strcmp(argv[1], "bad") != 0) {
-		lock_ops.lock = gcc_builtin_lock;
-		lock_ops.unlock = gcc_builtin_unlock;
-	} else {
-		lock_ops.lock = none_lock;
-		lock_ops.unlock = none_unlock;
-	}
-
-	on_cpus(test_spinlock, NULL);
-
-	return report_summary();
-}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index ee21aef4..45ac61c8 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -300,3 +300,33 @@ smp = $(($MAX_SMP>4?4:$MAX_SMP))
 extra_params = -append 'page self'
 groups = nodefault mttcg
 
+# Locking tests
+[locking::none]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+groups = nodefault mttcg locking
+
+[locking::lock]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'lock'
+groups = nodefault mttcg locking
+
+[locking::atomic]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'atomic'
+groups = nodefault mttcg locking
+
+[locking::wfelock]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'wfelock'
+groups = nodefault mttcg locking
+
+[locking::excl]
+file = locking-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'excl'
+groups = nodefault mttcg locking
+
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [kvm-unit-tests PATCH v10 6/7] arm/barrier-litmus-tests: add simple mp and sal litmus tests
  2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
                   ` (4 preceding siblings ...)
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 5/7] arm/locking-tests: add comprehensive locking test Alex Bennée
@ 2023-03-07 11:28 ` Alex Bennée
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 7/7] arm/tcg-test: some basic TCG exercising tests Alex Bennée
  2023-03-21 15:26 ` [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Andrew Jones
  7 siblings, 0 replies; 18+ messages in thread
From: Alex Bennée @ 2023-03-07 11:28 UTC (permalink / raw)
  To: kvmarm
  Cc: kvm, qemu-devel, Thomas Huth, Andrew Jones, Paolo Bonzini,
	kvmarm, qemu-arm, Alex Bennée, Will Deacon

This adds a framework for adding simple barrier litmus tests against
ARM. The litmus tests aren't as comprehensive as the academic exercises
which will attempt to do all sorts of things to keep racing CPUs synced
up. These tests do honour the "sync" parameter to do a poor-mans
equivalent.

The two litmus tests are:
  - message passing
  - store-after-load

They both have case that should fail (although won't on single-threaded
TCG setups). If barriers aren't working properly the store-after-load
test will fail even on an x86 backend as x86 allows re-ording of non
aliased stores.

I've imported a few more of the barrier primatives from the Linux source
tree so we consistently use macros.

The arm64 barrier primitives trip up on -Wstrict-aliasing so this is
disabled in the Makefile.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
CC: Will Deacon <will@kernel.org>
Message-Id: <20211118184650.661575-9-alex.bennee@linaro.org>

---
v9
  - return to unittests.cfg, drop accel=tcg
  - use compiler.h for barriers instead of defining outselves
  - s/printf/report_info/
v10
  - use compiler WRITE/READ_ONCE macros
---
 arm/Makefile.common       |   1 +
 lib/arm/asm/barrier.h     |  19 ++
 lib/arm64/asm/barrier.h   |  50 +++++
 arm/barrier-litmus-test.c | 450 ++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg         |  31 +++
 5 files changed, 551 insertions(+)
 create mode 100644 arm/barrier-litmus-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 3089e3bf..0a2bdcfc 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -13,6 +13,7 @@ tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/locking-test.flat
+tests-common += $(TEST_DIR)/barrier-litmus-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/lib/arm/asm/barrier.h b/lib/arm/asm/barrier.h
index 7f868314..0f3670b8 100644
--- a/lib/arm/asm/barrier.h
+++ b/lib/arm/asm/barrier.h
@@ -8,6 +8,9 @@
  * This work is licensed under the terms of the GNU GPL, version 2.
  */
 
+#include <stdint.h>
+#include <linux/compiler.h>
+
 #define sev()		asm volatile("sev" : : : "memory")
 #define wfe()		asm volatile("wfe" : : : "memory")
 #define wfi()		asm volatile("wfi" : : : "memory")
@@ -25,4 +28,20 @@
 #define smp_rmb()	smp_mb()
 #define smp_wmb()	dmb(ishst)
 
+extern void abort(void);
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	WRITE_ONCE(*p, v);						\
+} while (0)
+
+
+#define smp_load_acquire(p)						\
+({									\
+	typeof(*p) ___p1 = READ_ONCE(*p);				\
+	smp_mb();							\
+	___p1;								\
+})
+
 #endif /* _ASMARM_BARRIER_H_ */
diff --git a/lib/arm64/asm/barrier.h b/lib/arm64/asm/barrier.h
index 0e1904cf..5e405190 100644
--- a/lib/arm64/asm/barrier.h
+++ b/lib/arm64/asm/barrier.h
@@ -24,4 +24,54 @@
 #define smp_rmb()	dmb(ishld)
 #define smp_wmb()	dmb(ishst)
 
+#define smp_store_release(p, v)						\
+do {									\
+	switch (sizeof(*p)) {						\
+	case 1:								\
+		asm volatile ("stlrb %w1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	case 2:								\
+		asm volatile ("stlrh %w1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	case 4:								\
+		asm volatile ("stlr %w1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	case 8:								\
+		asm volatile ("stlr %1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	}								\
+} while (0)
+
+#define smp_load_acquire(p)						\
+({									\
+	union { typeof(*p) __val; char __c[1]; } __u;			\
+	switch (sizeof(*p)) {						\
+	case 1:								\
+		asm volatile ("ldarb %w0, %1"				\
+			: "=r" (*(u8 *)__u.__c)				\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 2:								\
+		asm volatile ("ldarh %w0, %1"				\
+			: "=r" (*(u16 *)__u.__c)			\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 4:								\
+		asm volatile ("ldar %w0, %1"				\
+			: "=r" (*(u32 *)__u.__c)			\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 8:								\
+		asm volatile ("ldar %0, %1"				\
+			: "=r" (*(u64 *)__u.__c)			\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	}								\
+	__u.__val;							\
+})
+
 #endif /* _ASMARM64_BARRIER_H_ */
diff --git a/arm/barrier-litmus-test.c b/arm/barrier-litmus-test.c
new file mode 100644
index 00000000..5d7e61d1
--- /dev/null
+++ b/arm/barrier-litmus-test.c
@@ -0,0 +1,450 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ARM Barrier Litmus Tests
+ *
+ * This test provides a framework for testing barrier conditions on
+ * the processor. It's simpler than the more involved barrier testing
+ * frameworks as we are looking for simple failures of QEMU's TCG not
+ * weird edge cases the silicon gets wrong.
+ */
+
+#include <libcflat.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+
+#define MAX_CPUS 8
+
+/* Array size and access controls */
+static int array_size = 100000;
+static int wait_if_ahead;
+
+static cpumask_t cpu_mask;
+
+/*
+ * These test_array_* structures are a contiguous array modified by two or more
+ * competing CPUs. The padding is to ensure the variables do not share
+ * cache lines.
+ *
+ * All structures start zeroed.
+ */
+
+typedef struct test_array {
+	volatile unsigned int x;
+	uint8_t dummy[64];
+	volatile unsigned int y;
+	uint8_t dummy2[64];
+	volatile unsigned int r[MAX_CPUS];
+} test_array;
+
+volatile test_array *array;
+
+/* Test definition structure
+ *
+ * The first function will always run on the primary CPU, it is
+ * usually the one that will detect any weirdness and trigger the
+ * failure of the test.
+ */
+
+typedef void (*test_fn)(void);
+
+typedef struct {
+	const char *test_name;
+	bool  should_pass;
+	test_fn main_fn;
+	test_fn secondary_fns[MAX_CPUS-1];
+} test_descr_t;
+
+/* Litmus tests */
+
+static unsigned long sync_start(void)
+{
+	const unsigned long gate_mask = ~0x3ffff;
+	unsigned long gate, now;
+
+	gate = get_cntvct() & gate_mask;
+	do {
+		now = get_cntvct();
+	} while ((now & gate_mask) == gate);
+
+	return now;
+}
+
+/* Simple Message Passing
+ *
+ * x is the message data
+ * y is the flag to indicate the data is ready
+ *
+ * Reading x == 0 when y == 1 is a failure.
+ */
+
+static void message_passing_write(void)
+{
+	int i;
+
+	sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+
+		entry->x = 1;
+		entry->y = 1;
+	}
+
+	halt();
+}
+
+static void message_passing_read(void)
+{
+	int i;
+	int errors = 0, ready = 0;
+
+	sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int x, y;
+
+		y = entry->y;
+		x = entry->x;
+
+		if (y && !x)
+			errors++;
+		ready += y;
+	}
+
+	/*
+	 * We expect this to fail but with STO backends you often get
+	 * away with it. Fudge xfail if we did actually pass.
+	 */
+	report_xfail(errors == 0 ? false : true, errors == 0,
+		     "mp: %d errors, %d ready", errors, ready);
+}
+
+/* Simple Message Passing with barriers */
+static void message_passing_write_barrier(void)
+{
+	int i;
+
+	sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+
+		entry->x = 1;
+		smp_wmb();
+		entry->y = 1;
+	}
+
+	halt();
+}
+
+static void message_passing_read_barrier(void)
+{
+	int i;
+	int errors = 0, ready = 0, not_ready = 0;
+
+	sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int x, y;
+
+		y = entry->y;
+		smp_rmb();
+		x = entry->x;
+
+		if (y && !x)
+			errors++;
+
+		if (y) {
+			ready++;
+		} else {
+			not_ready++;
+
+			if (not_ready > 2) {
+				entry = &array[i+1];
+				do {
+					not_ready = 0;
+				} while (wait_if_ahead && !entry->y);
+			}
+		}
+	}
+
+	report(errors == 0, "mp barrier: %d errors, %d ready", errors, ready);
+}
+
+/* Simple Message Passing with Acquire/Release */
+static void message_passing_write_release(void)
+{
+	int i;
+
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+
+		entry->x = 1;
+		smp_store_release(&entry->y, 1);
+	}
+
+	halt();
+}
+
+static void message_passing_read_acquire(void)
+{
+	int i;
+	int errors = 0, ready = 0, not_ready = 0;
+
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int x, y;
+
+		y = smp_load_acquire(&entry->y);
+		x = entry->x;
+
+		if (y && !x)
+			errors++;
+
+		if (y) {
+			ready++;
+		} else {
+			not_ready++;
+
+			if (not_ready > 2) {
+				entry = &array[i+1];
+				do {
+					not_ready = 0;
+				} while (wait_if_ahead && !entry->y);
+			}
+		}
+	}
+
+	report(errors == 0, "mp acqrel: %d errors, %d ready", errors, ready);
+}
+
+/*
+ * Store after load
+ *
+ * T1: write 1 to x, load r from y
+ * T2: write 1 to y, load r from x
+ *
+ * Without memory fence r[0] && r[1] == 0
+ * With memory fence both == 0 should be impossible
+ */
+
+static void check_store_and_load_results(const char *name, int thread,
+					 bool xfail, unsigned long start,
+					 unsigned long end)
+{
+	int i;
+	int neither = 0;
+	int only_first = 0;
+	int only_second = 0;
+	int both = 0;
+
+	for ( i= 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+
+		if (entry->r[0] == 0 &&
+		    entry->r[1] == 0)
+			neither++;
+		else if (entry->r[0] &&
+			entry->r[1])
+			both++;
+		else if (entry->r[0])
+			only_first++;
+		else
+			only_second++;
+	}
+
+	report_info("T%d: %08lx->%08lx neither=%d only_t1=%d only_t2=%d both=%d\n",
+		    thread, start, end, neither, only_first, only_second, both);
+
+	if (thread == 1)
+		report_xfail(xfail, neither==0, "%s: errors=%d", name, neither);
+
+}
+
+/*
+ * This attempts to synchronise the start of both threads to roughly
+ * the same time. On real hardware there is a little latency as the
+ * secondary vCPUs are powered up however this effect it much more
+ * exaggerated on a TCG host.
+ *
+ * Busy waits until the we pass a future point in time, returns final
+ * start time.
+ */
+
+static void store_and_load_1(void)
+{
+	int i;
+	unsigned long start, end;
+
+	start = sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int r;
+
+		entry->x = 1;
+		r = entry->y;
+		entry->r[0] = r;
+	}
+	end = get_cntvct();
+
+	smp_mb();
+
+	while (!cpumask_test_cpu(1, &cpu_mask))
+		cpu_relax();
+
+	check_store_and_load_results("sal", 1, true, start, end);
+}
+
+static void store_and_load_2(void)
+{
+	int i;
+	unsigned long start, end;
+
+	start = sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int r;
+
+		entry->y = 1;
+		r = entry->x;
+		entry->r[1] = r;
+	}
+	end = get_cntvct();
+
+	check_store_and_load_results("sal", 2, true, start, end);
+
+	cpumask_set_cpu(1, &cpu_mask);
+
+	halt();
+}
+
+static void store_and_load_barrier_1(void)
+{
+	int i;
+	unsigned long start, end;
+
+	start = sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int r;
+
+		entry->x = 1;
+		smp_mb();
+		r = entry->y;
+		entry->r[0] = r;
+	}
+	end = get_cntvct();
+
+	smp_mb();
+
+	while (!cpumask_test_cpu(1, &cpu_mask))
+		cpu_relax();
+
+	check_store_and_load_results("sal_barrier", 1, false, start, end);
+}
+
+static void store_and_load_barrier_2(void)
+{
+	int i;
+	unsigned long start, end;
+
+	start = sync_start();
+	for (i = 0; i < array_size; i++) {
+		volatile test_array *entry = &array[i];
+		unsigned int r;
+
+		entry->y = 1;
+		smp_mb();
+		r = entry->x;
+		entry->r[1] = r;
+	}
+	end = get_cntvct();
+
+	check_store_and_load_results("sal_barrier", 2, false, start, end);
+
+	cpumask_set_cpu(1, &cpu_mask);
+
+	halt();
+}
+
+
+/* Test array */
+static test_descr_t tests[] = {
+
+	{ "mp",         false,
+	  message_passing_read,
+	  { message_passing_write }
+	},
+
+	{ "mp_barrier", true,
+	  message_passing_read_barrier,
+	  { message_passing_write_barrier }
+	},
+
+	{ "mp_acqrel", true,
+	  message_passing_read_acquire,
+	  { message_passing_write_release }
+	},
+
+	{ "sal",       false,
+	  store_and_load_1,
+	  { store_and_load_2 }
+	},
+
+	{ "sal_barrier", true,
+	  store_and_load_barrier_1,
+	  { store_and_load_barrier_2 }
+	},
+};
+
+
+static void setup_and_run_litmus(test_descr_t *test)
+{
+	array = calloc(array_size, sizeof(test_array));
+
+	if (array) {
+		int i = 0;
+
+		report_info("Allocated test array @ %p", array);
+
+		while (test->secondary_fns[i]) {
+			smp_boot_secondary(i+1, test->secondary_fns[i]);
+			i++;
+		}
+
+		test->main_fn();
+	} else
+		report(false, "%s: failed to allocate memory", test->test_name);
+}
+
+int main(int argc, char **argv)
+{
+	int i;
+	unsigned int j;
+	test_descr_t *test = NULL;
+
+	for (i = 0; i < argc; i++) {
+		char *arg = argv[i];
+
+		for (j = 0; j < ARRAY_SIZE(tests); j++) {
+			if (strcmp(arg, tests[j].test_name) == 0)
+				test = &tests[j];
+		}
+
+		/* Test modifiers */
+		if (strstr(arg, "count=") != NULL) {
+			char *p = strstr(arg, "=");
+
+			array_size = atol(p+1);
+		} else if (strcmp(arg, "wait") == 0) {
+			wait_if_ahead = 1;
+		}
+	}
+
+	if (test)
+		setup_and_run_litmus(test);
+	else
+		report(false, "Unknown test");
+
+	return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 45ac61c8..3d73e308 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -330,3 +330,34 @@ smp = $(($MAX_SMP>4?4:$MAX_SMP))
 extra_params = -append 'excl'
 groups = nodefault mttcg locking
 
+# Barrier Litmus tests
+[barrier-litmus::mp]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'mp'
+groups = nodefault mttcg barrier
+
+[barrier-litmus::mp-barrier]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'mp_barrier'
+groups = nodefault mttcg barrier
+
+[barrier-litmus::mp-acqrel]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'mp_acqrel'
+groups = nodefault mttcg barrier
+
+[barrier-litmus::sal]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'sal'
+groups = nodefault mttcg barrier
+
+[barrier-litmus::sal-barrier]
+file = barrier-litmus-test.flat
+smp = 2
+extra_params = -append 'sal_barrier'
+groups = nodefault mttcg barrier
+
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [kvm-unit-tests PATCH v10 7/7] arm/tcg-test: some basic TCG exercising tests
  2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
                   ` (5 preceding siblings ...)
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 6/7] arm/barrier-litmus-tests: add simple mp and sal litmus tests Alex Bennée
@ 2023-03-07 11:28 ` Alex Bennée
  2023-03-21 15:26 ` [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Andrew Jones
  7 siblings, 0 replies; 18+ messages in thread
From: Alex Bennée @ 2023-03-07 11:28 UTC (permalink / raw)
  To: kvmarm
  Cc: kvm, qemu-devel, Thomas Huth, Andrew Jones, Paolo Bonzini,
	kvmarm, qemu-arm, Alex Bennée

These tests are not really aimed at KVM at all but exist to stretch
QEMU's TCG code generator. In particular these exercise the ability of
the TCG to:

  * Chain TranslationBlocks together (tight)
  * Handle heavy usage of the tb_jump_cache (paged)
  * Pathological case of computed local jumps (computed)

In addition the tests can be varied by adding IPI IRQs or SMC sequences
into the mix to stress the tcg_exit and invalidation mechanisms.

To explicitly stress the tb_flush() mechanism you can use the mod/rounds
parameters to force more frequent tb invalidation. Combined with setting
-tb-size 1 in QEMU to limit the code generation buffer size.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20211118184650.661575-11-alex.bennee@linaro.org>

---
v9
  - moved back to unittests.cfg
  - fixed some missing accel tags
  - s/printf/report_info/
  - clean up some comment blocks
---
 arm/Makefile.arm     |   2 +
 arm/Makefile.arm64   |   2 +
 arm/Makefile.common  |   1 +
 arm/tcg-test-asm.S   | 171 ++++++++++++++++++++++
 arm/tcg-test-asm64.S | 170 ++++++++++++++++++++++
 arm/tcg-test.c       | 340 +++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg    |  84 +++++++++++
 7 files changed, 770 insertions(+)
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/tcg-test.c

diff --git a/arm/Makefile.arm b/arm/Makefile.arm
index 01fd4c7b..6af61033 100644
--- a/arm/Makefile.arm
+++ b/arm/Makefile.arm
@@ -37,4 +37,6 @@ tests =
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o $(TEST_DIR)/tcg-test-asm.o
+
 arch_clean: arm_clean
diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index 42e18e77..72da5033 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -35,5 +35,7 @@ tests += $(TEST_DIR)/debug.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o $(TEST_DIR)/tcg-test-asm64.o
+
 arch_clean: arm_clean
 	$(RM) lib/arm64/.*.d
diff --git a/arm/Makefile.common b/arm/Makefile.common
index 0a2bdcfc..0b81fa72 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -14,6 +14,7 @@ tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/locking-test.flat
 tests-common += $(TEST_DIR)/barrier-litmus-test.flat
+tests-common += $(TEST_DIR)/tcg-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/tcg-test-asm.S b/arm/tcg-test-asm.S
new file mode 100644
index 00000000..f58fac08
--- /dev/null
+++ b/arm/tcg-test-asm.S
@@ -0,0 +1,171 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * TCG Test assembler functions for armv7 tests.
+ *
+ * Copyright (C) 2016, Linaro Ltd, Alex Bennée <alex.bennee@linaro.org>
+ *
+ * These helper functions are written in pure asm to control the size
+ * of the basic blocks and ensure they fit neatly into page
+ * aligned chunks. The pattern of branches they follow is determined by
+ * the 32 bit seed they are passed. It should be the same for each set.
+ *
+ * Calling convention
+ *  - r0, iterations
+ *  - r1, jump pattern
+ *  - r2-r3, scratch
+ *
+ * Returns r0
+ */
+
+.arm
+
+.section .text
+
+/*
+ * Tight - all blocks should quickly be patched and should run
+ * very fast unless irqs or smc gets in the way
+ */
+
+.global tight_start
+tight_start:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tightA
+        b       tight_start
+
+tightA:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tightB
+        b       tight_start
+
+tightB:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tight_start
+        b       tightA
+
+.global tight_end
+tight_end:
+        mov     pc, lr
+
+/*
+ * Computed jumps cannot be hardwired into the basic blocks so each one
+ * will either cause an exit for the main execution loop or trigger an
+ * inline look up for the next block.
+ *
+ * There is some caching which should ameliorate the cost a little.
+ */
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+        .global computed_start
+computed_start:
+        subs    r0, r0, #1
+        beq     computed_end
+
+        /* Jump table */
+        ror     r1, r1, #1
+        and     r2, r1, #1
+        adr     r3, computed_jump_table
+        ldr     r2, [r3, r2, lsl #2]
+        mov     pc, r2
+
+        b       computed_err
+
+computed_jump_table:
+        .word   computed_start
+        .word   computedA
+
+computedA:
+        subs    r0, r0, #1
+        beq     computed_end
+
+        /* Jump into code */
+        ror     r1, r1, #1
+        and     r2, r1, #1
+        adr     r3, 1f
+        add	r3, r2, lsl #2
+        mov     pc, r3
+1:      b       computed_start
+        b       computedB
+
+        b       computed_err
+
+
+computedB:
+        subs    r0, r0, #1
+        beq     computed_end
+        ror     r1, r1, #1
+
+        /* Conditional register load */
+        adr     r3, computedA
+        tst     r1, #1
+        adreq   r3, computed_start
+        mov     pc, r3
+
+        b       computed_err
+
+computed_err:
+        mov     r0, #1
+        .global computed_end
+computed_end:
+        mov     pc, lr
+
+
+/*
+ * Page hoping
+ *
+ * Each block is in a different page, hence the blocks never get joined
+ */
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+        .global paged_start
+paged_start:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     pagedA
+        b       paged_start
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+pagedA:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     pagedB
+        b       paged_start
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+pagedB:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     paged_start
+        b       pagedA
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+.global paged_end
+paged_end:
+        mov     pc, lr
+
+.global test_code_end
+test_code_end:
diff --git a/arm/tcg-test-asm64.S b/arm/tcg-test-asm64.S
new file mode 100644
index 00000000..e69a8c72
--- /dev/null
+++ b/arm/tcg-test-asm64.S
@@ -0,0 +1,170 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * TCG Test assembler functions for armv8 tests.
+ *
+ * Copyright (C) 2016, Linaro Ltd, Alex Bennée <alex.bennee@linaro.org>
+ *
+ * These helper functions are written in pure asm to control the size
+ * of the basic blocks and ensure they fit neatly into page
+ * aligned chunks. The pattern of branches they follow is determined by
+ * the 32 bit seed they are passed. It should be the same for each set.
+ *
+ * Calling convention
+ *  - x0, iterations
+ *  - x1, jump pattern
+ *  - x2-x3, scratch
+ *
+ * Returns x0
+ */
+
+.section .text
+
+/*
+ * Tight - all blocks should quickly be patched and should run
+ * very fast unless irqs or smc gets in the way
+ */
+
+.global tight_start
+tight_start:
+        subs    x0, x0, #1
+        beq     tight_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     tightA
+        b       tight_start
+
+tightA:
+        subs    x0, x0, #1
+        beq     tight_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     tightB
+        b       tight_start
+
+tightB:
+        subs    x0, x0, #1
+        beq     tight_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     tight_start
+        b       tightA
+
+.global tight_end
+tight_end:
+        ret
+
+/*
+ * Computed jumps cannot be hardwired into the basic blocks so each one
+ * will either cause an exit for the main execution loop or trigger an
+ * inline look up for the next block.
+ *
+ * There is some caching which should ameliorate the cost a little.
+ */
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+        .global computed_start
+computed_start:
+        subs    x0, x0, #1
+        beq     computed_end
+
+        /* Jump table */
+        ror     x1, x1, #1
+        and     x2, x1, #1
+        adr     x3, computed_jump_table
+        ldr     x2, [x3, x2, lsl #3]
+        br      x2
+
+        b       computed_err
+
+computed_jump_table:
+        .quad   computed_start
+        .quad   computedA
+
+computedA:
+        subs    x0, x0, #1
+        beq     computed_end
+
+        /* Jump into code */
+        ror     x1, x1, #1
+        and     x2, x1, #1
+        adr     x3, 1f
+        add	x3, x3, x2, lsl #2
+        br      x3
+1:      b       computed_start
+        b       computedB
+
+        b       computed_err
+
+
+computedB:
+        subs    x0, x0, #1
+        beq     computed_end
+        ror     x1, x1, #1
+
+        /* Conditional register load */
+        adr     x2, computedA
+        adr     x3, computed_start
+        tst     x1, #1
+        csel    x2, x3, x2, eq
+        br      x2
+
+        b       computed_err
+
+computed_err:
+        mov     x0, #1
+        .global computed_end
+computed_end:
+        ret
+
+
+/*
+ * Page hoping
+ *
+ * Each block is in a different page, hence the blocks never get joined
+ */
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+        .global paged_start
+paged_start:
+        subs    x0, x0, #1
+        beq     paged_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     pagedA
+        b       paged_start
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+pagedA:
+        subs    x0, x0, #1
+        beq     paged_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     pagedB
+        b       paged_start
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+pagedB:
+        subs    x0, x0, #1
+        beq     paged_end
+
+        ror     x1, x1, #1
+        tst     x1, #1
+        beq     paged_start
+        b       pagedA
+
+        /* Align << 13 == 4096 byte alignment */
+        .align 13
+.global paged_end
+paged_end:
+        ret
+
+.global test_code_end
+test_code_end:
diff --git a/arm/tcg-test.c b/arm/tcg-test.c
new file mode 100644
index 00000000..d04d56e4
--- /dev/null
+++ b/arm/tcg-test.c
@@ -0,0 +1,340 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ARM TCG Tests
+ *
+ * These tests are explicitly aimed at stretching the QEMU TCG engine.
+ */
+
+#include <libcflat.h>
+#include <asm/processor.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+#include <asm/gic.h>
+
+#include <prng.h>
+
+#define MAX_CPUS 8
+
+/* These entry points are in the assembly code */
+extern int tight_start(uint32_t count, uint32_t pattern);
+extern int computed_start(uint32_t count, uint32_t pattern);
+extern int paged_start(uint32_t count, uint32_t pattern);
+extern uint32_t tight_end;
+extern uint32_t computed_end;
+extern uint32_t paged_end;
+extern unsigned long test_code_end;
+
+typedef int (*test_fn)(uint32_t count, uint32_t pattern);
+
+typedef struct {
+	const char *test_name;
+	bool       should_pass;
+	test_fn    start_fn;
+	uint32_t   *code_end;
+} test_descr_t;
+
+/* Test array */
+static test_descr_t tests[] = {
+       /*
+	* Tight chain.
+	*
+	* These are a bunch of basic blocks that have fixed branches in
+	* a page aligned space. The branches taken are decided by a
+	* psuedo-random bitmap for each CPU.
+	*
+	* Once the basic blocks have been chained together by the TCG they
+	* should run until they reach their block count. This will be the
+	* most efficient mode in which generated code is run. The only other
+	* exits will be caused by interrupts or TB invalidation.
+	*/
+	{ "tight", true, tight_start, &tight_end },
+	/*
+	 * Computed jumps.
+	 *
+	 * A bunch of basic blocks which just do computed jumps so the basic
+	 * block is never chained but they are all within a page (maybe not
+	 * required). This will exercise the cache lookup but not the new
+	 * generation.
+	 */
+	{ "computed", true, computed_start, &computed_end },
+	/*
+	 * Page ping pong.
+	 *
+	 * Have the blocks are separated by PAGE_SIZE so they can never
+	 * be chained together.
+	 *
+	 */
+	{ "paged", true, paged_start, &paged_end}
+};
+
+static test_descr_t *test;
+
+static int iterations = 1000000;
+static int rounds = 1000;
+static int mod_freq = 5;
+static uint32_t pattern[MAX_CPUS];
+
+/* control flags */
+static int smc;
+static int irq;
+static int check_irq;
+
+/* IRQ accounting */
+#define MAX_IRQ_IDS 16
+static int irqv;
+static unsigned long irq_sent_ts[MAX_CPUS][MAX_CPUS][MAX_IRQ_IDS];
+
+static int irq_recv[MAX_CPUS];
+static int irq_sent[MAX_CPUS];
+static int irq_overlap[MAX_CPUS];  /* if ts > now, i.e a race */
+static int irq_slow[MAX_CPUS];  /* if delay > threshold */
+static unsigned long irq_latency[MAX_CPUS]; /* cumulative time */
+
+static int errors[MAX_CPUS];
+
+static cpumask_t smp_test_complete;
+
+static cpumask_t ready;
+
+static void wait_on_ready(void)
+{
+	cpumask_set_cpu(smp_processor_id(), &ready);
+	while (!cpumask_full(&ready))
+		cpu_relax();
+}
+
+/*
+ * This triggers TCGs SMC detection by writing values to the executing
+ * code pages. We are not actually modifying the instructions and the
+ * underlying code will remain unchanged. However this should trigger
+ * invalidation of the Translation Blocks
+ */
+
+static void trigger_smc_detection(uint32_t *start, uint32_t *end)
+{
+	volatile uint32_t *ptr = start;
+
+	while (ptr < end) {
+		uint32_t inst = *ptr;
+		*ptr++ = inst;
+	}
+}
+
+/* Handler for receiving IRQs */
+
+static void irq_handler(struct pt_regs *regs __unused)
+{
+	unsigned long then, now = get_cntvct();
+	int cpu = smp_processor_id();
+	u32 irqstat = gic_read_iar();
+	u32 irqnr = gic_iar_irqnr(irqstat);
+
+	if (irqnr != GICC_INT_SPURIOUS) {
+		unsigned int src_cpu = (irqstat >> 10) & 0x7;
+
+		gic_write_eoir(irqstat);
+		irq_recv[cpu]++;
+
+		then = irq_sent_ts[src_cpu][cpu][irqnr];
+
+		if (then > now) {
+			irq_overlap[cpu]++;
+		} else {
+			unsigned long latency = (now - then);
+
+			if (latency > 30000)
+				irq_slow[cpu]++;
+			else
+				irq_latency[cpu] += latency;
+		}
+	}
+}
+
+/*
+ * This triggers cross-CPU IRQs. Each IRQ should cause the basic block
+ * execution to finish the main run-loop get entered again.
+ */
+static int send_cross_cpu_irqs(int this_cpu, int irq)
+{
+	int cpu, sent = 0;
+	cpumask_t mask;
+
+	cpumask_copy(&mask, &cpu_present_mask);
+
+	for_each_present_cpu(cpu) {
+		if (cpu != this_cpu) {
+			irq_sent_ts[this_cpu][cpu][irq] = get_cntvct();
+			cpumask_clear_cpu(cpu, &mask);
+			sent++;
+		}
+	}
+
+	gic_ipi_send_mask(irq, &mask);
+
+	return sent;
+}
+
+static void do_test(void)
+{
+	int cpu = smp_processor_id();
+	int i, irq_id = 0;
+
+	report_info("CPU%d: online and setting up with pattern 0x%"PRIx32,
+		    cpu, pattern[cpu]);
+
+	if (irq) {
+		gic_enable_defaults();
+#ifdef __arm__
+		install_exception_handler(EXCPTN_IRQ, irq_handler);
+#else
+		install_irq_handler(EL1H_IRQ, irq_handler);
+#endif
+		local_irq_enable();
+
+		wait_on_ready();
+	}
+
+	for (i = 0; i < rounds; i++) {
+		/* Enter the blocks */
+		errors[cpu] += test->start_fn(iterations, pattern[cpu]);
+
+		if ((i + cpu) % mod_freq == 0) {
+			if (smc)
+				trigger_smc_detection((uint32_t *) test->start_fn,
+						      test->code_end);
+
+			if (irq) {
+				irq_sent[cpu] += send_cross_cpu_irqs(cpu, irq_id);
+				irq_id++;
+				irq_id = irq_id % 15;
+			}
+		}
+	}
+
+	/* ensure everything complete before we finish */
+	smp_wmb();
+
+	cpumask_set_cpu(cpu, &smp_test_complete);
+	if (cpu != 0)
+		halt();
+}
+
+static void report_irq_stats(int cpu)
+{
+	int recv = irq_recv[cpu];
+	int race = irq_overlap[cpu];
+	int slow = irq_slow[cpu];
+
+	unsigned long avg_latency = irq_latency[cpu] / (recv - (race + slow));
+
+	report_info("CPU%d: %d irqs (%d races, %d slow,  %ld ticks avg latency)",
+		    cpu, recv, race, slow, avg_latency);
+}
+
+
+static void setup_and_run_tcg_test(void)
+{
+	static const unsigned char seed[] = "tcg-test";
+	struct isaac_ctx prng_context;
+	int cpu;
+	int total_err = 0, total_sent = 0, total_recv = 0;
+
+	isaac_init(&prng_context, &seed[0], sizeof(seed));
+
+	/* boot other CPUs */
+	for_each_present_cpu(cpu) {
+		pattern[cpu] = isaac_next_uint32(&prng_context);
+
+		if (cpu == 0)
+			continue;
+
+		smp_boot_secondary(cpu, do_test);
+	}
+
+	do_test();
+
+	while (!cpumask_full(&smp_test_complete))
+		cpu_relax();
+
+	/* Ensure everything completes before we check the data */
+	smp_mb();
+
+	/* Now total up errors and irqs */
+	for_each_present_cpu(cpu) {
+		total_err += errors[cpu];
+		total_sent += irq_sent[cpu];
+		total_recv += irq_recv[cpu];
+
+		if (check_irq)
+			report_irq_stats(cpu);
+	}
+
+	if (check_irq)
+		report(total_sent == total_recv && total_err == 0,
+		       "%d IRQs sent, %d received, %d errors\n",
+		       total_sent, total_recv, total_err == 0);
+	else
+		report(total_err == 0, "%d errors, IRQs not checked", total_err);
+}
+
+int main(int argc, char **argv)
+{
+	int i;
+	unsigned int j;
+
+	for (i = 0; i < argc; i++) {
+		char *arg = argv[i];
+
+		for (j = 0; j < ARRAY_SIZE(tests); j++) {
+			if (strcmp(arg, tests[j].test_name) == 0)
+				test = &tests[j];
+		}
+
+		/* Test modifiers */
+		if (strstr(arg, "mod=") != NULL) {
+			char *p = strstr(arg, "=");
+
+			mod_freq = atol(p+1);
+		}
+
+		if (strstr(arg, "rounds=") != NULL) {
+			char *p = strstr(arg, "=");
+
+			rounds = atol(p+1);
+		}
+
+		if (strcmp(arg, "smc") == 0) {
+			unsigned long test_start = (unsigned long) &tight_start;
+			unsigned long test_end = (unsigned long) &test_code_end;
+
+			smc = 1;
+			mmu_set_range_ptes(mmu_idmap, test_start, test_start,
+					   test_end, __pgprot(PTE_WBWA));
+
+			report_prefix_push("smc");
+		}
+
+		if (strcmp(arg, "irq") == 0) {
+			irq = 1;
+			if (!gic_init())
+				report_abort("No supported gic present!");
+			irqv = gic_version();
+			report_prefix_push("irq");
+		}
+
+		if (strcmp(arg, "check_irq") == 0)
+			check_irq = 1;
+	}
+
+	if (test) {
+		/* ensure args visible to all cores */
+		smp_mb();
+		setup_and_run_tcg_test();
+	} else {
+		report(false, "Unknown test");
+	}
+
+	return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 3d73e308..5b46ff5b 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -361,3 +361,87 @@ smp = 2
 extra_params = -append 'sal_barrier'
 groups = nodefault mttcg barrier
 
+# TCG Tests
+[tcg::tight]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'tight'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::tight-smc]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'tight smc' -accel tcg,tb-size=1
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::tight-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'tight irq'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::tight-smc-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'tight smc irq'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::computed]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'computed'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::computed-smc]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'computed smc'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::computed-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'computed irq'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::computed-smc-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'computed smc irq'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::paged]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'paged'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::paged-smc]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'paged smc'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::paged-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'paged irq'
+groups = nodefault mttcg
+accel = tcg
+
+[tcg::paged-smc-irq]
+file = tcg-test.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'paged smc irq'
+groups = nodefault mttcg
+accel = tcg
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 1/7] Makefile: add GNU global tags support
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 1/7] Makefile: add GNU global tags support Alex Bennée
@ 2023-03-21 14:49   ` Andrew Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2023-03-21 14:49 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm, qemu-arm

On Tue, Mar 07, 2023 at 11:28:39AM +0000, Alex Bennée wrote:
> If you have ctags you might as well offer gtags as a target.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Message-Id: <20211118184650.661575-4-alex.bennee@linaro.org>
> 
> ---
> v10
>   - update .gitignore
> ---
>  Makefile   | 5 ++++-
>  .gitignore | 3 +++
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index 6ed5deac..f22179de 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -145,6 +145,9 @@ cscope:
>  		-name '*.[chsS]' -exec realpath --relative-base=$(CURDIR) {} \; | sort -u > ./cscope.files
>  	cscope -bk
>  
> -.PHONY: tags
> +.PHONY: tags gtags
>  tags:
>  	ctags -R
> +
> +gtags:
> +	gtags
> diff --git a/.gitignore b/.gitignore
> index 33529b65..4d5f460f 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -12,6 +12,9 @@ tags
>  patches
>  .stgit-*
>  cscope.*
> +GPATH
> +GRTAGS
> +GTAGS
>  *.swp
>  /lib/asm
>  /lib/config.h
> -- 
> 2.39.2
>

Acked-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 2/7] add .gitpublish metadata
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 2/7] add .gitpublish metadata Alex Bennée
@ 2023-03-21 14:53   ` Andrew Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2023-03-21 14:53 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm, qemu-arm

On Tue, Mar 07, 2023 at 11:28:40AM +0000, Alex Bennée wrote:
> This allows for users of git-publish to have default routing for kvm
> and kvmarm patches.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  .gitpublish | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>  create mode 100644 .gitpublish
> 
> diff --git a/.gitpublish b/.gitpublish
> new file mode 100644
> index 00000000..39130f93
> --- /dev/null
> +++ b/.gitpublish
> @@ -0,0 +1,18 @@
> +#
> +# Common git-publish profiles that can be used to send patches to QEMU upstream.
> +#
> +# See https://github.com/stefanha/git-publish for more information
> +#
> +[gitpublishprofile "default"]
> +base = master
> +to = kvm@vger.kernel.org
> +cc = qemu-devel@nongnu.org
> +cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
> +
> +[gitpublishprofile "arm"]
> +base = master
> +to = kvmarm@lists.cs.columbia.edu
> +cc = kvm@vger.kernel.org
> +cc = qemu-devel@nongnu.org
> +cc = qemu-arm@nongnu.org
> +cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null

Should we also set the prefix for these?

 prefix = kvm-unit-tests PATCH

And maybe even, 'signoff = true'?

Otherwise,

Acked-by: Andrew Jones <andrew.jones@linux.dev>

Thanks,
drew

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution Alex Bennée
@ 2023-03-21 15:02   ` Andrew Jones
  2023-03-21 15:07     ` Andrew Jones
  2023-04-11  8:26     ` Alex Bennée
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Jones @ 2023-03-21 15:02 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm,
	qemu-arm, Mark Rutland

On Tue, Mar 07, 2023 at 11:28:42AM +0000, Alex Bennée wrote:
> This adds a fairly brain dead torture test for TLB flushes intended
> for stressing the MTTCG QEMU build. It takes the usual -smp option for
> multiple CPUs.
> 
> By default it CPU0 will do a TLBIALL flush after each cycle. You can
> pass options via -append to control additional aspects of the test:
> 
>   - "page" flush each page in turn (one per function)
>   - "self" do the flush after each computation cycle
>   - "verbose" report progress on each computation cycle
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> CC: Mark Rutland <mark.rutland@arm.com>
> Message-Id: <20211118184650.661575-7-alex.bennee@linaro.org>
> 
> ---
> v9
>   - move tests back into unittests.cfg (with nodefault mttcg)
>   - replace printf with report_info
>   - drop accel = tcg
> ---
>  arm/Makefile.common |   1 +
>  arm/tlbflush-code.c | 209 ++++++++++++++++++++++++++++++++++++++++++++
>  arm/unittests.cfg   |  25 ++++++
>  3 files changed, 235 insertions(+)
>  create mode 100644 arm/tlbflush-code.c
> 
> diff --git a/arm/Makefile.common b/arm/Makefile.common
> index 16f8c6df..2c4aad38 100644
> --- a/arm/Makefile.common
> +++ b/arm/Makefile.common
> @@ -12,6 +12,7 @@ tests-common += $(TEST_DIR)/gic.flat
>  tests-common += $(TEST_DIR)/psci.flat
>  tests-common += $(TEST_DIR)/sieve.flat
>  tests-common += $(TEST_DIR)/pl031.flat
> +tests-common += $(TEST_DIR)/tlbflush-code.flat
>  
>  tests-all = $(tests-common) $(tests)
>  all: directories $(tests-all)
> diff --git a/arm/tlbflush-code.c b/arm/tlbflush-code.c
> new file mode 100644
> index 00000000..bf9eb111
> --- /dev/null
> +++ b/arm/tlbflush-code.c
> @@ -0,0 +1,209 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * TLB Flush Race Tests
> + *
> + * These tests are designed to test for incorrect TLB flush semantics
> + * under emulation. The initial CPU will set all the others working a
> + * compuation task and will then trigger TLB flushes across the

computation

> + * system. It doesn't actually need to re-map anything but the flushes
> + * themselves will trigger QEMU's TCG self-modifying code detection
> + * which will invalidate any generated  code causing re-translation.
> + * Eventually the code buffer will fill and a general tb_lush() will
> + * be triggered.
> + *
> + * Copyright (C) 2016-2021, Linaro, Alex Bennée <alex.bennee@linaro.org>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2.
> + */
> +
> +#include <libcflat.h>
> +#include <asm/smp.h>
> +#include <asm/cpumask.h>
> +#include <asm/barrier.h>
> +#include <asm/mmu.h>
> +
> +#define SEQ_LENGTH 10
> +#define SEQ_HASH 0x7cd707fe
> +
> +static cpumask_t smp_test_complete;
> +static int flush_count = 1000000;
> +static bool flush_self;
> +static bool flush_page;
> +static bool flush_verbose;
> +
> +/*
> + * Work functions
> + *
> + * These work functions need to be:
> + *
> + *  - page aligned, so we can flush one function at a time
> + *  - have branches, so QEMU TCG generates multiple basic blocks
> + *  - call across pages, so we exercise the TCG basic block slow path
> + */
> +
> +/* Adler32 */
> +__attribute__((aligned(PAGE_SIZE))) static
> +uint32_t hash_array(const void *buf, size_t buflen)
> +{
> +	const uint8_t *data = (uint8_t *) buf;
> +	uint32_t s1 = 1;
> +	uint32_t s2 = 0;
> +
> +	for (size_t n = 0; n < buflen; n++) {
> +		s1 = (s1 + data[n]) % 65521;
> +		s2 = (s2 + s1) % 65521;
> +	}
> +	return (s2 << 16) | s1;
> +}
> +
> +__attribute__((aligned(PAGE_SIZE))) static
> +void create_fib_sequence(int length, unsigned int *array)
> +{
> +	int i;
> +
> +	/* first two values */
> +	array[0] = 0;
> +	array[1] = 1;
> +	for (i = 2; i < length; i++)
> +		array[i] = array[i-2] + array[i-1];
> +}
> +
> +__attribute__((aligned(PAGE_SIZE))) static
> +unsigned long long factorial(unsigned int n)
> +{
> +	unsigned int i;
> +	unsigned long long fac = 1;
> +
> +	for (i = 1; i <= n; i++)
> +		fac = fac * i;
> +	return fac;
> +}
> +
> +__attribute__((aligned(PAGE_SIZE))) static
> +void factorial_array(unsigned int n, unsigned int *input,
> +		     unsigned long long *output)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < n; i++)
> +		output[i] = factorial(input[i]);
> +}
> +
> +__attribute__((aligned(PAGE_SIZE))) static
> +unsigned int do_computation(void)
> +{
> +	unsigned int fib_array[SEQ_LENGTH];
> +	unsigned long long facfib_array[SEQ_LENGTH];
> +	uint32_t fib_hash, facfib_hash;
> +
> +	create_fib_sequence(SEQ_LENGTH, &fib_array[0]);
> +	fib_hash = hash_array(&fib_array[0], sizeof(fib_array));
> +	factorial_array(SEQ_LENGTH, &fib_array[0], &facfib_array[0]);
> +	facfib_hash = hash_array(&facfib_array[0], sizeof(facfib_array));
> +
> +	return (fib_hash ^ facfib_hash);
> +}
> +
> +/* This provides a table of the work functions so we can flush each
> + * page individually
> + */
> +static void *pages[] = {&hash_array, &create_fib_sequence, &factorial,
> +			&factorial_array, &do_computation};
> +
> +static void do_flush(int i)
> +{
> +	if (flush_page)
> +		flush_tlb_page((unsigned long)pages[i % ARRAY_SIZE(pages)]);
> +	else
> +		flush_tlb_all();
> +}
> +
> +
> +static void just_compute(void)
> +{
> +	int i, errors = 0;
> +	int cpu = smp_processor_id();
> +
> +	uint32_t result;
> +
> +	report_info("CPU%d online", cpu);
> +
> +	for (i = 0 ; i < flush_count; i++) {
> +		result = do_computation();
> +
> +		if (result != SEQ_HASH) {
> +			errors++;
> +			report_info("CPU%d: seq%d 0x%"PRIx32"!=0x%x",
> +				    cpu, i, result, SEQ_HASH);
> +		}
> +
> +		if (flush_verbose && (i % 1000) == 0)
> +			report_info("CPU%d: seq%d", cpu, i);
> +
> +		if (flush_self)
> +			do_flush(i);
> +	}
> +
> +	report(errors == 0, "CPU%d: Done - Errors: %d", cpu, errors);
> +
> +	cpumask_set_cpu(cpu, &smp_test_complete);
> +	if (cpu != 0)
> +		halt();
> +}
> +
> +static void just_flush(void)
> +{
> +	int cpu = smp_processor_id();
> +	int i = 0;
> +
> +	/*
> +	 * Set our CPU as done, keep flushing until everyone else
> +	 * finished
> +	 */
> +	cpumask_set_cpu(cpu, &smp_test_complete);
> +
> +	while (!cpumask_full(&smp_test_complete))
> +		do_flush(i++);
> +
> +	report_info("CPU%d: Done - Triggered %d flushes", cpu, i);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	int cpu, i;
> +	char prefix[100];
> +
> +	for (i = 0; i < argc; i++) {
> +		char *arg = argv[i];
> +
> +		if (strcmp(arg, "page") == 0)
> +			flush_page = true;
> +
> +		if (strcmp(arg, "self") == 0)
> +			flush_self = true;
> +
> +		if (strcmp(arg, "verbose") == 0)
> +			flush_verbose = true;
> +	}
> +
> +	snprintf(prefix, sizeof(prefix), "tlbflush_%s_%s",
> +		 flush_page ? "page" : "all",
> +		 flush_self ? "self" : "other");
> +	report_prefix_push(prefix);
> +
> +	for_each_present_cpu(cpu) {
> +		if (cpu == 0)
> +			continue;
> +		smp_boot_secondary(cpu, just_compute);
> +	}
> +
> +	if (flush_self)
> +		just_compute();
> +	else
> +		just_flush();
> +
> +	while (!cpumask_full(&smp_test_complete))
> +		cpu_relax();
> +
> +	return report_summary();
> +}
> diff --git a/arm/unittests.cfg b/arm/unittests.cfg
> index 5e67b558..ee21aef4 100644
> --- a/arm/unittests.cfg
> +++ b/arm/unittests.cfg
> @@ -275,3 +275,28 @@ file = debug.flat
>  arch = arm64
>  extra_params = -append 'ss-migration'
>  groups = debug migration
> +
> +# TLB Torture Tests
> +[tlbflush-code::all_other]

It's better to use '-', '_', '.', or ',' than '::' because otherwise the
standalone test will have a filename like tests/tlbflush-code::all_other
which will be awkward for shells.

BTW, have you tried running these tests as standalone? Since they're
'nodefault' it'd be good if they work that way.

> +file = tlbflush-code.flat
> +smp = $(($MAX_SMP>4?4:$MAX_SMP))
> +groups = nodefault mttcg
> +
> +[tlbflush-code::page_other]
> +file = tlbflush-code.flat
> +smp = $(($MAX_SMP>4?4:$MAX_SMP))
> +extra_params = -append 'page'
> +groups = nodefault mttcg
> +
> +[tlbflush-code::all_self]
> +file = tlbflush-code.flat
> +smp = $(($MAX_SMP>4?4:$MAX_SMP))
> +extra_params = -append 'self'
> +groups = nodefault mttcg
> +
> +[tlbflush-code::page_self]
> +file = tlbflush-code.flat
> +smp = $(($MAX_SMP>4?4:$MAX_SMP))
> +extra_params = -append 'page self'
> +groups = nodefault mttcg
> +
> -- 
> 2.39.2
>

Thanks,
drew

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 5/7] arm/locking-tests: add comprehensive locking test
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 5/7] arm/locking-tests: add comprehensive locking test Alex Bennée
@ 2023-03-21 15:05   ` Andrew Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2023-03-21 15:05 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm, qemu-arm

On Tue, Mar 07, 2023 at 11:28:43AM +0000, Alex Bennée wrote:
> This test has been written mainly to stress multi-threaded TCG behaviour
> but will demonstrate failure by default on real hardware. The test takes
> the following parameters:
> 
>   - "lock" use GCC's locking semantics
>   - "atomic" use GCC's __atomic primitives
>   - "wfelock" use WaitForEvent sleep
>   - "excl" use load/store exclusive semantics
> 
> Also two more options allow the test to be tweaked
> 
>   - "noshuffle" disables the memory shuffling
>   - "count=%ld" set your own per-CPU increment count
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Message-Id: <20211118184650.661575-8-alex.bennee@linaro.org>
> 
> ---
> v9
>   - move back to unittests.cfg, drop accel=tcg
>   - s/printf/report_info
> v10
>   - dropped spare extra line in shuffle_memory
> ---
>  arm/Makefile.common |   2 +-
>  arm/locking-test.c  | 321 ++++++++++++++++++++++++++++++++++++++++++++
>  arm/spinlock-test.c |  87 ------------
>  arm/unittests.cfg   |  30 +++++
>  4 files changed, 352 insertions(+), 88 deletions(-)
>  create mode 100644 arm/locking-test.c
>  delete mode 100644 arm/spinlock-test.c
> 
> diff --git a/arm/Makefile.common b/arm/Makefile.common
> index 2c4aad38..3089e3bf 100644
> --- a/arm/Makefile.common
> +++ b/arm/Makefile.common
> @@ -5,7 +5,6 @@
>  #
>  
>  tests-common  = $(TEST_DIR)/selftest.flat
> -tests-common += $(TEST_DIR)/spinlock-test.flat
>  tests-common += $(TEST_DIR)/pci-test.flat
>  tests-common += $(TEST_DIR)/pmu.flat
>  tests-common += $(TEST_DIR)/gic.flat
> @@ -13,6 +12,7 @@ tests-common += $(TEST_DIR)/psci.flat
>  tests-common += $(TEST_DIR)/sieve.flat
>  tests-common += $(TEST_DIR)/pl031.flat
>  tests-common += $(TEST_DIR)/tlbflush-code.flat
> +tests-common += $(TEST_DIR)/locking-test.flat
>  
>  tests-all = $(tests-common) $(tests)
>  all: directories $(tests-all)
> diff --git a/arm/locking-test.c b/arm/locking-test.c
> new file mode 100644
> index 00000000..a49c2fd1
> --- /dev/null
> +++ b/arm/locking-test.c
> @@ -0,0 +1,321 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Locking Test
> + *
> + * This test allows us to stress the various atomic primitives of a VM
> + * guest. A number of methods are available that use various patterns
> + * to implement a lock.
> + *
> + * Copyright (C) 2017 Linaro
> + * Author: Alex Bennée <alex.bennee@linaro.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <libcflat.h>
> +#include <asm/smp.h>
> +#include <asm/cpumask.h>
> +#include <asm/barrier.h>
> +#include <asm/mmu.h>
> +
> +#include <prng.h>
> +
> +#define MAX_CPUS 8
> +
> +/* Test definition structure
> + *
> + * A simple structure that describes the test name, expected pass and
> + * increment function.
> + */

nit: This and many comment blocks below are missing their opening wings.

Thanks,
drew

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution
  2023-03-21 15:02   ` Andrew Jones
@ 2023-03-21 15:07     ` Andrew Jones
  2023-04-11  8:26     ` Alex Bennée
  1 sibling, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2023-03-21 15:07 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm,
	qemu-arm, Mark Rutland

On Tue, Mar 21, 2023 at 04:02:21PM +0100, Andrew Jones wrote:
...
> > +
> > +# TLB Torture Tests
> > +[tlbflush-code::all_other]
> 
> It's better to use '-', '_', '.', or ',' than '::' because otherwise the
> standalone test will have a filename like tests/tlbflush-code::all_other
> which will be awkward for shells.
> 
> BTW, have you tried running these tests as standalone? Since they're
> 'nodefault' it'd be good if they work that way.
> 
> > +file = tlbflush-code.flat
> > +smp = $(($MAX_SMP>4?4:$MAX_SMP))
> > +groups = nodefault mttcg
> > +
> > +[tlbflush-code::page_other]
> > +file = tlbflush-code.flat
> > +smp = $(($MAX_SMP>4?4:$MAX_SMP))
> > +extra_params = -append 'page'
> > +groups = nodefault mttcg
> > +
> > +[tlbflush-code::all_self]
> > +file = tlbflush-code.flat
> > +smp = $(($MAX_SMP>4?4:$MAX_SMP))
> > +extra_params = -append 'self'
> > +groups = nodefault mttcg
> > +
> > +[tlbflush-code::page_self]
> > +file = tlbflush-code.flat
> > +smp = $(($MAX_SMP>4?4:$MAX_SMP))
> > +extra_params = -append 'page self'
> > +groups = nodefault mttcg

Shouldn't these also be in something like a "tlb" group?

Thanks,
drew

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM
  2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
                   ` (6 preceding siblings ...)
  2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 7/7] arm/tcg-test: some basic TCG exercising tests Alex Bennée
@ 2023-03-21 15:26 ` Andrew Jones
  2023-04-11  7:43   ` Alex Bennée
  7 siblings, 1 reply; 18+ messages in thread
From: Andrew Jones @ 2023-03-21 15:26 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm, qemu-arm

On Tue, Mar 07, 2023 at 11:28:38AM +0000, Alex Bennée wrote:
> I last had a go at getting these up-streamed at the end of 2021 so
> its probably worth having another go. From the last iteration a
> number of the groundwork patches did get merged:
> 
>   Subject: [kvm-unit-tests PATCH v9 0/9] MTTCG sanity tests for ARM
>   Date: Thu,  2 Dec 2021 11:53:43 +0000
>   Message-Id: <20211202115352.951548-1-alex.bennee@linaro.org>
> 
> So this leaves a minor gtags patch, adding the isaac RNG library which
> would also be useful for other users, see:
> 
>   Subject: [kvm-unit-tests PATCH v3 11/27] lib: Add random number generator
>   Date: Tue, 22 Nov 2022 18:11:36 +0200
>   Message-Id: <20221122161152.293072-12-mlevitsk@redhat.com>
> 
> Otherwise there are a few minor checkpatch tweaks.
> 
> I would still like to enable KVM unit tests inside QEMU as things like
> the x86 APIC tests are probably a better fit for unit testing TCG
> emulation than booting a whole OS with various APICs enabled.
> 
> Alex Bennée (7):
>   Makefile: add GNU global tags support
>   add .gitpublish metadata
>   lib: add isaac prng library from CCAN
>   arm/tlbflush-code: TLB flush during code execution
>   arm/locking-tests: add comprehensive locking test
>   arm/barrier-litmus-tests: add simple mp and sal litmus tests
>   arm/tcg-test: some basic TCG exercising tests
> 
>  Makefile                  |   5 +-
>  arm/Makefile.arm          |   2 +
>  arm/Makefile.arm64        |   2 +
>  arm/Makefile.common       |   6 +-
>  lib/arm/asm/barrier.h     |  19 ++
>  lib/arm64/asm/barrier.h   |  50 +++++
>  lib/prng.h                |  83 +++++++
>  lib/prng.c                | 163 ++++++++++++++
>  arm/tcg-test-asm.S        | 171 +++++++++++++++
>  arm/tcg-test-asm64.S      | 170 ++++++++++++++
>  arm/barrier-litmus-test.c | 450 ++++++++++++++++++++++++++++++++++++++
>  arm/locking-test.c        | 321 +++++++++++++++++++++++++++
>  arm/spinlock-test.c       |  87 --------
>  arm/tcg-test.c            | 340 ++++++++++++++++++++++++++++
>  arm/tlbflush-code.c       | 209 ++++++++++++++++++
>  arm/unittests.cfg         | 170 ++++++++++++++
>  .gitignore                |   3 +
>  .gitpublish               |  18 ++
>  18 files changed, 2180 insertions(+), 89 deletions(-)
>  create mode 100644 lib/prng.h
>  create mode 100644 lib/prng.c
>  create mode 100644 arm/tcg-test-asm.S
>  create mode 100644 arm/tcg-test-asm64.S
>  create mode 100644 arm/barrier-litmus-test.c
>  create mode 100644 arm/locking-test.c
>  delete mode 100644 arm/spinlock-test.c
>  create mode 100644 arm/tcg-test.c
>  create mode 100644 arm/tlbflush-code.c
>  create mode 100644 .gitpublish
> 
> -- 
> 2.39.2
>

I don't see any problem with the series, but I didn't review it closely.
I think it's unlikely we'll get reviewers, but, as the tests are
nodefault, then that's probably OK. Can you make sure all tests have a
"tcg" type prefix when they are TCG-only, like the last patch does for
its tests? That will help filter them out when building all tests as
standalone tests. Someday mkstandalone could maybe learn how to build
a directory hierarchy using the group names, e.g.

 tests/mttcg/tlb/all_other

but I don't expect to have time for that myself anytime soon, so prefixes
will likely have to do for now (or forever).

Thanks,
drew

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM
  2023-03-21 15:26 ` [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Andrew Jones
@ 2023-04-11  7:43   ` Alex Bennée
  2023-04-11  9:08     ` Andrew Jones
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-04-11  7:43 UTC (permalink / raw)
  To: Andrew Jones
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm, qemu-arm


Andrew Jones <andrew.jones@linux.dev> writes:

> On Tue, Mar 07, 2023 at 11:28:38AM +0000, Alex Bennée wrote:
>> I last had a go at getting these up-streamed at the end of 2021 so
>> its probably worth having another go. From the last iteration a
>> number of the groundwork patches did get merged:
>> 
>>   Subject: [kvm-unit-tests PATCH v9 0/9] MTTCG sanity tests for ARM
>>   Date: Thu,  2 Dec 2021 11:53:43 +0000
>>   Message-Id: <20211202115352.951548-1-alex.bennee@linaro.org>
>> 
>> So this leaves a minor gtags patch, adding the isaac RNG library which
>> would also be useful for other users, see:
>> 
>>   Subject: [kvm-unit-tests PATCH v3 11/27] lib: Add random number generator
>>   Date: Tue, 22 Nov 2022 18:11:36 +0200
>>   Message-Id: <20221122161152.293072-12-mlevitsk@redhat.com>
>> 
>> Otherwise there are a few minor checkpatch tweaks.
>> 
>> I would still like to enable KVM unit tests inside QEMU as things like
>> the x86 APIC tests are probably a better fit for unit testing TCG
>> emulation than booting a whole OS with various APICs enabled.
>> 
>> Alex Bennée (7):
>>   Makefile: add GNU global tags support
>>   add .gitpublish metadata
>>   lib: add isaac prng library from CCAN
>>   arm/tlbflush-code: TLB flush during code execution
>>   arm/locking-tests: add comprehensive locking test
>>   arm/barrier-litmus-tests: add simple mp and sal litmus tests
>>   arm/tcg-test: some basic TCG exercising tests
>> 
>>  Makefile                  |   5 +-
>>  arm/Makefile.arm          |   2 +
>>  arm/Makefile.arm64        |   2 +
>>  arm/Makefile.common       |   6 +-
>>  lib/arm/asm/barrier.h     |  19 ++
>>  lib/arm64/asm/barrier.h   |  50 +++++
>>  lib/prng.h                |  83 +++++++
>>  lib/prng.c                | 163 ++++++++++++++
>>  arm/tcg-test-asm.S        | 171 +++++++++++++++
>>  arm/tcg-test-asm64.S      | 170 ++++++++++++++
>>  arm/barrier-litmus-test.c | 450 ++++++++++++++++++++++++++++++++++++++
>>  arm/locking-test.c        | 321 +++++++++++++++++++++++++++
>>  arm/spinlock-test.c       |  87 --------
>>  arm/tcg-test.c            | 340 ++++++++++++++++++++++++++++
>>  arm/tlbflush-code.c       | 209 ++++++++++++++++++
>>  arm/unittests.cfg         | 170 ++++++++++++++
>>  .gitignore                |   3 +
>>  .gitpublish               |  18 ++
>>  18 files changed, 2180 insertions(+), 89 deletions(-)
>>  create mode 100644 lib/prng.h
>>  create mode 100644 lib/prng.c
>>  create mode 100644 arm/tcg-test-asm.S
>>  create mode 100644 arm/tcg-test-asm64.S
>>  create mode 100644 arm/barrier-litmus-test.c
>>  create mode 100644 arm/locking-test.c
>>  delete mode 100644 arm/spinlock-test.c
>>  create mode 100644 arm/tcg-test.c
>>  create mode 100644 arm/tlbflush-code.c
>>  create mode 100644 .gitpublish
>> 
>> -- 
>> 2.39.2
>>
>
> I don't see any problem with the series, but I didn't review it closely.
> I think it's unlikely we'll get reviewers, but, as the tests are
> nodefault, then that's probably OK. Can you make sure all tests have a
> "tcg" type prefix when they are TCG-only, like the last patch does for
> its tests? That will help filter them out when building all tests as
> standalone tests.

tcg-tests is the only test that explicitly targets TCG behaviour (in the
choice of loops and SMC). The other tests are architecture validation
tests which should just be easy passes on real silicon.

> Someday mkstandalone could maybe learn how to build
> a directory hierarchy using the group names, e.g.
>
>  tests/mttcg/tlb/all_other

So nodefault isn't enough for this?

>
> but I don't expect to have time for that myself anytime soon, so prefixes
> will likely have to do for now (or forever).
>
> Thanks,
> drew


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution
  2023-03-21 15:02   ` Andrew Jones
  2023-03-21 15:07     ` Andrew Jones
@ 2023-04-11  8:26     ` Alex Bennée
  2023-04-11  8:58       ` Andrew Jones
  1 sibling, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-04-11  8:26 UTC (permalink / raw)
  To: Andrew Jones
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm,
	qemu-arm, Mark Rutland


Andrew Jones <andrew.jones@linux.dev> writes:

> On Tue, Mar 07, 2023 at 11:28:42AM +0000, Alex Bennée wrote:
>> This adds a fairly brain dead torture test for TLB flushes intended
>> for stressing the MTTCG QEMU build. It takes the usual -smp option for
>> multiple CPUs.
>> 
<snip>
>
> BTW, have you tried running these tests as standalone? Since they're
> 'nodefault' it'd be good if they work that way.

It works but I couldn't get it to skip pass the nodefault check
automaticaly:

  env run_all_tests=1 QEMU=$HOME/lsrc/qemu.git/builds/arm.all/qemu-system-aarch64 ./tests/tcg.computed 
  BUILD_HEAD=c9cf6e90
  Test marked not to be run by default, are you sure (y/N)?

>
>> +file = tlbflush-code.flat
>> +smp = $(($MAX_SMP>4?4:$MAX_SMP))
>> +groups = nodefault mttcg
>> +
>> +[tlbflush-code::page_other]
>> +file = tlbflush-code.flat
>> +smp = $(($MAX_SMP>4?4:$MAX_SMP))
>> +extra_params = -append 'page'
>> +groups = nodefault mttcg
>> +
>> +[tlbflush-code::all_self]
>> +file = tlbflush-code.flat
>> +smp = $(($MAX_SMP>4?4:$MAX_SMP))
>> +extra_params = -append 'self'
>> +groups = nodefault mttcg
>> +
>> +[tlbflush-code::page_self]
>> +file = tlbflush-code.flat
>> +smp = $(($MAX_SMP>4?4:$MAX_SMP))
>> +extra_params = -append 'page self'
>> +groups = nodefault mttcg
>> +
>> -- 
>> 2.39.2
>>
>
> Thanks,
> drew


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution
  2023-04-11  8:26     ` Alex Bennée
@ 2023-04-11  8:58       ` Andrew Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2023-04-11  8:58 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm,
	qemu-arm, Mark Rutland

On Tue, Apr 11, 2023 at 09:26:56AM +0100, Alex Bennée wrote:
> 
> Andrew Jones <andrew.jones@linux.dev> writes:
> 
> > On Tue, Mar 07, 2023 at 11:28:42AM +0000, Alex Bennée wrote:
> >> This adds a fairly brain dead torture test for TLB flushes intended
> >> for stressing the MTTCG QEMU build. It takes the usual -smp option for
> >> multiple CPUs.
> >> 
> <snip>
> >
> > BTW, have you tried running these tests as standalone? Since they're
> > 'nodefault' it'd be good if they work that way.
> 
> It works but I couldn't get it to skip pass the nodefault check
> automaticaly:
> 
>   env run_all_tests=1 QEMU=$HOME/lsrc/qemu.git/builds/arm.all/qemu-system-aarch64 ./tests/tcg.computed 
>   BUILD_HEAD=c9cf6e90
>   Test marked not to be run by default, are you sure (y/N)?
>

I think

 $ yes | tests/some-nodefault-test

should work.

Thanks,
drew

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM
  2023-04-11  7:43   ` Alex Bennée
@ 2023-04-11  9:08     ` Andrew Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2023-04-11  9:08 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvmarm, kvm, qemu-devel, Thomas Huth, Paolo Bonzini, kvmarm, qemu-arm

On Tue, Apr 11, 2023 at 08:43:49AM +0100, Alex Bennée wrote:
> 
> Andrew Jones <andrew.jones@linux.dev> writes:
...
> > Someday mkstandalone could maybe learn how to build
> > a directory hierarchy using the group names, e.g.
> >
> >  tests/mttcg/tlb/all_other
> 
> So nodefault isn't enough for this?
>

nodefault is enough to avoid running a test with run_tests.sh,
when its group hasn't been explicitly selected, i.e.

 ./run_tests.sh

doesn't run the test

 ./run_tests.sh -g test-group-name

does run the test.

standalone test filtering is only done by filename (but
potentially pathname), which is why I suggested we someday use
the group names as directory names. Anyway, that's future work.
This series just needs to ensure it gets its group names right.

Thanks,
drew

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-04-11  9:08 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-07 11:28 [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Alex Bennée
2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 1/7] Makefile: add GNU global tags support Alex Bennée
2023-03-21 14:49   ` Andrew Jones
2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 2/7] add .gitpublish metadata Alex Bennée
2023-03-21 14:53   ` Andrew Jones
2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 3/7] lib: add isaac prng library from CCAN Alex Bennée
2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 4/7] arm/tlbflush-code: TLB flush during code execution Alex Bennée
2023-03-21 15:02   ` Andrew Jones
2023-03-21 15:07     ` Andrew Jones
2023-04-11  8:26     ` Alex Bennée
2023-04-11  8:58       ` Andrew Jones
2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 5/7] arm/locking-tests: add comprehensive locking test Alex Bennée
2023-03-21 15:05   ` Andrew Jones
2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 6/7] arm/barrier-litmus-tests: add simple mp and sal litmus tests Alex Bennée
2023-03-07 11:28 ` [kvm-unit-tests PATCH v10 7/7] arm/tcg-test: some basic TCG exercising tests Alex Bennée
2023-03-21 15:26 ` [kvm-unit-tests PATCH v10 0/7] MTTCG sanity tests for ARM Andrew Jones
2023-04-11  7:43   ` Alex Bennée
2023-04-11  9:08     ` Andrew Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).