From: Mark Brown <broonie@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
Shuah Khan <skhan@linuxfoundation.org>,
Shuah Khan <shuah@kernel.org>
Cc: Alan Hayward <alan.hayward@arm.com>,
Luis Machado <luis.machado@arm.com>,
Salil Akerkar <Salil.Akerkar@arm.com>,
Basant Kumar Dwivedi <Basant.KumarDwivedi@arm.com>,
Szabolcs Nagy <szabolcs.nagy@arm.com>,
James Morse <james.morse@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
Mark Brown <broonie@kernel.org>
Subject: [PATCH v10 35/39] kselftest/arm64: Add stress test for SME ZA context switching
Date: Wed, 26 Jan 2022 15:27:45 +0000 [thread overview]
Message-ID: <20220126152749.233712-36-broonie@kernel.org> (raw)
In-Reply-To: <20220126152749.233712-1-broonie@kernel.org>
Add a stress test for context switching of the ZA register state based on
the similar tests Dave Martin wrote for FPSIMD and SVE registers. The test
loops indefinitely writing a data pattern to ZA then reading it back and
verifying that it's what was expected.
Unlike the other tests we manually assemble the SME instructions since at
present no released toolchain has SME support integrated.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
tools/testing/selftests/arm64/fp/.gitignore | 1 +
tools/testing/selftests/arm64/fp/Makefile | 3 +
tools/testing/selftests/arm64/fp/za-stress | 59 +++
tools/testing/selftests/arm64/fp/za-test.S | 426 ++++++++++++++++++++
4 files changed, 489 insertions(+)
create mode 100644 tools/testing/selftests/arm64/fp/za-stress
create mode 100644 tools/testing/selftests/arm64/fp/za-test.S
diff --git a/tools/testing/selftests/arm64/fp/.gitignore b/tools/testing/selftests/arm64/fp/.gitignore
index 5729a5b1adfc..ead3197e720b 100644
--- a/tools/testing/selftests/arm64/fp/.gitignore
+++ b/tools/testing/selftests/arm64/fp/.gitignore
@@ -8,3 +8,4 @@ sve-test
ssve-test
vec-syscfg
vlset
+za-test
diff --git a/tools/testing/selftests/arm64/fp/Makefile b/tools/testing/selftests/arm64/fp/Makefile
index e6643c9b0474..38d2d0d5a0eb 100644
--- a/tools/testing/selftests/arm64/fp/Makefile
+++ b/tools/testing/selftests/arm64/fp/Makefile
@@ -6,6 +6,7 @@ TEST_PROGS_EXTENDED := fp-pidbench fpsimd-test fpsimd-stress \
rdvl-sme rdvl-sve \
sve-test sve-stress \
ssve-test ssve-stress \
+ za-test za-stress \
vlset
all: $(TEST_GEN_PROGS) $(TEST_PROGS_EXTENDED)
@@ -24,5 +25,7 @@ ssve-test: sve-test.S asm-utils.o
$(CC) -DSSVE -nostdlib $^ -o $@
vec-syscfg: vec-syscfg.o rdvl.o
vlset: vlset.o
+za-test: za-test.o asm-utils.o
+ $(CC) -nostdlib $^ -o $@
include ../../lib.mk
diff --git a/tools/testing/selftests/arm64/fp/za-stress b/tools/testing/selftests/arm64/fp/za-stress
new file mode 100644
index 000000000000..5ac386b55b95
--- /dev/null
+++ b/tools/testing/selftests/arm64/fp/za-stress
@@ -0,0 +1,59 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+# Copyright (C) 2015-2019 ARM Limited.
+# Original author: Dave Martin <Dave.Martin@arm.com>
+
+set -ue
+
+NR_CPUS=`nproc`
+
+pids=
+logs=
+
+cleanup () {
+ trap - INT TERM CHLD
+ set +e
+
+ if [ -n "$pids" ]; then
+ kill $pids
+ wait $pids
+ pids=
+ fi
+
+ if [ -n "$logs" ]; then
+ cat $logs
+ rm $logs
+ logs=
+ fi
+}
+
+interrupt () {
+ cleanup
+ exit 0
+}
+
+child_died () {
+ cleanup
+ exit 1
+}
+
+trap interrupt INT TERM EXIT
+
+for x in `seq 0 $((NR_CPUS * 4))`; do
+ log=`mktemp`
+ logs=$logs\ $log
+ ./za-test >$log &
+ pids=$pids\ $!
+done
+
+# Wait for all child processes to be created:
+sleep 10
+
+while :; do
+ kill -USR1 $pids
+done &
+pids=$pids\ $!
+
+wait
+
+exit 1
diff --git a/tools/testing/selftests/arm64/fp/za-test.S b/tools/testing/selftests/arm64/fp/za-test.S
new file mode 100644
index 000000000000..99e1bb3b6b18
--- /dev/null
+++ b/tools/testing/selftests/arm64/fp/za-test.S
@@ -0,0 +1,426 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2021 ARM Limited.
+// Original author: Mark Brown <broonie@kernel.org>
+//
+// Scalable Matrix Extension ZA context switch test
+// Repeatedly writes unique test patterns into each ZA tile
+// and reads them back to verify integrity.
+//
+// for x in `seq 1 NR_CPUS`; do sve-test & pids=$pids\ $! ; done
+// (leave it running for as long as you want...)
+// kill $pids
+
+#include <asm/unistd.h>
+#include "assembler.h"
+#include "asm-offsets.h"
+
+.arch_extension sve
+
+#define MAXVL 2048
+#define MAXVL_B (MAXVL / 8)
+
+/*
+ * LDR (vector to ZA array):
+ * LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ */
+.macro _ldr_za nw, nxbase, offset=0
+ .inst 0xe1000000 \
+ | (((\nw) & 3) << 13) \
+ | ((\nxbase) << 5) \
+ | ((\offset) & 7)
+.endm
+
+/*
+ * STR (vector from ZA array):
+ * STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ */
+.macro _str_za nw, nxbase, offset=0
+ .inst 0xe1200000 \
+ | (((\nw) & 3) << 13) \
+ | ((\nxbase) << 5) \
+ | ((\offset) & 7)
+.endm
+
+/*
+ * RDSVL X\nx, #\imm
+ */
+.macro rdsvl nx, imm
+ .inst 0x4bf5800 \
+ | (\imm << 5) \
+ | (\nx)
+.endm
+
+.macro smstop
+ msr S0_3_C4_C6_3, xzr
+.endm
+
+.macro smstart_za
+ msr S0_3_C4_C5_3, xzr
+.endm
+
+// Declare some storage space to shadow ZA register contents and a
+// scratch buffer for a vector.
+.pushsection .text
+.data
+.align 4
+zaref:
+ .space MAXVL_B * MAXVL_B
+scratch:
+ .space MAXVL_B
+.popsection
+
+// Trivial memory copy: copy x2 bytes, starting at address x1, to address x0.
+// Clobbers x0-x3
+function memcpy
+ cmp x2, #0
+ b.eq 1f
+0: ldrb w3, [x1], #1
+ strb w3, [x0], #1
+ subs x2, x2, #1
+ b.ne 0b
+1: ret
+endfunction
+
+// Generate a test pattern for storage in ZA
+// x0: pid
+// x1: row in ZA
+// x2: generation
+
+// These values are used to constuct a 32-bit pattern that is repeated in the
+// scratch buffer as many times as will fit:
+// bits 31:28 generation number (increments once per test_loop)
+// bits 27:16 pid
+// bits 15: 8 row number
+// bits 7: 0 32-bit lane index
+
+function pattern
+ mov w3, wzr
+ bfi w3, w0, #16, #12 // PID
+ bfi w3, w1, #8, #8 // Row
+ bfi w3, w2, #28, #4 // Generation
+
+ ldr x0, =scratch
+ mov w1, #MAXVL_B / 4
+
+0: str w3, [x0], #4
+ add w3, w3, #1 // Lane
+ subs w1, w1, #1
+ b.ne 0b
+
+ ret
+endfunction
+
+// Get the address of shadow data for ZA horizontal vector xn
+.macro _adrza xd, xn, nrtmp
+ ldr \xd, =zaref
+ rdsvl \nrtmp, 1
+ madd \xd, x\nrtmp, \xn, \xd
+.endm
+
+// Set up test pattern in a ZA horizontal vector
+// x0: pid
+// x1: row number
+// x2: generation
+function setup_za
+ mov x4, x30
+ mov x12, x1 // Use x12 for vector select
+
+ bl pattern // Get pattern in scratch buffer
+ _adrza x0, x12, 2 // Shadow buffer pointer to x0 and x5
+ mov x5, x0
+ ldr x1, =scratch
+ bl memcpy // length set up in x2 by _adrza
+
+ _ldr_za 12, 5 // load vector w12 from pointer x5
+
+ ret x4
+endfunction
+
+// Trivial memory compare: compare x2 bytes starting at address x0 with
+// bytes starting at address x1.
+// Returns only if all bytes match; otherwise, the program is aborted.
+// Clobbers x0-x5.
+function memcmp
+ cbz x2, 2f
+
+ stp x0, x1, [sp, #-0x20]!
+ str x2, [sp, #0x10]
+
+ mov x5, #0
+0: ldrb w3, [x0, x5]
+ ldrb w4, [x1, x5]
+ add x5, x5, #1
+ cmp w3, w4
+ b.ne 1f
+ subs x2, x2, #1
+ b.ne 0b
+
+1: ldr x2, [sp, #0x10]
+ ldp x0, x1, [sp], #0x20
+ b.ne barf
+
+2: ret
+endfunction
+
+// Verify that a ZA vector matches its shadow in memory, else abort
+// x0: row number
+// Clobbers x0-x7 and x12.
+function check_za
+ mov x3, x30
+
+ mov x12, x0
+ _adrza x5, x0, 6 // pointer to expected value in x5
+ mov x4, x0
+ ldr x7, =scratch // x7 is scratch
+
+ mov x0, x7 // Poison scratch
+ mov x1, x6
+ bl memfill_ae
+
+ _str_za 12, 7 // save vector w12 to pointer x7
+
+ mov x0, x5
+ mov x1, x7
+ mov x2, x6
+ mov x30, x3
+ b memcmp
+endfunction
+
+// Any SME register modified here can cause corruption in the main
+// thread -- but *only* the locations modified here.
+function irritator_handler
+ // Increment the irritation signal count (x23):
+ ldr x0, [x2, #ucontext_regs + 8 * 23]
+ add x0, x0, #1
+ str x0, [x2, #ucontext_regs + 8 * 23]
+
+ // Corrupt some random ZA data
+#if 0
+ adr x0, .text + (irritator_handler - .text) / 16 * 16
+ movi v0.8b, #1
+ movi v9.16b, #2
+ movi v31.8b, #3
+#endif
+
+ ret
+endfunction
+
+function terminate_handler
+ mov w21, w0
+ mov x20, x2
+
+ puts "Terminated by signal "
+ mov w0, w21
+ bl putdec
+ puts ", no error, iterations="
+ ldr x0, [x20, #ucontext_regs + 8 * 22]
+ bl putdec
+ puts ", signals="
+ ldr x0, [x20, #ucontext_regs + 8 * 23]
+ bl putdecn
+
+ mov x0, #0
+ mov x8, #__NR_exit
+ svc #0
+endfunction
+
+// w0: signal number
+// x1: sa_action
+// w2: sa_flags
+// Clobbers x0-x6,x8
+function setsignal
+ str x30, [sp, #-((sa_sz + 15) / 16 * 16 + 16)]!
+
+ mov w4, w0
+ mov x5, x1
+ mov w6, w2
+
+ add x0, sp, #16
+ mov x1, #sa_sz
+ bl memclr
+
+ mov w0, w4
+ add x1, sp, #16
+ str w6, [x1, #sa_flags]
+ str x5, [x1, #sa_handler]
+ mov x2, #0
+ mov x3, #sa_mask_sz
+ mov x8, #__NR_rt_sigaction
+ svc #0
+
+ cbz w0, 1f
+
+ puts "sigaction failure\n"
+ b .Labort
+
+1: ldr x30, [sp], #((sa_sz + 15) / 16 * 16 + 16)
+ ret
+endfunction
+
+// Main program entry point
+.globl _start
+function _start
+_start:
+ puts "Streaming mode "
+ smstart_za
+
+ // Sanity-check and report the vector length
+
+ rdsvl 19, 8
+ cmp x19, #128
+ b.lo 1f
+ cmp x19, #2048
+ b.hi 1f
+ tst x19, #(8 - 1)
+ b.eq 2f
+
+1: puts "bad vector length: "
+ mov x0, x19
+ bl putdecn
+ b .Labort
+
+2: puts "vector length:\t"
+ mov x0, x19
+ bl putdec
+ puts " bits\n"
+
+ // Obtain our PID, to ensure test pattern uniqueness between processes
+ mov x8, #__NR_getpid
+ svc #0
+ mov x20, x0
+
+ puts "PID:\t"
+ mov x0, x20
+ bl putdecn
+
+ mov x23, #0 // Irritation signal count
+
+ mov w0, #SIGINT
+ adr x1, terminate_handler
+ mov w2, #SA_SIGINFO
+ bl setsignal
+
+ mov w0, #SIGTERM
+ adr x1, terminate_handler
+ mov w2, #SA_SIGINFO
+ bl setsignal
+
+ mov w0, #SIGUSR1
+ adr x1, irritator_handler
+ mov w2, #SA_SIGINFO
+ orr w2, w2, #SA_NODEFER
+ bl setsignal
+
+ mov x22, #0 // generation number, increments per iteration
+.Ltest_loop:
+ rdsvl 0, 8
+ cmp x0, x19
+ b.ne vl_barf
+
+ rdsvl 21, 1 // Set up ZA & shadow with test pattern
+0: mov x0, x20
+ sub x1, x21, #1
+ mov x2, x22
+ bl setup_za
+ subs x21, x21, #1
+ b.ne 0b
+
+ and x8, x22, #127 // Every 128 interations...
+ cbz x8, 0f
+ mov x8, #__NR_getpid // (otherwise minimal syscall)
+ b 1f
+0:
+ mov x8, #__NR_sched_yield // ...encourage preemption
+1:
+ svc #0
+
+ mrs x0, S3_3_C4_C2_2 // SVCR should have ZA=1,SM=0
+ and x1, x0, #3
+ cmp x1, #2
+ b.ne svcr_barf
+
+ rdsvl 21, 1 // Verify that the data made it through
+ rdsvl 24, 1 // Verify that the data made it through
+0: sub x0, x24, x21
+ bl check_za
+ subs x21, x21, #1
+ bne 0b
+
+ add x22, x22, #1 // Everything still working
+ b .Ltest_loop
+
+.Labort:
+ mov x0, #0
+ mov x1, #SIGABRT
+ mov x8, #__NR_kill
+ svc #0
+endfunction
+
+function barf
+// fpsimd.c acitivty log dump hack
+// ldr w0, =0xdeadc0de
+// mov w8, #__NR_exit
+// svc #0
+// end hack
+ smstop
+ mov x10, x0 // expected data
+ mov x11, x1 // actual data
+ mov x12, x2 // data size
+
+ puts "Mismatch: PID="
+ mov x0, x20
+ bl putdec
+ puts ", iteration="
+ mov x0, x22
+ bl putdec
+ puts ", row="
+ mov x0, x21
+ bl putdecn
+ puts "\tExpected ["
+ mov x0, x10
+ mov x1, x12
+ bl dumphex
+ puts "]\n\tGot ["
+ mov x0, x11
+ mov x1, x12
+ bl dumphex
+ puts "]\n"
+
+ mov x8, #__NR_getpid
+ svc #0
+// fpsimd.c acitivty log dump hack
+// ldr w0, =0xdeadc0de
+// mov w8, #__NR_exit
+// svc #0
+// ^ end of hack
+ mov x1, #SIGABRT
+ mov x8, #__NR_kill
+ svc #0
+// mov x8, #__NR_exit
+// mov x1, #1
+// svc #0
+endfunction
+
+function vl_barf
+ mov x10, x0
+
+ puts "Bad active VL: "
+ mov x0, x10
+ bl putdecn
+
+ mov x8, #__NR_exit
+ mov x1, #1
+ svc #0
+endfunction
+
+function svcr_barf
+ mov x10, x0
+
+ puts "Bad SVCR: "
+ mov x0, x10
+ bl putdecn
+
+ mov x8, #__NR_exit
+ mov x1, #1
+ svc #0
+endfunction
--
2.30.2
WARNING: multiple messages have this Message-ID (diff)
From: Mark Brown <broonie@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
Shuah Khan <skhan@linuxfoundation.org>,
Shuah Khan <shuah@kernel.org>
Cc: Basant Kumar Dwivedi <Basant.KumarDwivedi@arm.com>,
Luis Machado <luis.machado@arm.com>,
Szabolcs Nagy <szabolcs.nagy@arm.com>,
Mark Brown <broonie@kernel.org>,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org,
Alan Hayward <alan.hayward@arm.com>,
kvmarm@lists.cs.columbia.edu,
Salil Akerkar <Salil.Akerkar@arm.com>
Subject: [PATCH v10 35/39] kselftest/arm64: Add stress test for SME ZA context switching
Date: Wed, 26 Jan 2022 15:27:45 +0000 [thread overview]
Message-ID: <20220126152749.233712-36-broonie@kernel.org> (raw)
In-Reply-To: <20220126152749.233712-1-broonie@kernel.org>
Add a stress test for context switching of the ZA register state based on
the similar tests Dave Martin wrote for FPSIMD and SVE registers. The test
loops indefinitely writing a data pattern to ZA then reading it back and
verifying that it's what was expected.
Unlike the other tests we manually assemble the SME instructions since at
present no released toolchain has SME support integrated.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
tools/testing/selftests/arm64/fp/.gitignore | 1 +
tools/testing/selftests/arm64/fp/Makefile | 3 +
tools/testing/selftests/arm64/fp/za-stress | 59 +++
tools/testing/selftests/arm64/fp/za-test.S | 426 ++++++++++++++++++++
4 files changed, 489 insertions(+)
create mode 100644 tools/testing/selftests/arm64/fp/za-stress
create mode 100644 tools/testing/selftests/arm64/fp/za-test.S
diff --git a/tools/testing/selftests/arm64/fp/.gitignore b/tools/testing/selftests/arm64/fp/.gitignore
index 5729a5b1adfc..ead3197e720b 100644
--- a/tools/testing/selftests/arm64/fp/.gitignore
+++ b/tools/testing/selftests/arm64/fp/.gitignore
@@ -8,3 +8,4 @@ sve-test
ssve-test
vec-syscfg
vlset
+za-test
diff --git a/tools/testing/selftests/arm64/fp/Makefile b/tools/testing/selftests/arm64/fp/Makefile
index e6643c9b0474..38d2d0d5a0eb 100644
--- a/tools/testing/selftests/arm64/fp/Makefile
+++ b/tools/testing/selftests/arm64/fp/Makefile
@@ -6,6 +6,7 @@ TEST_PROGS_EXTENDED := fp-pidbench fpsimd-test fpsimd-stress \
rdvl-sme rdvl-sve \
sve-test sve-stress \
ssve-test ssve-stress \
+ za-test za-stress \
vlset
all: $(TEST_GEN_PROGS) $(TEST_PROGS_EXTENDED)
@@ -24,5 +25,7 @@ ssve-test: sve-test.S asm-utils.o
$(CC) -DSSVE -nostdlib $^ -o $@
vec-syscfg: vec-syscfg.o rdvl.o
vlset: vlset.o
+za-test: za-test.o asm-utils.o
+ $(CC) -nostdlib $^ -o $@
include ../../lib.mk
diff --git a/tools/testing/selftests/arm64/fp/za-stress b/tools/testing/selftests/arm64/fp/za-stress
new file mode 100644
index 000000000000..5ac386b55b95
--- /dev/null
+++ b/tools/testing/selftests/arm64/fp/za-stress
@@ -0,0 +1,59 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+# Copyright (C) 2015-2019 ARM Limited.
+# Original author: Dave Martin <Dave.Martin@arm.com>
+
+set -ue
+
+NR_CPUS=`nproc`
+
+pids=
+logs=
+
+cleanup () {
+ trap - INT TERM CHLD
+ set +e
+
+ if [ -n "$pids" ]; then
+ kill $pids
+ wait $pids
+ pids=
+ fi
+
+ if [ -n "$logs" ]; then
+ cat $logs
+ rm $logs
+ logs=
+ fi
+}
+
+interrupt () {
+ cleanup
+ exit 0
+}
+
+child_died () {
+ cleanup
+ exit 1
+}
+
+trap interrupt INT TERM EXIT
+
+for x in `seq 0 $((NR_CPUS * 4))`; do
+ log=`mktemp`
+ logs=$logs\ $log
+ ./za-test >$log &
+ pids=$pids\ $!
+done
+
+# Wait for all child processes to be created:
+sleep 10
+
+while :; do
+ kill -USR1 $pids
+done &
+pids=$pids\ $!
+
+wait
+
+exit 1
diff --git a/tools/testing/selftests/arm64/fp/za-test.S b/tools/testing/selftests/arm64/fp/za-test.S
new file mode 100644
index 000000000000..99e1bb3b6b18
--- /dev/null
+++ b/tools/testing/selftests/arm64/fp/za-test.S
@@ -0,0 +1,426 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2021 ARM Limited.
+// Original author: Mark Brown <broonie@kernel.org>
+//
+// Scalable Matrix Extension ZA context switch test
+// Repeatedly writes unique test patterns into each ZA tile
+// and reads them back to verify integrity.
+//
+// for x in `seq 1 NR_CPUS`; do sve-test & pids=$pids\ $! ; done
+// (leave it running for as long as you want...)
+// kill $pids
+
+#include <asm/unistd.h>
+#include "assembler.h"
+#include "asm-offsets.h"
+
+.arch_extension sve
+
+#define MAXVL 2048
+#define MAXVL_B (MAXVL / 8)
+
+/*
+ * LDR (vector to ZA array):
+ * LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ */
+.macro _ldr_za nw, nxbase, offset=0
+ .inst 0xe1000000 \
+ | (((\nw) & 3) << 13) \
+ | ((\nxbase) << 5) \
+ | ((\offset) & 7)
+.endm
+
+/*
+ * STR (vector from ZA array):
+ * STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ */
+.macro _str_za nw, nxbase, offset=0
+ .inst 0xe1200000 \
+ | (((\nw) & 3) << 13) \
+ | ((\nxbase) << 5) \
+ | ((\offset) & 7)
+.endm
+
+/*
+ * RDSVL X\nx, #\imm
+ */
+.macro rdsvl nx, imm
+ .inst 0x4bf5800 \
+ | (\imm << 5) \
+ | (\nx)
+.endm
+
+.macro smstop
+ msr S0_3_C4_C6_3, xzr
+.endm
+
+.macro smstart_za
+ msr S0_3_C4_C5_3, xzr
+.endm
+
+// Declare some storage space to shadow ZA register contents and a
+// scratch buffer for a vector.
+.pushsection .text
+.data
+.align 4
+zaref:
+ .space MAXVL_B * MAXVL_B
+scratch:
+ .space MAXVL_B
+.popsection
+
+// Trivial memory copy: copy x2 bytes, starting at address x1, to address x0.
+// Clobbers x0-x3
+function memcpy
+ cmp x2, #0
+ b.eq 1f
+0: ldrb w3, [x1], #1
+ strb w3, [x0], #1
+ subs x2, x2, #1
+ b.ne 0b
+1: ret
+endfunction
+
+// Generate a test pattern for storage in ZA
+// x0: pid
+// x1: row in ZA
+// x2: generation
+
+// These values are used to constuct a 32-bit pattern that is repeated in the
+// scratch buffer as many times as will fit:
+// bits 31:28 generation number (increments once per test_loop)
+// bits 27:16 pid
+// bits 15: 8 row number
+// bits 7: 0 32-bit lane index
+
+function pattern
+ mov w3, wzr
+ bfi w3, w0, #16, #12 // PID
+ bfi w3, w1, #8, #8 // Row
+ bfi w3, w2, #28, #4 // Generation
+
+ ldr x0, =scratch
+ mov w1, #MAXVL_B / 4
+
+0: str w3, [x0], #4
+ add w3, w3, #1 // Lane
+ subs w1, w1, #1
+ b.ne 0b
+
+ ret
+endfunction
+
+// Get the address of shadow data for ZA horizontal vector xn
+.macro _adrza xd, xn, nrtmp
+ ldr \xd, =zaref
+ rdsvl \nrtmp, 1
+ madd \xd, x\nrtmp, \xn, \xd
+.endm
+
+// Set up test pattern in a ZA horizontal vector
+// x0: pid
+// x1: row number
+// x2: generation
+function setup_za
+ mov x4, x30
+ mov x12, x1 // Use x12 for vector select
+
+ bl pattern // Get pattern in scratch buffer
+ _adrza x0, x12, 2 // Shadow buffer pointer to x0 and x5
+ mov x5, x0
+ ldr x1, =scratch
+ bl memcpy // length set up in x2 by _adrza
+
+ _ldr_za 12, 5 // load vector w12 from pointer x5
+
+ ret x4
+endfunction
+
+// Trivial memory compare: compare x2 bytes starting at address x0 with
+// bytes starting at address x1.
+// Returns only if all bytes match; otherwise, the program is aborted.
+// Clobbers x0-x5.
+function memcmp
+ cbz x2, 2f
+
+ stp x0, x1, [sp, #-0x20]!
+ str x2, [sp, #0x10]
+
+ mov x5, #0
+0: ldrb w3, [x0, x5]
+ ldrb w4, [x1, x5]
+ add x5, x5, #1
+ cmp w3, w4
+ b.ne 1f
+ subs x2, x2, #1
+ b.ne 0b
+
+1: ldr x2, [sp, #0x10]
+ ldp x0, x1, [sp], #0x20
+ b.ne barf
+
+2: ret
+endfunction
+
+// Verify that a ZA vector matches its shadow in memory, else abort
+// x0: row number
+// Clobbers x0-x7 and x12.
+function check_za
+ mov x3, x30
+
+ mov x12, x0
+ _adrza x5, x0, 6 // pointer to expected value in x5
+ mov x4, x0
+ ldr x7, =scratch // x7 is scratch
+
+ mov x0, x7 // Poison scratch
+ mov x1, x6
+ bl memfill_ae
+
+ _str_za 12, 7 // save vector w12 to pointer x7
+
+ mov x0, x5
+ mov x1, x7
+ mov x2, x6
+ mov x30, x3
+ b memcmp
+endfunction
+
+// Any SME register modified here can cause corruption in the main
+// thread -- but *only* the locations modified here.
+function irritator_handler
+ // Increment the irritation signal count (x23):
+ ldr x0, [x2, #ucontext_regs + 8 * 23]
+ add x0, x0, #1
+ str x0, [x2, #ucontext_regs + 8 * 23]
+
+ // Corrupt some random ZA data
+#if 0
+ adr x0, .text + (irritator_handler - .text) / 16 * 16
+ movi v0.8b, #1
+ movi v9.16b, #2
+ movi v31.8b, #3
+#endif
+
+ ret
+endfunction
+
+function terminate_handler
+ mov w21, w0
+ mov x20, x2
+
+ puts "Terminated by signal "
+ mov w0, w21
+ bl putdec
+ puts ", no error, iterations="
+ ldr x0, [x20, #ucontext_regs + 8 * 22]
+ bl putdec
+ puts ", signals="
+ ldr x0, [x20, #ucontext_regs + 8 * 23]
+ bl putdecn
+
+ mov x0, #0
+ mov x8, #__NR_exit
+ svc #0
+endfunction
+
+// w0: signal number
+// x1: sa_action
+// w2: sa_flags
+// Clobbers x0-x6,x8
+function setsignal
+ str x30, [sp, #-((sa_sz + 15) / 16 * 16 + 16)]!
+
+ mov w4, w0
+ mov x5, x1
+ mov w6, w2
+
+ add x0, sp, #16
+ mov x1, #sa_sz
+ bl memclr
+
+ mov w0, w4
+ add x1, sp, #16
+ str w6, [x1, #sa_flags]
+ str x5, [x1, #sa_handler]
+ mov x2, #0
+ mov x3, #sa_mask_sz
+ mov x8, #__NR_rt_sigaction
+ svc #0
+
+ cbz w0, 1f
+
+ puts "sigaction failure\n"
+ b .Labort
+
+1: ldr x30, [sp], #((sa_sz + 15) / 16 * 16 + 16)
+ ret
+endfunction
+
+// Main program entry point
+.globl _start
+function _start
+_start:
+ puts "Streaming mode "
+ smstart_za
+
+ // Sanity-check and report the vector length
+
+ rdsvl 19, 8
+ cmp x19, #128
+ b.lo 1f
+ cmp x19, #2048
+ b.hi 1f
+ tst x19, #(8 - 1)
+ b.eq 2f
+
+1: puts "bad vector length: "
+ mov x0, x19
+ bl putdecn
+ b .Labort
+
+2: puts "vector length:\t"
+ mov x0, x19
+ bl putdec
+ puts " bits\n"
+
+ // Obtain our PID, to ensure test pattern uniqueness between processes
+ mov x8, #__NR_getpid
+ svc #0
+ mov x20, x0
+
+ puts "PID:\t"
+ mov x0, x20
+ bl putdecn
+
+ mov x23, #0 // Irritation signal count
+
+ mov w0, #SIGINT
+ adr x1, terminate_handler
+ mov w2, #SA_SIGINFO
+ bl setsignal
+
+ mov w0, #SIGTERM
+ adr x1, terminate_handler
+ mov w2, #SA_SIGINFO
+ bl setsignal
+
+ mov w0, #SIGUSR1
+ adr x1, irritator_handler
+ mov w2, #SA_SIGINFO
+ orr w2, w2, #SA_NODEFER
+ bl setsignal
+
+ mov x22, #0 // generation number, increments per iteration
+.Ltest_loop:
+ rdsvl 0, 8
+ cmp x0, x19
+ b.ne vl_barf
+
+ rdsvl 21, 1 // Set up ZA & shadow with test pattern
+0: mov x0, x20
+ sub x1, x21, #1
+ mov x2, x22
+ bl setup_za
+ subs x21, x21, #1
+ b.ne 0b
+
+ and x8, x22, #127 // Every 128 interations...
+ cbz x8, 0f
+ mov x8, #__NR_getpid // (otherwise minimal syscall)
+ b 1f
+0:
+ mov x8, #__NR_sched_yield // ...encourage preemption
+1:
+ svc #0
+
+ mrs x0, S3_3_C4_C2_2 // SVCR should have ZA=1,SM=0
+ and x1, x0, #3
+ cmp x1, #2
+ b.ne svcr_barf
+
+ rdsvl 21, 1 // Verify that the data made it through
+ rdsvl 24, 1 // Verify that the data made it through
+0: sub x0, x24, x21
+ bl check_za
+ subs x21, x21, #1
+ bne 0b
+
+ add x22, x22, #1 // Everything still working
+ b .Ltest_loop
+
+.Labort:
+ mov x0, #0
+ mov x1, #SIGABRT
+ mov x8, #__NR_kill
+ svc #0
+endfunction
+
+function barf
+// fpsimd.c acitivty log dump hack
+// ldr w0, =0xdeadc0de
+// mov w8, #__NR_exit
+// svc #0
+// end hack
+ smstop
+ mov x10, x0 // expected data
+ mov x11, x1 // actual data
+ mov x12, x2 // data size
+
+ puts "Mismatch: PID="
+ mov x0, x20
+ bl putdec
+ puts ", iteration="
+ mov x0, x22
+ bl putdec
+ puts ", row="
+ mov x0, x21
+ bl putdecn
+ puts "\tExpected ["
+ mov x0, x10
+ mov x1, x12
+ bl dumphex
+ puts "]\n\tGot ["
+ mov x0, x11
+ mov x1, x12
+ bl dumphex
+ puts "]\n"
+
+ mov x8, #__NR_getpid
+ svc #0
+// fpsimd.c acitivty log dump hack
+// ldr w0, =0xdeadc0de
+// mov w8, #__NR_exit
+// svc #0
+// ^ end of hack
+ mov x1, #SIGABRT
+ mov x8, #__NR_kill
+ svc #0
+// mov x8, #__NR_exit
+// mov x1, #1
+// svc #0
+endfunction
+
+function vl_barf
+ mov x10, x0
+
+ puts "Bad active VL: "
+ mov x0, x10
+ bl putdecn
+
+ mov x8, #__NR_exit
+ mov x1, #1
+ svc #0
+endfunction
+
+function svcr_barf
+ mov x10, x0
+
+ puts "Bad SVCR: "
+ mov x0, x10
+ bl putdecn
+
+ mov x8, #__NR_exit
+ mov x1, #1
+ svc #0
+endfunction
--
2.30.2
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
WARNING: multiple messages have this Message-ID (diff)
From: Mark Brown <broonie@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
Shuah Khan <skhan@linuxfoundation.org>,
Shuah Khan <shuah@kernel.org>
Cc: Alan Hayward <alan.hayward@arm.com>,
Luis Machado <luis.machado@arm.com>,
Salil Akerkar <Salil.Akerkar@arm.com>,
Basant Kumar Dwivedi <Basant.KumarDwivedi@arm.com>,
Szabolcs Nagy <szabolcs.nagy@arm.com>,
James Morse <james.morse@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
Mark Brown <broonie@kernel.org>
Subject: [PATCH v10 35/39] kselftest/arm64: Add stress test for SME ZA context switching
Date: Wed, 26 Jan 2022 15:27:45 +0000 [thread overview]
Message-ID: <20220126152749.233712-36-broonie@kernel.org> (raw)
In-Reply-To: <20220126152749.233712-1-broonie@kernel.org>
Add a stress test for context switching of the ZA register state based on
the similar tests Dave Martin wrote for FPSIMD and SVE registers. The test
loops indefinitely writing a data pattern to ZA then reading it back and
verifying that it's what was expected.
Unlike the other tests we manually assemble the SME instructions since at
present no released toolchain has SME support integrated.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
tools/testing/selftests/arm64/fp/.gitignore | 1 +
tools/testing/selftests/arm64/fp/Makefile | 3 +
tools/testing/selftests/arm64/fp/za-stress | 59 +++
tools/testing/selftests/arm64/fp/za-test.S | 426 ++++++++++++++++++++
4 files changed, 489 insertions(+)
create mode 100644 tools/testing/selftests/arm64/fp/za-stress
create mode 100644 tools/testing/selftests/arm64/fp/za-test.S
diff --git a/tools/testing/selftests/arm64/fp/.gitignore b/tools/testing/selftests/arm64/fp/.gitignore
index 5729a5b1adfc..ead3197e720b 100644
--- a/tools/testing/selftests/arm64/fp/.gitignore
+++ b/tools/testing/selftests/arm64/fp/.gitignore
@@ -8,3 +8,4 @@ sve-test
ssve-test
vec-syscfg
vlset
+za-test
diff --git a/tools/testing/selftests/arm64/fp/Makefile b/tools/testing/selftests/arm64/fp/Makefile
index e6643c9b0474..38d2d0d5a0eb 100644
--- a/tools/testing/selftests/arm64/fp/Makefile
+++ b/tools/testing/selftests/arm64/fp/Makefile
@@ -6,6 +6,7 @@ TEST_PROGS_EXTENDED := fp-pidbench fpsimd-test fpsimd-stress \
rdvl-sme rdvl-sve \
sve-test sve-stress \
ssve-test ssve-stress \
+ za-test za-stress \
vlset
all: $(TEST_GEN_PROGS) $(TEST_PROGS_EXTENDED)
@@ -24,5 +25,7 @@ ssve-test: sve-test.S asm-utils.o
$(CC) -DSSVE -nostdlib $^ -o $@
vec-syscfg: vec-syscfg.o rdvl.o
vlset: vlset.o
+za-test: za-test.o asm-utils.o
+ $(CC) -nostdlib $^ -o $@
include ../../lib.mk
diff --git a/tools/testing/selftests/arm64/fp/za-stress b/tools/testing/selftests/arm64/fp/za-stress
new file mode 100644
index 000000000000..5ac386b55b95
--- /dev/null
+++ b/tools/testing/selftests/arm64/fp/za-stress
@@ -0,0 +1,59 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+# Copyright (C) 2015-2019 ARM Limited.
+# Original author: Dave Martin <Dave.Martin@arm.com>
+
+set -ue
+
+NR_CPUS=`nproc`
+
+pids=
+logs=
+
+cleanup () {
+ trap - INT TERM CHLD
+ set +e
+
+ if [ -n "$pids" ]; then
+ kill $pids
+ wait $pids
+ pids=
+ fi
+
+ if [ -n "$logs" ]; then
+ cat $logs
+ rm $logs
+ logs=
+ fi
+}
+
+interrupt () {
+ cleanup
+ exit 0
+}
+
+child_died () {
+ cleanup
+ exit 1
+}
+
+trap interrupt INT TERM EXIT
+
+for x in `seq 0 $((NR_CPUS * 4))`; do
+ log=`mktemp`
+ logs=$logs\ $log
+ ./za-test >$log &
+ pids=$pids\ $!
+done
+
+# Wait for all child processes to be created:
+sleep 10
+
+while :; do
+ kill -USR1 $pids
+done &
+pids=$pids\ $!
+
+wait
+
+exit 1
diff --git a/tools/testing/selftests/arm64/fp/za-test.S b/tools/testing/selftests/arm64/fp/za-test.S
new file mode 100644
index 000000000000..99e1bb3b6b18
--- /dev/null
+++ b/tools/testing/selftests/arm64/fp/za-test.S
@@ -0,0 +1,426 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2021 ARM Limited.
+// Original author: Mark Brown <broonie@kernel.org>
+//
+// Scalable Matrix Extension ZA context switch test
+// Repeatedly writes unique test patterns into each ZA tile
+// and reads them back to verify integrity.
+//
+// for x in `seq 1 NR_CPUS`; do sve-test & pids=$pids\ $! ; done
+// (leave it running for as long as you want...)
+// kill $pids
+
+#include <asm/unistd.h>
+#include "assembler.h"
+#include "asm-offsets.h"
+
+.arch_extension sve
+
+#define MAXVL 2048
+#define MAXVL_B (MAXVL / 8)
+
+/*
+ * LDR (vector to ZA array):
+ * LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ */
+.macro _ldr_za nw, nxbase, offset=0
+ .inst 0xe1000000 \
+ | (((\nw) & 3) << 13) \
+ | ((\nxbase) << 5) \
+ | ((\offset) & 7)
+.endm
+
+/*
+ * STR (vector from ZA array):
+ * STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ */
+.macro _str_za nw, nxbase, offset=0
+ .inst 0xe1200000 \
+ | (((\nw) & 3) << 13) \
+ | ((\nxbase) << 5) \
+ | ((\offset) & 7)
+.endm
+
+/*
+ * RDSVL X\nx, #\imm
+ */
+.macro rdsvl nx, imm
+ .inst 0x4bf5800 \
+ | (\imm << 5) \
+ | (\nx)
+.endm
+
+.macro smstop
+ msr S0_3_C4_C6_3, xzr
+.endm
+
+.macro smstart_za
+ msr S0_3_C4_C5_3, xzr
+.endm
+
+// Declare some storage space to shadow ZA register contents and a
+// scratch buffer for a vector.
+.pushsection .text
+.data
+.align 4
+zaref:
+ .space MAXVL_B * MAXVL_B
+scratch:
+ .space MAXVL_B
+.popsection
+
+// Trivial memory copy: copy x2 bytes, starting at address x1, to address x0.
+// Clobbers x0-x3
+function memcpy
+ cmp x2, #0
+ b.eq 1f
+0: ldrb w3, [x1], #1
+ strb w3, [x0], #1
+ subs x2, x2, #1
+ b.ne 0b
+1: ret
+endfunction
+
+// Generate a test pattern for storage in ZA
+// x0: pid
+// x1: row in ZA
+// x2: generation
+
+// These values are used to constuct a 32-bit pattern that is repeated in the
+// scratch buffer as many times as will fit:
+// bits 31:28 generation number (increments once per test_loop)
+// bits 27:16 pid
+// bits 15: 8 row number
+// bits 7: 0 32-bit lane index
+
+function pattern
+ mov w3, wzr
+ bfi w3, w0, #16, #12 // PID
+ bfi w3, w1, #8, #8 // Row
+ bfi w3, w2, #28, #4 // Generation
+
+ ldr x0, =scratch
+ mov w1, #MAXVL_B / 4
+
+0: str w3, [x0], #4
+ add w3, w3, #1 // Lane
+ subs w1, w1, #1
+ b.ne 0b
+
+ ret
+endfunction
+
+// Get the address of shadow data for ZA horizontal vector xn
+.macro _adrza xd, xn, nrtmp
+ ldr \xd, =zaref
+ rdsvl \nrtmp, 1
+ madd \xd, x\nrtmp, \xn, \xd
+.endm
+
+// Set up test pattern in a ZA horizontal vector
+// x0: pid
+// x1: row number
+// x2: generation
+function setup_za
+ mov x4, x30
+ mov x12, x1 // Use x12 for vector select
+
+ bl pattern // Get pattern in scratch buffer
+ _adrza x0, x12, 2 // Shadow buffer pointer to x0 and x5
+ mov x5, x0
+ ldr x1, =scratch
+ bl memcpy // length set up in x2 by _adrza
+
+ _ldr_za 12, 5 // load vector w12 from pointer x5
+
+ ret x4
+endfunction
+
+// Trivial memory compare: compare x2 bytes starting at address x0 with
+// bytes starting at address x1.
+// Returns only if all bytes match; otherwise, the program is aborted.
+// Clobbers x0-x5.
+function memcmp
+ cbz x2, 2f
+
+ stp x0, x1, [sp, #-0x20]!
+ str x2, [sp, #0x10]
+
+ mov x5, #0
+0: ldrb w3, [x0, x5]
+ ldrb w4, [x1, x5]
+ add x5, x5, #1
+ cmp w3, w4
+ b.ne 1f
+ subs x2, x2, #1
+ b.ne 0b
+
+1: ldr x2, [sp, #0x10]
+ ldp x0, x1, [sp], #0x20
+ b.ne barf
+
+2: ret
+endfunction
+
+// Verify that a ZA vector matches its shadow in memory, else abort
+// x0: row number
+// Clobbers x0-x7 and x12.
+function check_za
+ mov x3, x30
+
+ mov x12, x0
+ _adrza x5, x0, 6 // pointer to expected value in x5
+ mov x4, x0
+ ldr x7, =scratch // x7 is scratch
+
+ mov x0, x7 // Poison scratch
+ mov x1, x6
+ bl memfill_ae
+
+ _str_za 12, 7 // save vector w12 to pointer x7
+
+ mov x0, x5
+ mov x1, x7
+ mov x2, x6
+ mov x30, x3
+ b memcmp
+endfunction
+
+// Any SME register modified here can cause corruption in the main
+// thread -- but *only* the locations modified here.
+function irritator_handler
+ // Increment the irritation signal count (x23):
+ ldr x0, [x2, #ucontext_regs + 8 * 23]
+ add x0, x0, #1
+ str x0, [x2, #ucontext_regs + 8 * 23]
+
+ // Corrupt some random ZA data
+#if 0
+ adr x0, .text + (irritator_handler - .text) / 16 * 16
+ movi v0.8b, #1
+ movi v9.16b, #2
+ movi v31.8b, #3
+#endif
+
+ ret
+endfunction
+
+function terminate_handler
+ mov w21, w0
+ mov x20, x2
+
+ puts "Terminated by signal "
+ mov w0, w21
+ bl putdec
+ puts ", no error, iterations="
+ ldr x0, [x20, #ucontext_regs + 8 * 22]
+ bl putdec
+ puts ", signals="
+ ldr x0, [x20, #ucontext_regs + 8 * 23]
+ bl putdecn
+
+ mov x0, #0
+ mov x8, #__NR_exit
+ svc #0
+endfunction
+
+// w0: signal number
+// x1: sa_action
+// w2: sa_flags
+// Clobbers x0-x6,x8
+function setsignal
+ str x30, [sp, #-((sa_sz + 15) / 16 * 16 + 16)]!
+
+ mov w4, w0
+ mov x5, x1
+ mov w6, w2
+
+ add x0, sp, #16
+ mov x1, #sa_sz
+ bl memclr
+
+ mov w0, w4
+ add x1, sp, #16
+ str w6, [x1, #sa_flags]
+ str x5, [x1, #sa_handler]
+ mov x2, #0
+ mov x3, #sa_mask_sz
+ mov x8, #__NR_rt_sigaction
+ svc #0
+
+ cbz w0, 1f
+
+ puts "sigaction failure\n"
+ b .Labort
+
+1: ldr x30, [sp], #((sa_sz + 15) / 16 * 16 + 16)
+ ret
+endfunction
+
+// Main program entry point
+.globl _start
+function _start
+_start:
+ puts "Streaming mode "
+ smstart_za
+
+ // Sanity-check and report the vector length
+
+ rdsvl 19, 8
+ cmp x19, #128
+ b.lo 1f
+ cmp x19, #2048
+ b.hi 1f
+ tst x19, #(8 - 1)
+ b.eq 2f
+
+1: puts "bad vector length: "
+ mov x0, x19
+ bl putdecn
+ b .Labort
+
+2: puts "vector length:\t"
+ mov x0, x19
+ bl putdec
+ puts " bits\n"
+
+ // Obtain our PID, to ensure test pattern uniqueness between processes
+ mov x8, #__NR_getpid
+ svc #0
+ mov x20, x0
+
+ puts "PID:\t"
+ mov x0, x20
+ bl putdecn
+
+ mov x23, #0 // Irritation signal count
+
+ mov w0, #SIGINT
+ adr x1, terminate_handler
+ mov w2, #SA_SIGINFO
+ bl setsignal
+
+ mov w0, #SIGTERM
+ adr x1, terminate_handler
+ mov w2, #SA_SIGINFO
+ bl setsignal
+
+ mov w0, #SIGUSR1
+ adr x1, irritator_handler
+ mov w2, #SA_SIGINFO
+ orr w2, w2, #SA_NODEFER
+ bl setsignal
+
+ mov x22, #0 // generation number, increments per iteration
+.Ltest_loop:
+ rdsvl 0, 8
+ cmp x0, x19
+ b.ne vl_barf
+
+ rdsvl 21, 1 // Set up ZA & shadow with test pattern
+0: mov x0, x20
+ sub x1, x21, #1
+ mov x2, x22
+ bl setup_za
+ subs x21, x21, #1
+ b.ne 0b
+
+ and x8, x22, #127 // Every 128 interations...
+ cbz x8, 0f
+ mov x8, #__NR_getpid // (otherwise minimal syscall)
+ b 1f
+0:
+ mov x8, #__NR_sched_yield // ...encourage preemption
+1:
+ svc #0
+
+ mrs x0, S3_3_C4_C2_2 // SVCR should have ZA=1,SM=0
+ and x1, x0, #3
+ cmp x1, #2
+ b.ne svcr_barf
+
+ rdsvl 21, 1 // Verify that the data made it through
+ rdsvl 24, 1 // Verify that the data made it through
+0: sub x0, x24, x21
+ bl check_za
+ subs x21, x21, #1
+ bne 0b
+
+ add x22, x22, #1 // Everything still working
+ b .Ltest_loop
+
+.Labort:
+ mov x0, #0
+ mov x1, #SIGABRT
+ mov x8, #__NR_kill
+ svc #0
+endfunction
+
+function barf
+// fpsimd.c acitivty log dump hack
+// ldr w0, =0xdeadc0de
+// mov w8, #__NR_exit
+// svc #0
+// end hack
+ smstop
+ mov x10, x0 // expected data
+ mov x11, x1 // actual data
+ mov x12, x2 // data size
+
+ puts "Mismatch: PID="
+ mov x0, x20
+ bl putdec
+ puts ", iteration="
+ mov x0, x22
+ bl putdec
+ puts ", row="
+ mov x0, x21
+ bl putdecn
+ puts "\tExpected ["
+ mov x0, x10
+ mov x1, x12
+ bl dumphex
+ puts "]\n\tGot ["
+ mov x0, x11
+ mov x1, x12
+ bl dumphex
+ puts "]\n"
+
+ mov x8, #__NR_getpid
+ svc #0
+// fpsimd.c acitivty log dump hack
+// ldr w0, =0xdeadc0de
+// mov w8, #__NR_exit
+// svc #0
+// ^ end of hack
+ mov x1, #SIGABRT
+ mov x8, #__NR_kill
+ svc #0
+// mov x8, #__NR_exit
+// mov x1, #1
+// svc #0
+endfunction
+
+function vl_barf
+ mov x10, x0
+
+ puts "Bad active VL: "
+ mov x0, x10
+ bl putdecn
+
+ mov x8, #__NR_exit
+ mov x1, #1
+ svc #0
+endfunction
+
+function svcr_barf
+ mov x10, x0
+
+ puts "Bad SVCR: "
+ mov x0, x10
+ bl putdecn
+
+ mov x8, #__NR_exit
+ mov x1, #1
+ svc #0
+endfunction
--
2.30.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2022-01-26 15:32 UTC|newest]
Thread overview: 127+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-26 15:27 [PATCH v10 00/39] arm64/sme: Initial support for the Scalable Matrix Extension Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 01/39] arm64: Define CPACR_EL1_FPEN similarly to other floating point controls Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 02/39] arm64: Always use individual bits in CPACR floating point enables Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 03/39] arm64: cpufeature: Always specify and use a field width for capabilities Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 04/39] kselftest/arm64: Remove local ARRAY_SIZE() definitions Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 05/39] arm64/sme: Provide ABI documentation for SME Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 06/39] arm64/sme: System register and exception syndrome definitions Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 07/39] arm64/sme: Manually encode SME instructions Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 08/39] arm64/sme: Early CPU setup for SME Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 09/39] arm64/sme: Basic enumeration support Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-11-08 16:30 ` Christophe Fergeau
2022-11-09 13:32 ` Mark Brown
2022-11-09 15:24 ` Christophe Fergeau
2022-11-09 15:52 ` Mark Brown
2022-11-14 14:08 ` Catalin Marinas
2022-11-15 9:10 ` Christophe Fergeau
2022-11-15 10:42 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 10/39] arm64/sme: Identify supported SME vector lengths at boot Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 11/39] arm64/sme: Implement sysctl to set the default vector length Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 12/39] arm64/sme: Implement vector length configuration prctl()s Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 13/39] arm64/sme: Implement support for TPIDR2 Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 14/39] arm64/sme: Implement SVCR context switching Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 15/39] arm64/sme: Implement streaming SVE " Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 16/39] arm64/sme: Implement ZA " Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 17/39] arm64/sme: Implement traps and syscall handling for SME Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 18/39] arm64/sme: Disable ZA and streaming mode when handling signals Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 19/39] arm64/sme: Implement streaming SVE signal handling Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 20/39] arm64/sme: Implement ZA " Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 21/39] arm64/sme: Implement ptrace support for streaming mode SVE registers Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 22/39] arm64/sme: Add ptrace support for ZA Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 23/39] arm64/sme: Disable streaming mode and ZA when flushing CPU state Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 24/39] arm64/sme: Save and restore streaming mode over EFI runtime calls Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 25/39] KVM: arm64: Hide SME system registers from guests Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 26/39] KVM: arm64: Trap SME usage in guest Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 27/39] KVM: arm64: Handle SME host state when running guests Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 28/39] arm64/sme: Provide Kconfig for SME Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 29/39] kselftest/arm64: sme: Add streaming SME support to vlset Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 30/39] kselftest/arm64: Add tests for TPIDR2 Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 31/39] kselftest/arm64: Extend vector configuration API tests to cover SME Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 32/39] kselftest/arm64: sme: Provide streaming mode SVE stress test Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 33/39] kselftest/arm64: signal: Allow tests to be incompatible with features Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 34/39] kselftest/arm64: signal: Handle ZA signal context in core code Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown [this message]
2022-01-26 15:27 ` [PATCH v10 35/39] kselftest/arm64: Add stress test for SME ZA context switching Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 36/39] kselftest/arm64: signal: Add SME signal handling tests Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 37/39] kselftest/arm64: Add streaming SVE to SVE ptrace tests Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 38/39] kselftest/arm64: Add coverage for the ZA ptrace interface Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` [PATCH v10 39/39] kselftest/arm64: Add SME support to syscall ABI test Mark Brown
2022-01-26 15:27 ` Mark Brown
2022-01-26 15:27 ` Mark Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220126152749.233712-36-broonie@kernel.org \
--to=broonie@kernel.org \
--cc=Basant.KumarDwivedi@arm.com \
--cc=Salil.Akerkar@arm.com \
--cc=alan.hayward@arm.com \
--cc=alexandru.elisei@arm.com \
--cc=catalin.marinas@arm.com \
--cc=james.morse@arm.com \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=luis.machado@arm.com \
--cc=maz@kernel.org \
--cc=shuah@kernel.org \
--cc=skhan@linuxfoundation.org \
--cc=suzuki.poulose@arm.com \
--cc=szabolcs.nagy@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.