* [kvm-unit-tests PATCH v1 0/3] Add panic test support
@ 2022-06-30 11:30 Nico Boehr
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 1/3] runtime: add support for panic tests Nico Boehr
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Nico Boehr @ 2022-06-30 11:30 UTC
  To: kvm, linux-s390; +Cc: frankja, imbrenda, thuth

QEMU supports a guest state "guest-panicked" which indicates something in
the guest went wrong, for example on s390x, when an external interrupt
loop was triggered.

Since the guest does not continue to run when it is in the
guest-panicked state, it is currently impossible to write panicking
tests in kvm-unit-tests. Support from the runtime is needed to check
that the guest enters the guest-panicked state.

This series adds the required support to the runtime together with two
tests for s390x which cause guest panics.

Nico Boehr (3):
  runtime: add support for panic tests
  s390x: add extint loop test
  s390x: add pgm spec interrupt loop test

 s390x/Makefile        |  2 ++
 s390x/extint-loop.c   | 64 +++++++++++++++++++++++++++++++++++++++++++
 s390x/pgmint-loop.c   | 46 +++++++++++++++++++++++++++++++
 s390x/run             |  2 +-
 s390x/unittests.cfg   |  8 ++++++
 scripts/arch-run.bash | 47 +++++++++++++++++++++++++++++++
 scripts/runtime.bash  |  3 ++
 7 files changed, 171 insertions(+), 1 deletion(-)
 create mode 100644 s390x/extint-loop.c
 create mode 100644 s390x/pgmint-loop.c

-- 
2.36.1



* [kvm-unit-tests PATCH v1 1/3] runtime: add support for panic tests
  2022-06-30 11:30 [kvm-unit-tests PATCH v1 0/3] Add panic test support Nico Boehr
@ 2022-06-30 11:30 ` Nico Boehr
  2022-06-30 17:49   ` Thomas Huth
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 2/3] s390x: add extint loop test Nico Boehr
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt " Nico Boehr
  2 siblings, 1 reply; 15+ messages in thread
From: Nico Boehr @ 2022-06-30 11:30 UTC
  To: kvm, linux-s390; +Cc: frankja, imbrenda, thuth

QEMU supports a guest state "guest-panicked" which indicates something in
the guest went wrong, for example on s390x, when an external interrupt
loop was triggered.

Since the guest does not continue to run when it is in the
guest-panicked state, it is currently impossible to write panicking
tests in kvm-unit-tests. Support from the runtime is needed to check
that the guest enters the guest-panicked state.

Similar to migration tests, add a new group panic. Tests in this
group must enter the guest-panicked state to succeed.

The runtime will spawn a QEMU instance, connect to the QMP and listen
for events. To parse the QMP protocol, jq[1] is used. Same as with
netcat in the migration tests, panic tests won't run if jq is not
installed.

The guest is created in the stopped state and only continued once the
connection to the QMP has succeeded. This ensures no events are missed
between QEMU start and the QMP connection.

[1] https://stedolan.github.io/jq/
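
The event filtering described above can be sketched without a live QEMU.
This is only an illustration under stated assumptions: the sample events
are hand-written stand-ins for QMP output, the helper names
count_panic_events and sample_qmp_stream are hypothetical, and grep is
swapped in for jq so the sketch has no extra dependencies.

```shell
#!/bin/bash
# Hypothetical helper: count GUEST_PANICKED events in a QMP stream read
# from stdin. The patch uses jq for this; grep is swapped in here.
count_panic_events ()
{
	# grep -c prints 0 and exits non-zero when nothing matches
	grep -c '"event": *"GUEST_PANICKED"' || true
}

# Hand-written sample of what a QMP session might emit, one object per line.
sample_qmp_stream ()
{
	cat <<'EOF'
{"QMP": {"version": {}, "capabilities": []}}
{"return": {}}
{"timestamp": {"seconds": 0, "microseconds": 0}, "event": "RESUME"}
{"timestamp": {"seconds": 0, "microseconds": 0}, "event": "GUEST_PANICKED", "data": {"action": "pause"}}
EOF
}

panic_event_count=$(sample_qmp_stream | count_panic_events)
if [ "$panic_event_count" -lt 1 ]; then
	echo "FAIL: guest did not panic"
else
	echo "PASS: guest panicked"
fi
```

As in run_panic, any count of one or more is treated as a pass, since
some QEMU versions may report the event more than once.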

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
---
 s390x/run             |  2 +-
 scripts/arch-run.bash | 47 +++++++++++++++++++++++++++++++++++++++++++
 scripts/runtime.bash  |  3 +++
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/s390x/run b/s390x/run
index 24138f6803be..f1111dbdbe62 100755
--- a/s390x/run
+++ b/s390x/run
@@ -30,7 +30,7 @@ M+=",accel=$ACCEL"
 command="$qemu -nodefaults -nographic $M"
 command+=" -chardev stdio,id=con0 -device sclpconsole,chardev=con0"
 command+=" -kernel"
-command="$(migration_cmd) $(timeout_cmd) $command"
+command="$(panic_cmd) $(migration_cmd) $(timeout_cmd) $command"
 
 # We return the exit code via stdout, not via the QEMU return code
 run_qemu_status $command "$@"
diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
index 0dfaf017db0a..5663a1ddb09e 100644
--- a/scripts/arch-run.bash
+++ b/scripts/arch-run.bash
@@ -104,6 +104,12 @@ qmp ()
 	echo '{ "execute": "qmp_capabilities" }{ "execute":' "$2" '}' | ncat -U $1
 }
 
+qmp_events ()
+{
+	while ! test -S "$1"; do sleep 0.1; done
+	echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' | ncat --no-shutdown -U $1 | jq -c 'select(has("event"))'
+}
+
 run_migration ()
 {
 	if ! command -v ncat >/dev/null 2>&1; then
@@ -164,6 +170,40 @@ run_migration ()
 	return $ret
 }
 
+run_panic ()
+{
+	if ! command -v ncat >/dev/null 2>&1; then
+		echo "${FUNCNAME[0]} needs ncat (netcat)" >&2
+		return 77
+	fi
+
+	if ! command -v jq >/dev/null 2>&1; then
+		echo "${FUNCNAME[0]} needs jq" >&2
+		return 77
+	fi
+
+	qmp=$(mktemp -u -t panic-qmp.XXXXXXXXXX)
+
+	trap 'kill 0; exit 2' INT TERM
+	trap 'rm -f ${qmp}' RETURN EXIT
+
+	# start VM stopped so we don't miss any events
+	eval "$@" -chardev socket,id=mon1,path=${qmp},server=on,wait=off \
+		-mon chardev=mon1,mode=control -S &
+
+	panic_event_count=$(qmp_events ${qmp} | jq -c 'select(.event == "GUEST_PANICKED")' | wc -l)
+	if [ $panic_event_count -lt 1 ]; then
+		echo "FAIL: guest did not panic"
+		ret=3
+	else
+		# some QEMU versions report multiple panic events
+		echo "PASS: guest panicked"
+		ret=1
+	fi
+
+	return $ret
+}
+
 migration_cmd ()
 {
 	if [ "$MIGRATION" = "yes" ]; then
@@ -171,6 +211,13 @@ migration_cmd ()
 	fi
 }
 
+panic_cmd ()
+{
+	if [ "$PANIC" = "yes" ]; then
+		echo "run_panic"
+	fi
+}
+
 search_qemu_binary ()
 {
 	local save_path=$PATH
diff --git a/scripts/runtime.bash b/scripts/runtime.bash
index 7d0180bf14bd..8072f3bb536a 100644
--- a/scripts/runtime.bash
+++ b/scripts/runtime.bash
@@ -145,6 +145,9 @@ function run()
     if find_word "migration" "$groups"; then
         cmdline="MIGRATION=yes $cmdline"
     fi
+    if find_word "panic" "$groups"; then
+        cmdline="PANIC=yes $cmdline"
+    fi
     if [ "$verbose" = "yes" ]; then
         echo $cmdline
     fi
-- 
2.36.1



* [kvm-unit-tests PATCH v1 2/3] s390x: add extint loop test
  2022-06-30 11:30 [kvm-unit-tests PATCH v1 0/3] Add panic test support Nico Boehr
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 1/3] runtime: add support for panic tests Nico Boehr
@ 2022-06-30 11:30 ` Nico Boehr
  2022-06-30 17:55   ` Thomas Huth
  2022-07-04  8:32   ` Janosch Frank
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt " Nico Boehr
  2 siblings, 2 replies; 15+ messages in thread
From: Nico Boehr @ 2022-06-30 11:30 UTC
  To: kvm, linux-s390; +Cc: frankja, imbrenda, thuth

The CPU timer interrupt stays pending as long as the CPU timer value is
negative. This can lead to interruption loops when the ext_new_psw mask
has external interrupts enabled.

QEMU is able to detect this situation and panic the guest, so add a test
for it.

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
---
 s390x/Makefile      |  1 +
 s390x/extint-loop.c | 64 +++++++++++++++++++++++++++++++++++++++++++++
 s390x/unittests.cfg |  4 +++
 3 files changed, 69 insertions(+)
 create mode 100644 s390x/extint-loop.c

diff --git a/s390x/Makefile b/s390x/Makefile
index efd5e0c13102..92a020234c9f 100644
--- a/s390x/Makefile
+++ b/s390x/Makefile
@@ -34,6 +34,7 @@ tests += $(TEST_DIR)/migration.elf
 tests += $(TEST_DIR)/pv-attest.elf
 tests += $(TEST_DIR)/migration-cmm.elf
 tests += $(TEST_DIR)/migration-skey.elf
+tests += $(TEST_DIR)/extint-loop.elf
 
 pv-tests += $(TEST_DIR)/pv-diags.elf
 
diff --git a/s390x/extint-loop.c b/s390x/extint-loop.c
new file mode 100644
index 000000000000..5276d86a156f
--- /dev/null
+++ b/s390x/extint-loop.c
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * External interrupt loop test
+ *
+ * Copyright IBM Corp. 2022
+ *
+ * Authors:
+ *  Nico Boehr <nrb@linux.ibm.com>
+ */
+#include <libcflat.h>
+#include <asm/interrupt.h>
+#include <asm/barrier.h>
+#include <asm/time.h>
+
+static void ext_int_handler(void)
+{
+	/*
+	 * return to ext_old_psw. This gives us the chance to print the report_fail
+	 * in case something goes wrong.
+	 */
+	asm volatile (
+		"lpswe %[ext_old_psw]\n"
+		:
+		: [ext_old_psw] "Q"(lowcore.ext_old_psw)
+		: "memory"
+	);
+}
+
+static void start_cpu_timer(int64_t timeout_ms)
+{
+#define CPU_TIMER_US_SHIFT 12
+	int64_t timer_value = (timeout_ms * 1000) << CPU_TIMER_US_SHIFT;
+	asm volatile (
+		"spt %[timer_value]\n"
+		:
+		: [timer_value] "Q" (timer_value)
+	);
+}
+
+int main(void)
+{
+	struct psw ext_new_psw_orig;
+
+	report_prefix_push("extint-loop");
+
+	ext_new_psw_orig = lowcore.ext_new_psw;
+	lowcore.ext_new_psw.addr = (uint64_t)ext_int_handler;
+	lowcore.ext_new_psw.mask |= PSW_MASK_EXT;
+
+	load_psw_mask(extract_psw_mask() | PSW_MASK_EXT);
+	ctl_set_bit(0, CTL0_CLOCK_COMPARATOR);
+
+	start_cpu_timer(1);
+
+	mdelay(2000);
+
+	/* restore previous ext_new_psw so QEMU can properly terminate */
+	lowcore.ext_new_psw = ext_new_psw_orig;
+
+	report_fail("survived extint loop");
+
+	report_prefix_pop();
+	return report_summary();
+}
diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg
index 8e52f560bb1e..7d408f2d5310 100644
--- a/s390x/unittests.cfg
+++ b/s390x/unittests.cfg
@@ -184,3 +184,7 @@ groups = migration
 [migration-skey]
 file = migration-skey.elf
 groups = migration
+
+[extint-loop]
+file = extint-loop.elf
+groups = panic
-- 
2.36.1



* [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt loop test
  2022-06-30 11:30 [kvm-unit-tests PATCH v1 0/3] Add panic test support Nico Boehr
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 1/3] runtime: add support for panic tests Nico Boehr
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 2/3] s390x: add extint loop test Nico Boehr
@ 2022-06-30 11:30 ` Nico Boehr
  2022-06-30 14:38   ` Janis Schoetterl-Glausch
                     ` (2 more replies)
  2 siblings, 3 replies; 15+ messages in thread
From: Nico Boehr @ 2022-06-30 11:30 UTC
  To: kvm, linux-s390; +Cc: frankja, imbrenda, thuth

An invalid PSW causes a program interrupt. When an invalid PSW is
introduced in the pgm_new_psw, an interrupt loop occurs as soon as a
program interrupt is caused.

QEMU should detect that and panic the guest, hence add a test for it.
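
For reference, the mask value the test ORs into pgm_new_psw.mask can be
worked out as below. This is a small sketch, assuming BIT(n) expands to
1UL << n and that PSW bits use IBM numbering, where bit 0 is the most
significant bit of the 64-bit mask. The helper name ibm_bit_mask is
hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: mask for a bit given in IBM numbering, where
# bit 0 is the most significant bit of a 64-bit word.
ibm_bit_mask ()
{
	printf '0x%016x\n' $(( 1 << (63 - $1) ))
}

ibm_bit_mask 12    # prints 0x0008000000000000
```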

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
---
 s390x/Makefile      |  1 +
 s390x/pgmint-loop.c | 46 +++++++++++++++++++++++++++++++++++++++++++++
 s390x/unittests.cfg |  4 ++++
 3 files changed, 51 insertions(+)
 create mode 100644 s390x/pgmint-loop.c

diff --git a/s390x/Makefile b/s390x/Makefile
index 92a020234c9f..a600dbfb3f4c 100644
--- a/s390x/Makefile
+++ b/s390x/Makefile
@@ -35,6 +35,7 @@ tests += $(TEST_DIR)/pv-attest.elf
 tests += $(TEST_DIR)/migration-cmm.elf
 tests += $(TEST_DIR)/migration-skey.elf
 tests += $(TEST_DIR)/extint-loop.elf
+tests += $(TEST_DIR)/pgmint-loop.elf
 
 pv-tests += $(TEST_DIR)/pv-diags.elf
 
diff --git a/s390x/pgmint-loop.c b/s390x/pgmint-loop.c
new file mode 100644
index 000000000000..5b74f26dbc3d
--- /dev/null
+++ b/s390x/pgmint-loop.c
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Program interrupt loop test
+ *
+ * Copyright IBM Corp. 2022
+ *
+ * Authors:
+ *  Nico Boehr <nrb@linux.ibm.com>
+ */
+#include <libcflat.h>
+#include <bitops.h>
+#include <asm/interrupt.h>
+#include <asm/barrier.h>
+
+static void pgm_int_handler(void)
+{
+	/*
+	 * return to pgm_old_psw. This gives us the chance to print the report_fail
+	 * in case something goes wrong.
+	 */
+	asm volatile (
+		"lpswe %[pgm_old_psw]\n"
+		:
+		: [pgm_old_psw] "Q"(lowcore.pgm_old_psw)
+		: "memory"
+	);
+}
+
+int main(void)
+{
+	report_prefix_push("pgmint-loop");
+
+	lowcore.pgm_new_psw.addr = (uint64_t) pgm_int_handler;
+	/* bit 12 set is invalid */
+	lowcore.pgm_new_psw.mask = extract_psw_mask() | BIT(63 - 12);
+	mb();
+
+	/* cause a pgm int */
+	*((int *)-4) = 0x42;
+	mb();
+
+	report_fail("survived pgmint loop");
+
+	report_prefix_pop();
+	return report_summary();
+}
diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg
index 7d408f2d5310..c3073bfc4363 100644
--- a/s390x/unittests.cfg
+++ b/s390x/unittests.cfg
@@ -188,3 +188,7 @@ groups = migration
 [extint-loop]
 file = extint-loop.elf
 groups = panic
+
+[pgmint-loop]
+file = pgmint-loop.elf
+groups = panic
-- 
2.36.1



* Re: [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt loop test
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt " Nico Boehr
@ 2022-06-30 14:38   ` Janis Schoetterl-Glausch
  2022-06-30 17:11     ` Thomas Huth
  2022-07-01  8:10     ` Nico Boehr
  2022-06-30 17:25   ` Thomas Huth
  2022-07-04  9:06   ` Janosch Frank
  2 siblings, 2 replies; 15+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-06-30 14:38 UTC
  To: Nico Boehr, kvm, linux-s390; +Cc: frankja, imbrenda, thuth

On 6/30/22 13:30, Nico Boehr wrote:
> An invalid PSW causes a program interrupt. When an invalid PSW is
> introduced in the pgm_new_psw, an interrupt loop occurs as soon as a
> program interrupt is caused.
> 
> QEMU should detect that and panick the guest, hence add a test for it.

Why is that, after all in LPAR it would just spin, right?
Also, panicK.
How do you assert that the guest doesn't spin forever, is there a timeout?




* Re: [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt loop test
  2022-06-30 14:38   ` Janis Schoetterl-Glausch
@ 2022-06-30 17:11     ` Thomas Huth
  2022-07-01 10:49       ` Janis Schoetterl-Glausch
  2022-07-01  8:10     ` Nico Boehr
  1 sibling, 1 reply; 15+ messages in thread
From: Thomas Huth @ 2022-06-30 17:11 UTC
  To: Janis Schoetterl-Glausch, Nico Boehr, kvm, linux-s390; +Cc: frankja, imbrenda

On 30/06/2022 16.38, Janis Schoetterl-Glausch wrote:
> On 6/30/22 13:30, Nico Boehr wrote:
>> An invalid PSW causes a program interrupt. When an invalid PSW is
>> introduced in the pgm_new_psw, an interrupt loop occurs as soon as a
>> program interrupt is caused.
>>
>> QEMU should detect that and panick the guest, hence add a test for it.
> 
> Why is that, after all in LPAR it would just spin, right?

Not sure what the LPAR is doing, but the guest is certainly completely 
unusable, so a panic event is the right thing to do here for QEMU.

> Also, panicK.
> How do you assert that the guest doesn't spin forever, is there a timeout?

I agree, it would be good to have a "timeout" set in the unittests.cfg for 
this test here (some few seconds should be enough).

  Thomas



* Re: [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt loop test
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt " Nico Boehr
  2022-06-30 14:38   ` Janis Schoetterl-Glausch
@ 2022-06-30 17:25   ` Thomas Huth
  2022-07-01  8:17     ` Nico Boehr
  2022-07-04  9:06   ` Janosch Frank
  2 siblings, 1 reply; 15+ messages in thread
From: Thomas Huth @ 2022-06-30 17:25 UTC
  To: Nico Boehr, kvm, linux-s390; +Cc: frankja, imbrenda

On 30/06/2022 13.30, Nico Boehr wrote:
> An invalid PSW causes a program interrupt. When an invalid PSW is
> introduced in the pgm_new_psw, an interrupt loop occurs as soon as a
> program interrupt is caused.
> 
> QEMU should detect that and panick the guest, hence add a test for it.
> 
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> ---
....
> +int main(void)
> +{
> +	report_prefix_push("pgmint-loop");
> +
> +	lowcore.pgm_new_psw.addr = (uint64_t) pgm_int_handler;
> +	/* bit 12 set is invalid */
> +	lowcore.pgm_new_psw.mask = extract_psw_mask() | BIT(63 - 12);

Basically patch looks fine to me ... just an idea for an extension (but that 
could also be done later):

Looking at the is_valid_psw() function in the Linux kernel sources, there
are a couple of additional conditions that could cause a PGM interrupt loop
... you could maybe check them here, too, e.g. by adding an "extra_params =
-append '...'" in the unittests.cfg file to select the individual tests via
argv[] ?

  Thomas



* Re: [kvm-unit-tests PATCH v1 1/3] runtime: add support for panic tests
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 1/3] runtime: add support for panic tests Nico Boehr
@ 2022-06-30 17:49   ` Thomas Huth
  2022-07-01  7:02     ` Nico Boehr
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Huth @ 2022-06-30 17:49 UTC
  To: Nico Boehr, kvm, linux-s390; +Cc: frankja, imbrenda

On 30/06/2022 13.30, Nico Boehr wrote:
> QEMU suports a guest state "guest-panicked" which indicates something in

s/suports/supports/

> the guest went wrong, for example on s390x, when an external interrupt
> loop was triggered.
> 
> Since the guest does not continue to run when it is in the
> guest-panicked state, it is currently impossible to write panicking
> tests in kvm-unit-tests. Support from the runtime is needed to check
> that the guest enters the guest-panicked state.
> 
> Similar to migration tests, add a new group panic. Tests in this
> group must enter the guest-panicked state to succeed.
> 
> The runtime will spawn a QEMU instance, connect to the QMP and listen
> for events. To parse the QMP protocol, jq[1] is used. Same as with
> netcat in the migration tests, panic tests won't run if jq is not
> installed.
> 
> The guest is created in the stopped state and only continued when
> connection to the QMP was successful. This ensures no events are missed
> between QEMU start and the connect to the QMP.
> 
> [1] https://stedolan.github.io/jq/
> 
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> ---
>   s390x/run             |  2 +-
>   scripts/arch-run.bash | 47 +++++++++++++++++++++++++++++++++++++++++++
>   scripts/runtime.bash  |  3 +++
>   3 files changed, 51 insertions(+), 1 deletion(-)
> 
> diff --git a/s390x/run b/s390x/run
> index 24138f6803be..f1111dbdbe62 100755
> --- a/s390x/run
> +++ b/s390x/run
> @@ -30,7 +30,7 @@ M+=",accel=$ACCEL"
>   command="$qemu -nodefaults -nographic $M"
>   command+=" -chardev stdio,id=con0 -device sclpconsole,chardev=con0"
>   command+=" -kernel"
> -command="$(migration_cmd) $(timeout_cmd) $command"
> +command="$(panic_cmd) $(migration_cmd) $(timeout_cmd) $command"
>   
>   # We return the exit code via stdout, not via the QEMU return code
>   run_qemu_status $command "$@"
> diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
> index 0dfaf017db0a..5663a1ddb09e 100644
> --- a/scripts/arch-run.bash
> +++ b/scripts/arch-run.bash
> @@ -104,6 +104,12 @@ qmp ()
>   	echo '{ "execute": "qmp_capabilities" }{ "execute":' "$2" '}' | ncat -U $1
>   }
>   
> +qmp_events ()
> +{
> +	while ! test -S "$1"; do sleep 0.1; done
> +	echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' | ncat --no-shutdown -U $1 | jq -c 'select(has("event"))'

Break the long line into two or three?

> +}
> +
>   run_migration ()
>   {
>   	if ! command -v ncat >/dev/null 2>&1; then
> @@ -164,6 +170,40 @@ run_migration ()
>   	return $ret
>   }
>   
> +run_panic ()
> +{
> +	if ! command -v ncat >/dev/null 2>&1; then
> +		echo "${FUNCNAME[0]} needs ncat (netcat)" >&2
> +		return 77
> +	fi
> +
> +	if ! command -v jq >/dev/null 2>&1; then
> +		echo "${FUNCNAME[0]} needs jq" >&2
> +		return 77
> +	fi
> +
> +	qmp=$(mktemp -u -t panic-qmp.XXXXXXXXXX)
> +
> +	trap 'kill 0; exit 2' INT TERM
> +	trap 'rm -f ${qmp}' RETURN EXIT
> +
> +	# start VM stopped so we don't miss any events
> +	eval "$@" -chardev socket,id=mon1,path=${qmp},server=on,wait=off \
> +		-mon chardev=mon1,mode=control -S &
> +
> +	panic_event_count=$(qmp_events ${qmp} | jq -c 'select(.event == "GUEST_PANICKED")' | wc -l)
> +	if [ $panic_event_count -lt 1 ]; then

Maybe put double-quotes around $panic_event_count , just to be sure?

With the nits fixed:

Reviewed-by: Thomas Huth <thuth@redhat.com>



* Re: [kvm-unit-tests PATCH v1 2/3] s390x: add extint loop test
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 2/3] s390x: add extint loop test Nico Boehr
@ 2022-06-30 17:55   ` Thomas Huth
  2022-07-04  8:32   ` Janosch Frank
  1 sibling, 0 replies; 15+ messages in thread
From: Thomas Huth @ 2022-06-30 17:55 UTC
  To: Nico Boehr, kvm, linux-s390; +Cc: frankja, imbrenda

On 30/06/2022 13.30, Nico Boehr wrote:
> The CPU timer interrupt stays pending as long as the CPU timer value is
> negative. This can lead to interruption loops when the ext_new_psw mask
> has external interrupts enabled.
> 
> QEMU is able to detect this situation and panic the guest, so add a test
> for it.
> 
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> ---
>   s390x/Makefile      |  1 +
>   s390x/extint-loop.c | 64 +++++++++++++++++++++++++++++++++++++++++++++
>   s390x/unittests.cfg |  4 +++
>   3 files changed, 69 insertions(+)
>   create mode 100644 s390x/extint-loop.c

Reviewed-by: Thomas Huth <thuth@redhat.com>



* Re: [kvm-unit-tests PATCH v1 1/3] runtime: add support for panic tests
  2022-06-30 17:49   ` Thomas Huth
@ 2022-07-01  7:02     ` Nico Boehr
  0 siblings, 0 replies; 15+ messages in thread
From: Nico Boehr @ 2022-07-01  7:02 UTC
  To: kvm; +Cc: frankja, imbrenda

Quoting Thomas Huth (2022-06-30 19:49:45)
> On 30/06/2022 13.30, Nico Boehr wrote:
> > QEMU suports a guest state "guest-panicked" which indicates something in
> 
> s/suports/supports/

Fixed.

> > diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
> > index 0dfaf017db0a..5663a1ddb09e 100644
> > --- a/scripts/arch-run.bash
> > +++ b/scripts/arch-run.bash
> > @@ -104,6 +104,12 @@ qmp ()
> >       echo '{ "execute": "qmp_capabilities" }{ "execute":' "$2" '}' | ncat -U $1
> >   }
> >   
> > +qmp_events ()
> > +{
> > +     while ! test -S "$1"; do sleep 0.1; done
> > +     echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' | ncat --no-shutdown -U $1 | jq -c 'select(has("event"))'
> 
> Break the long line into two or three?

Fixed.

> > +run_panic ()
> > +{
[...]
> > +     panic_event_count=$(qmp_events ${qmp} | jq -c 'select(.event == "GUEST_PANICKED")' | wc -l)
> > +     if [ $panic_event_count -lt 1 ]; then
> 
> Maybe put double-quotes around $panic_event_count , just to be sure?

Yes, quoting is a bit broken anyways, but we have to start somewhere, thanks.
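
A short sketch of why the quoting matters. The helper name panicked and
the empty-value default are assumptions for illustration only; in
run_panic the count comes from wc -l and should not normally be empty.

```shell
#!/bin/bash
# Hypothetical helper mirroring the check in run_panic, with the
# variable quoted and an empty value treated as "no events".
panicked ()
{
	local count="${1:-0}"
	[ "$count" -ge 1 ]
}

panic_event_count=""
# Unquoted, an empty variable vanishes and [ sees only "-lt 1",
# which it rejects as a malformed expression.
[ $panic_event_count -lt 1 ] 2>/dev/null || echo "unquoted check failed to evaluate"
# Quoted and with a default, the comparison stays well-formed.
panicked "$panic_event_count" || echo "FAIL: guest did not panic"
```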

> With the nits fixed:
> 
> Reviewed-by: Thomas Huth <thuth@redhat.com>

Thanks.


* Re: [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt loop test
  2022-06-30 14:38   ` Janis Schoetterl-Glausch
  2022-06-30 17:11     ` Thomas Huth
@ 2022-07-01  8:10     ` Nico Boehr
  1 sibling, 0 replies; 15+ messages in thread
From: Nico Boehr @ 2022-07-01  8:10 UTC
  To: kvm; +Cc: frankja, imbrenda, thuth

Quoting Janis Schoetterl-Glausch (2022-06-30 16:38:47)
> On 6/30/22 13:30, Nico Boehr wrote:
> > An invalid PSW causes a program interrupt. When an invalid PSW is
> > introduced in the pgm_new_psw, an interrupt loop occurs as soon as a
> > program interrupt is caused.
> > 
> > QEMU should detect that and panick the guest, hence add a test for it.
> 
> Why is that, after all in LPAR it would just spin, right?

The test doesn't spin for me under LPAR so it seems like LPAR can detect this as well. KVM has code to detect this situation, see handle_prog() in intercept.c, which then exits to userspace.

> Also, panicK.

Right, fixed.

> How do you assert that the guest doesn't spin forever, is there a timeout?

There is the default kvm-unit-tests timeout of 90 seconds, but that is probably too much for this test. I think 5 seconds should be plenty, will add.
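
How such a limit behaves can be sketched with timeout(1). This assumes
GNU coreutils timeout, which exits with status 124 when the limit
expires. The helper name run_with_timeout is hypothetical, and
"sleep 3" merely stands in for a guest that spins forever.

```shell
#!/bin/bash
# Hypothetical helper: run a command under timeout(1) and report when
# the limit was hit. GNU timeout exits with status 124 on expiry.
run_with_timeout ()
{
	local limit="$1"; shift
	local ret=0
	timeout "$limit" "$@" || ret=$?
	if [ "$ret" -eq 124 ]; then
		echo "timeout after ${limit}s"
	fi
	return $ret
}

# "sleep 3" stands in for a guest that never panics and never exits.
run_with_timeout 1 sleep 3 || true
```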


* Re: [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt loop test
  2022-06-30 17:25   ` Thomas Huth
@ 2022-07-01  8:17     ` Nico Boehr
  0 siblings, 0 replies; 15+ messages in thread
From: Nico Boehr @ 2022-07-01  8:17 UTC
  To: kvm; +Cc: frankja, imbrenda

Quoting Thomas Huth (2022-06-30 19:25:57)
> On 30/06/2022 13.30, Nico Boehr wrote:
> > An invalid PSW causes a program interrupt. When an invalid PSW is
> > introduced in the pgm_new_psw, an interrupt loop occurs as soon as a
> > program interrupt is caused.
> > 
> > QEMU should detect that and panick the guest, hence add a test for it.
> > 
> > Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> > ---
> ....
> > +int main(void)
> > +{
> > +     report_prefix_push("pgmint-loop");
> > +
> > +     lowcore.pgm_new_psw.addr = (uint64_t) pgm_int_handler;
> > +     /* bit 12 set is invalid */
> > +     lowcore.pgm_new_psw.mask = extract_psw_mask() | BIT(63 - 12);
> 
> Basically patch looks fine to me ... just an idea for an extension (but that 
> could also be done later):
> 
> Looking at the is_valid_psw() function in the Linux kernel sources, there 
> are a couple of additional condition that could cause a PGM interrupt loop 
> ... you could maybe check them here, too, e.g. by adding a "extra_params = 
> -append '...'" in the unittests.cfg file to select the indiviual tests via 
> argv[] ?

It is a good idea, I have it on my TODO and will address it in an upcoming patchset.


* Re: [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt loop test
  2022-06-30 17:11     ` Thomas Huth
@ 2022-07-01 10:49       ` Janis Schoetterl-Glausch
  0 siblings, 0 replies; 15+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-01 10:49 UTC
  To: Thomas Huth, Nico Boehr, kvm, linux-s390; +Cc: frankja, imbrenda

On 6/30/22 19:11, Thomas Huth wrote:
> On 30/06/2022 16.38, Janis Schoetterl-Glausch wrote:
>> On 6/30/22 13:30, Nico Boehr wrote:
>>> An invalid PSW causes a program interrupt. When an invalid PSW is
>>> introduced in the pgm_new_psw, an interrupt loop occurs as soon as a
>>> program interrupt is caused.
>>>
>>> QEMU should detect that and panick the guest, hence add a test for it.
>>
>> Why is that, after all in LPAR it would just spin, right?
> 
> Not sure what the LPAR is doing, but the guest is certainly completely unusable, so a panic event is the right thing to do here for QEMU.

I suppose some other kind of interrupt could fix things up somehow, but I guess in practice
panicking does indeed make more sense.



* Re: [kvm-unit-tests PATCH v1 2/3] s390x: add extint loop test
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 2/3] s390x: add extint loop test Nico Boehr
  2022-06-30 17:55   ` Thomas Huth
@ 2022-07-04  8:32   ` Janosch Frank
  1 sibling, 0 replies; 15+ messages in thread
From: Janosch Frank @ 2022-07-04  8:32 UTC
  To: Nico Boehr, kvm, linux-s390; +Cc: imbrenda, thuth

On 6/30/22 13:30, Nico Boehr wrote:
> The CPU timer interrupt stays pending as long as the CPU timer value is
> negative. This can lead to interruption loops when the ext_new_psw mask
> has external interrupts enabled.
> 
> QEMU is able to detect this situation and panic the guest, so add a test
> for it.
> 
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> ---
>   s390x/Makefile      |  1 +
>   s390x/extint-loop.c | 64 +++++++++++++++++++++++++++++++++++++++++++++
>   s390x/unittests.cfg |  4 +++
>   3 files changed, 69 insertions(+)
>   create mode 100644 s390x/extint-loop.c
> 
> diff --git a/s390x/Makefile b/s390x/Makefile
> index efd5e0c13102..92a020234c9f 100644
> --- a/s390x/Makefile
> +++ b/s390x/Makefile
> @@ -34,6 +34,7 @@ tests += $(TEST_DIR)/migration.elf
>   tests += $(TEST_DIR)/pv-attest.elf
>   tests += $(TEST_DIR)/migration-cmm.elf
>   tests += $(TEST_DIR)/migration-skey.elf
> +tests += $(TEST_DIR)/extint-loop.elf

I'd suggest giving these tests a "panic" prefix, e.g. panic-loop-extint.c
and panic-loop-pgm.c.

>   
>   pv-tests += $(TEST_DIR)/pv-diags.elf
>   
> diff --git a/s390x/extint-loop.c b/s390x/extint-loop.c
> new file mode 100644
> index 000000000000..5276d86a156f
> --- /dev/null
> +++ b/s390x/extint-loop.c
> @@ -0,0 +1,64 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * External interrupt loop test
> + *
> + * Copyright IBM Corp. 2022
> + *
> + * Authors:
> + *  Nico Boehr <nrb@linux.ibm.com>
> + */
> +#include <libcflat.h>
> +#include <asm/interrupt.h>
> +#include <asm/barrier.h>
> +#include <asm/time.h>
> +
> +static void ext_int_handler(void)
> +{
> +	/*
> +	 * return to ext_old_psw. This gives us the chance to print the return_fail
> +	 * in case something goes wrong.
> +	 */
> +	asm volatile (
> +		"lpswe %[ext_old_psw]\n"
> +		:
> +		: [ext_old_psw] "Q"(lowcore.ext_old_psw)
> +		: "memory"
> +	);
> +}
> +
> +static void start_cpu_timer(int64_t timeout_ms)

cpu_timer_set

> +{
> +#define CPU_TIMER_US_SHIFT 12

The clock and the timer use the same shift so maybe we can rename or 
reuse time.h constants?

We could rename STCK_SHIFT_US to TIMING_S390_SHIFT_US since we need that 
for the TOD, todcmp and cputimer.
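
The conversion that shift performs can be sketched as plain arithmetic.
The shift value 12 is taken from CPU_TIMER_US_SHIFT in the patch; the
helper name ms_to_cpu_timer_units is hypothetical.

```shell
#!/bin/bash
# Shift taken from CPU_TIMER_US_SHIFT in the patch: one microsecond
# corresponds to 1 << 12 timer units.
CPU_TIMER_US_SHIFT=12

# Hypothetical helper: convert milliseconds to CPU timer units.
ms_to_cpu_timer_units ()
{
	echo $(( ($1 * 1000) << CPU_TIMER_US_SHIFT ))
}

ms_to_cpu_timer_units 1    # prints 4096000
```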

> +	int64_t timer_value = (timeout_ms * 1000) << CPU_TIMER_US_SHIFT;
> +	asm volatile (
> +		"spt %[timer_value]\n"
> +		:
> +		: [timer_value] "Q" (timer_value)
> +	);
> +}
> +
> +int main(void)
> +{
> +	struct psw ext_new_psw_orig;
> +
> +	report_prefix_push("extint-loop");

This is a QEMU only test so I think we should fence other hypervisors.

> +
> +	ext_new_psw_orig = lowcore.ext_new_psw;
> +	lowcore.ext_new_psw.addr = (uint64_t)ext_int_handler;
> +	lowcore.ext_new_psw.mask |= PSW_MASK_EXT;
> +
> +	load_psw_mask(extract_psw_mask() | PSW_MASK_EXT);
> +	ctl_set_bit(0, CTL0_CLOCK_COMPARATOR);
> +
> +	start_cpu_timer(1);
> +
> +	mdelay(2000);
> +
> +	/* restore previous ext_new_psw so QEMU can properly terminate */
> +	lowcore.ext_new_psw = ext_new_psw_orig;
> +
> +	report_fail("survived extint loop");
> +
> +	report_prefix_pop();
> +	return report_summary();
> +}
> diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg
> index 8e52f560bb1e..7d408f2d5310 100644
> --- a/s390x/unittests.cfg
> +++ b/s390x/unittests.cfg
> @@ -184,3 +184,7 @@ groups = migration
>   [migration-skey]
>   file = migration-skey.elf
>   groups = migration
> +
> +[extint-loop]
> +file = extint-loop.elf
> +groups = panic



* Re: [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt loop test
  2022-06-30 11:30 ` [kvm-unit-tests PATCH v1 3/3] s390x: add pgm spec interrupt " Nico Boehr
  2022-06-30 14:38   ` Janis Schoetterl-Glausch
  2022-06-30 17:25   ` Thomas Huth
@ 2022-07-04  9:06   ` Janosch Frank
  2 siblings, 0 replies; 15+ messages in thread
From: Janosch Frank @ 2022-07-04  9:06 UTC
  To: Nico Boehr, kvm, linux-s390; +Cc: imbrenda, thuth

On 6/30/22 13:30, Nico Boehr wrote:
> An invalid PSW causes a program interrupt. When an invalid PSW is
> introduced in the pgm_new_psw, an interrupt loop occurs as soon as a
> program interrupt is caused.
> 
> QEMU should detect that and panick the guest, hence add a test for it.
> 
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>

Test is fine but the same general comments as in patch #2 apply.

> ---
>   s390x/Makefile      |  1 +
>   s390x/pgmint-loop.c | 46 +++++++++++++++++++++++++++++++++++++++++++++
>   s390x/unittests.cfg |  4 ++++
>   3 files changed, 51 insertions(+)
>   create mode 100644 s390x/pgmint-loop.c
> 
> diff --git a/s390x/Makefile b/s390x/Makefile
> index 92a020234c9f..a600dbfb3f4c 100644
> --- a/s390x/Makefile
> +++ b/s390x/Makefile
> @@ -35,6 +35,7 @@ tests += $(TEST_DIR)/pv-attest.elf
>   tests += $(TEST_DIR)/migration-cmm.elf
>   tests += $(TEST_DIR)/migration-skey.elf
>   tests += $(TEST_DIR)/extint-loop.elf
> +tests += $(TEST_DIR)/pgmint-loop.elf
>   
>   pv-tests += $(TEST_DIR)/pv-diags.elf
>   
> diff --git a/s390x/pgmint-loop.c b/s390x/pgmint-loop.c
> new file mode 100644
> index 000000000000..5b74f26dbc3d
> --- /dev/null
> +++ b/s390x/pgmint-loop.c
> @@ -0,0 +1,46 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Program interrupt loop test
> + *
> + * Copyright IBM Corp. 2022
> + *
> + * Authors:
> + *  Nico Boehr <nrb@linux.ibm.com>
> + */
> +#include <libcflat.h>
> +#include <bitops.h>
> +#include <asm/interrupt.h>
> +#include <asm/barrier.h>
> +
> +static void pgm_int_handler(void)
> +{
> +	/*
> +	 * return to pgm_old_psw. This gives us the chance to print the return_fail
> +	 * in case something goes wrong.
> +	 */
> +	asm volatile (
> +		"lpswe %[pgm_old_psw]\n"
> +		:
> +		: [pgm_old_psw] "Q"(lowcore.pgm_old_psw)
> +		: "memory"
> +	);
> +}
> +
> +int main(void)
> +{
> +	report_prefix_push("pgmint-loop");
> +
> +	lowcore.pgm_new_psw.addr = (uint64_t) pgm_int_handler;
> +	/* bit 12 set is invalid */
> +	lowcore.pgm_new_psw.mask = extract_psw_mask() | BIT(63 - 12);
> +	mb();
> +
> +	/* cause a pgm int */
> +	*((int *)-4) = 0x42;
> +	mb();
> +
> +	report_fail("survived pgmint loop");
> +
> +	report_prefix_pop();
> +	return report_summary();
> +}
> diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg
> index 7d408f2d5310..c3073bfc4363 100644
> --- a/s390x/unittests.cfg
> +++ b/s390x/unittests.cfg
> @@ -188,3 +188,7 @@ groups = migration
>   [extint-loop]
>   file = extint-loop.elf
>   groups = panic
> +
> +[pgmint-loop]
> +file = pgmint-loop.elf
> +groups = panic

