All of lore.kernel.org
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH v4 0/2] arm: Add PSCI CPU_OFF test
@ 2023-01-18 14:49 Alexandru Elisei
  2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online Alexandru Elisei
  2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case Alexandru Elisei
  0 siblings, 2 replies; 11+ messages in thread
From: Alexandru Elisei @ 2023-01-18 14:49 UTC (permalink / raw)
  To: andrew.jones, kvm, kvmarm

The series adds a test for the PSCI function CPU_OFF. The test is
implemented in patch #2, "arm/psci: Add PSCI CPU_OFF test case".  Patch #1,
"arm/psci: Test that CPU 1 has been successfully brought online" is a
preparatory patch to allow the CPU_OFF test to run after the CPU_ON test.
Executing the CPU_OFF test after the CPU_ON test makes the most sense,
since CPU_OFF requires all the CPUs to be brought online before the test
can execute, and this is exactly what the CPU_ON test does.

I believe that proving that a CPU has been successfully offlined is an
undecidable problem - it might just be that the CPU is stuck at a higher
exception level doing something else entirely, and if the test were to wait
long enough, the CPU would return from the CPU_OFF call and start executing
code, thus failing to be offlined. Right now, the test waits for 1 second
before checking that all the other CPUs are offline. I thought this was a
good balance between making the test fast and being reasonably sure that
the offline succeeded. I'm open to suggestions here if anyone thinks
otherwise.

Tested for both the arm and arm64 architectures: on an x86 machine (qemu
with TCG), and an ampere emag (qemu and kvmtool).

And Nikita is not longer at Arm, and I don't have a new email address where
she can be reached, so I didn't CC her.

Changelog:

v3 -> v4:
- Moved ownership of the series to Alexandru Elisei. All bugs are mine.
- Moved the timeout for the CPU_ON test to patch #1.
- Changed the include order for arm/psic.c in patch #1 to order the headers
  alphabetically.
- Run the CPU_OFF test only if CPU_ON succeeds, because CPU_OFF expects all
  CPUs to be online, the test would otherwise hang forever.
- Minor style changes here and there.

v2 -> v3:
- Add timeout so that test does not hang if CPU1 fails to come online
- Remove unnecessary call of on_cpus() in the condition where target CPU is not online.

v1 -> v2:
- Modify PSCI CPU_ON test to ensure CPU 1 remains online after the execution of the test.
- Addition of PSCI CPU_OFF test and calling it after PSCI CPU_ON test has been executed.

Alexandru Elisei (1):
  arm/psci: Test that CPU 1 has been successfully brought online

Nikita Venkatesh (1):
  arm/psci: Add PSCI CPU_OFF test case

 arm/psci.c        | 132 +++++++++++++++++++++++++++++++++++++++-------
 lib/arm/asm/smp.h |   1 +
 lib/arm/smp.c     |  12 +++--
 lib/errata.h      |   2 +
 4 files changed, 125 insertions(+), 22 deletions(-)


base-commit: 2480430a36102f8ea276b3bfb1d64d5dacc23b8f
-- 
2.25.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online
  2023-01-18 14:49 [kvm-unit-tests PATCH v4 0/2] arm: Add PSCI CPU_OFF test Alexandru Elisei
@ 2023-01-18 14:49 ` Alexandru Elisei
  2023-01-18 18:35   ` Andrew Jones
  2023-01-19 11:46   ` Andrew Jones
  2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case Alexandru Elisei
  1 sibling, 2 replies; 11+ messages in thread
From: Alexandru Elisei @ 2023-01-18 14:49 UTC (permalink / raw)
  To: andrew.jones, kvm, kvmarm

For the PSCI CPU_ON function test, all other CPUs perform a CPU_ON call
that target CPU 1. The test is considered a success if CPU_ON returns PSCI
SUCCESS exactly once, and for the rest of the calls PSCI ALREADY_ON.

Enhance the test by checking that CPU 1 is actually online and able to
execute code. Also make the test more robust by checking that the CPU_ON
call returns, instead of assuming that it will always succeed and
hanging indefinitely if it doesn't.

Since the CPU 1 thread is now being set up properly by kvm-unit-tests
when being brought online, it becomes possible to add other tests in the
future that require all CPUs.

The include header order in arm/psci.c has been changed to be in
alphabetic order. This means moving the errata.h include before
libcflat.h, which causes compilation to fail because of missing includes
in errata.h. Fix that also by including the needed header in errata.h.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/psci.c        | 60 ++++++++++++++++++++++++++++++++++++-----------
 lib/arm/asm/smp.h |  1 +
 lib/arm/smp.c     | 12 +++++++---
 lib/errata.h      |  2 ++
 4 files changed, 58 insertions(+), 17 deletions(-)

diff --git a/arm/psci.c b/arm/psci.c
index efa0722c0566..e96be941953b 100644
--- a/arm/psci.c
+++ b/arm/psci.c
@@ -7,11 +7,13 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
-#include <libcflat.h>
 #include <errata.h>
+#include <libcflat.h>
+
+#include <asm/delay.h>
 #include <asm/processor.h>
-#include <asm/smp.h>
 #include <asm/psci.h>
+#include <asm/smp.h>
 
 static bool invalid_function_exception;
 
@@ -72,14 +74,23 @@ static int cpu_on_ret[NR_CPUS];
 static cpumask_t cpu_on_ready, cpu_on_done;
 static volatile int cpu_on_start;
 
-static void cpu_on_secondary_entry(void)
+extern void secondary_entry(void);
+static void cpu_on_do_wake_target(void)
 {
 	int cpu = smp_processor_id();
 
 	cpumask_set_cpu(cpu, &cpu_on_ready);
 	while (!cpu_on_start)
 		cpu_relax();
-	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(halt));
+	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(secondary_entry));
+	cpumask_set_cpu(cpu, &cpu_on_done);
+}
+
+static void cpu_on_target(void)
+{
+	int cpu = smp_processor_id();
+
+	cpu_on_ret[cpu] = PSCI_RET_ALREADY_ON;
 	cpumask_set_cpu(cpu, &cpu_on_done);
 }
 
@@ -87,33 +98,53 @@ static bool psci_cpu_on_test(void)
 {
 	bool failed = false;
 	int ret_success = 0;
-	int cpu;
-
-	cpumask_set_cpu(1, &cpu_on_ready);
-	cpumask_set_cpu(1, &cpu_on_done);
+	int i, cpu;
 
 	for_each_present_cpu(cpu) {
 		if (cpu < 2)
 			continue;
-		smp_boot_secondary(cpu, cpu_on_secondary_entry);
+		smp_boot_secondary(cpu, cpu_on_do_wake_target);
 	}
 
 	cpumask_set_cpu(0, &cpu_on_ready);
+	cpumask_set_cpu(1, &cpu_on_ready);
 	while (!cpumask_full(&cpu_on_ready))
 		cpu_relax();
 
+	/*
+	 * Wait for all other CPUs to be online before configuring the thread
+	 * for the target CPU, as all secondaries are set up using the same
+	 * global variable.
+	 */
+	smp_prepare_secondary(1, cpu_on_target);
+
 	cpu_on_start = 1;
 	smp_mb();
 
-	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(halt));
+	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(secondary_entry));
 	cpumask_set_cpu(0, &cpu_on_done);
 
-	while (!cpumask_full(&cpu_on_done))
-		cpu_relax();
+	report_info("waiting for CPU1 to come online...");
+	for (i = 0; i < 10; i++) {
+		mdelay(100);
+		if (cpumask_full(&cpu_on_done))
+			break;
+	}
+
+	if (!cpumask_full(&cpu_on_done)) {
+		for_each_present_cpu(cpu) {
+			if (!cpumask_test_cpu(cpu, &cpu_on_done)) {
+				if (cpu == 1)
+					report_info("CPU1 failed to come online");
+				else
+					report_info("CPU%d failed to online CPU1", cpu);
+			}
+		}
+		failed = true;
+		goto out;
+	}
 
 	for_each_present_cpu(cpu) {
-		if (cpu == 1)
-			continue;
 		if (cpu_on_ret[cpu] == PSCI_RET_SUCCESS) {
 			ret_success++;
 		} else if (cpu_on_ret[cpu] != PSCI_RET_ALREADY_ON) {
@@ -127,6 +158,7 @@ static bool psci_cpu_on_test(void)
 		failed = true;
 	}
 
+out:
 	return !failed;
 }
 
diff --git a/lib/arm/asm/smp.h b/lib/arm/asm/smp.h
index 077afde85520..ff2ef8f88247 100644
--- a/lib/arm/asm/smp.h
+++ b/lib/arm/asm/smp.h
@@ -49,6 +49,7 @@ static inline void set_cpu_idle(int cpu, bool idle)
 }
 
 typedef void (*secondary_entry_fn)(void);
+extern void smp_prepare_secondary(int cpu, secondary_entry_fn entry);
 extern void smp_boot_secondary(int cpu, secondary_entry_fn entry);
 extern void on_cpu_async(int cpu, void (*func)(void *data), void *data);
 extern void on_cpu(int cpu, void (*func)(void *data), void *data);
diff --git a/lib/arm/smp.c b/lib/arm/smp.c
index 98a5054e039b..947f417f4aea 100644
--- a/lib/arm/smp.c
+++ b/lib/arm/smp.c
@@ -58,13 +58,19 @@ secondary_entry_fn secondary_cinit(void)
 	return entry;
 }
 
-static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
+void smp_prepare_secondary(int cpu, secondary_entry_fn entry)
 {
-	int ret;
-
 	secondary_data.stack = thread_stack_alloc();
 	secondary_data.entry = entry;
 	mmu_mark_disabled(cpu);
+}
+
+static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
+{
+	int ret;
+
+	smp_prepare_secondary(cpu, entry);
+
 	ret = cpu_psci_cpu_boot(cpu);
 	assert(ret == 0);
 
diff --git a/lib/errata.h b/lib/errata.h
index 5af0eb3bf8e2..de8205d8b370 100644
--- a/lib/errata.h
+++ b/lib/errata.h
@@ -6,6 +6,8 @@
  */
 #ifndef _ERRATA_H_
 #define _ERRATA_H_
+#include <libcflat.h>
+
 #include "config.h"
 
 #ifndef CONFIG_ERRATA_FORCE
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case
  2023-01-18 14:49 [kvm-unit-tests PATCH v4 0/2] arm: Add PSCI CPU_OFF test Alexandru Elisei
  2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online Alexandru Elisei
@ 2023-01-18 14:49 ` Alexandru Elisei
  2023-01-18 18:48   ` Andrew Jones
  1 sibling, 1 reply; 11+ messages in thread
From: Alexandru Elisei @ 2023-01-18 14:49 UTC (permalink / raw)
  To: andrew.jones, kvm, kvmarm

From: Nikita Venkatesh <Nikita.Venkatesh@arm.com>

The test uses the following method.

The primary CPU brings up all the secondary CPUs, which are held in a wait
loop. Once the primary releases the CPUs, each of the secondary CPUs
proceed to issue PSCI_CPU_OFF. This is indicated by a cpumask and also the
status of the call is updated by the secondary CPU in cpu_off_done[].

The primary CPU waits for all the secondary CPUs to update the cpumask and
then proceeds to check for the status of the individual CPU CPU_OFF
request. There is a chance that some CPUs might fail at the CPU_OFF request
and come back and update the status once the primary CPU has finished the
scan. There is no fool proof method to handle this. As of now, we add a
1sec delay between the cpumask check and the scan for the status.

Signed-off-by: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
[ Alex E: Skip CPU_OFF test if CPU_ON failed ]
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/psci.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 66 insertions(+), 5 deletions(-)

diff --git a/arm/psci.c b/arm/psci.c
index e96be941953b..d045616bfcd4 100644
--- a/arm/psci.c
+++ b/arm/psci.c
@@ -15,6 +15,8 @@
 #include <asm/psci.h>
 #include <asm/smp.h>
 
+#define CPU_OFF_TEST_WAIT_TIME 1000
+
 static bool invalid_function_exception;
 
 #ifdef __arm__
@@ -71,8 +73,10 @@ static bool psci_affinity_info_off(void)
 }
 
 static int cpu_on_ret[NR_CPUS];
-static cpumask_t cpu_on_ready, cpu_on_done;
+static bool cpu_off_success[NR_CPUS];
+static cpumask_t cpu_on_ready, cpu_on_done, cpu_off_done;
 static volatile int cpu_on_start;
+static volatile int cpu_off_start;
 
 extern void secondary_entry(void);
 static void cpu_on_do_wake_target(void)
@@ -94,6 +98,20 @@ static void cpu_on_target(void)
 	cpumask_set_cpu(cpu, &cpu_on_done);
 }
 
+static void cpu_off_secondary_entry(void *data)
+{
+	int cpu = smp_processor_id();
+
+	while (!cpu_off_start)
+		cpu_relax();
+	/* On to the CPU off test */
+	cpu_off_success[cpu] = true;
+	cpumask_set_cpu(cpu, &cpu_off_done);
+	cpu_psci_cpu_die();
+	/* The CPU shouldn't execute the next steps. */
+	cpu_off_success[cpu] = false;
+}
+
 static bool psci_cpu_on_test(void)
 {
 	bool failed = false;
@@ -162,9 +180,45 @@ out:
 	return !failed;
 }
 
-int main(void)
+static bool psci_cpu_off_test(void)
+{
+	bool failed = false;
+	int cpu;
+
+	for_each_present_cpu(cpu) {
+		if (cpu == 0)
+			continue;
+		on_cpu_async(cpu, cpu_off_secondary_entry, NULL);
+	}
+
+	cpumask_set_cpu(0, &cpu_off_done);
+
+	report_info("starting CPU_OFF test...");
+
+	cpu_off_start = 1;
+	while (!cpumask_full(&cpu_off_done))
+		cpu_relax();
+
+	/* Allow all the other CPUs to complete the operation */
+	mdelay(CPU_OFF_TEST_WAIT_TIME);
+
+	for_each_present_cpu(cpu) {
+		if (cpu == 0)
+			continue;
+
+		if (!cpu_off_success[cpu]) {
+			report_info("CPU%d could not be turned off", cpu);
+			failed = true;
+		}
+	}
+
+	return !failed;
+}
+
+int main(int argc, char **argv)
 {
 	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
+	bool cpu_on_success = true;
 
 	report_prefix_push("psci");
 
@@ -179,10 +233,17 @@ int main(void)
 	report(psci_affinity_info_on(), "affinity-info-on");
 	report(psci_affinity_info_off(), "affinity-info-off");
 
-	if (ERRATA(6c7a5dce22b3))
-		report(psci_cpu_on_test(), "cpu-on");
-	else
+	if (ERRATA(6c7a5dce22b3)) {
+		cpu_on_success = psci_cpu_on_test();
+		report(cpu_on_success, "cpu-on");
+	} else {
 		report_skip("Skipping unsafe cpu-on test. Set ERRATA_6c7a5dce22b3=y to enable.");
+	}
+
+	if (!cpu_on_success)
+		report_skip("Skipping cpu-off test because the cpu-on test failed");
+	else
+		report(psci_cpu_off_test(), "cpu-off");
 
 done:
 #if 0
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online
  2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online Alexandru Elisei
@ 2023-01-18 18:35   ` Andrew Jones
  2023-01-19 10:32     ` Alexandru Elisei
  2023-01-19 11:46   ` Andrew Jones
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Jones @ 2023-01-18 18:35 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, kvmarm

On Wed, Jan 18, 2023 at 02:49:11PM +0000, Alexandru Elisei wrote:
> For the PSCI CPU_ON function test, all other CPUs perform a CPU_ON call
> that target CPU 1. The test is considered a success if CPU_ON returns PSCI
> SUCCESS exactly once, and for the rest of the calls PSCI ALREADY_ON.
> 
> Enhance the test by checking that CPU 1 is actually online and able to
> execute code. Also make the test more robust by checking that the CPU_ON
> call returns, instead of assuming that it will always succeed and
> hanging indefinitely if it doesn't.
> 
> Since the CPU 1 thread is now being set up properly by kvm-unit-tests
> when being brought online, it becomes possible to add other tests in the
> future that require all CPUs.
> 
> The include header order in arm/psci.c has been changed to be in
> alphabetic order. This means moving the errata.h include before
> libcflat.h, which causes compilation to fail because of missing includes
> in errata.h. Fix that also by including the needed header in errata.h.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arm/psci.c        | 60 ++++++++++++++++++++++++++++++++++++-----------
>  lib/arm/asm/smp.h |  1 +
>  lib/arm/smp.c     | 12 +++++++---
>  lib/errata.h      |  2 ++
>  4 files changed, 58 insertions(+), 17 deletions(-)
> 
> diff --git a/arm/psci.c b/arm/psci.c
> index efa0722c0566..e96be941953b 100644
> --- a/arm/psci.c
> +++ b/arm/psci.c
> @@ -7,11 +7,13 @@
>   *
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
> -#include <libcflat.h>
>  #include <errata.h>
> +#include <libcflat.h>
> +
> +#include <asm/delay.h>
>  #include <asm/processor.h>
> -#include <asm/smp.h>
>  #include <asm/psci.h>
> +#include <asm/smp.h>
>  
>  static bool invalid_function_exception;
>  
> @@ -72,14 +74,23 @@ static int cpu_on_ret[NR_CPUS];
>  static cpumask_t cpu_on_ready, cpu_on_done;
>  static volatile int cpu_on_start;
>  
> -static void cpu_on_secondary_entry(void)
> +extern void secondary_entry(void);
> +static void cpu_on_do_wake_target(void)
>  {
>  	int cpu = smp_processor_id();
>  
>  	cpumask_set_cpu(cpu, &cpu_on_ready);
>  	while (!cpu_on_start)
>  		cpu_relax();
> -	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(halt));
> +	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(secondary_entry));
> +	cpumask_set_cpu(cpu, &cpu_on_done);
> +}
> +
> +static void cpu_on_target(void)
> +{
> +	int cpu = smp_processor_id();
> +
> +	cpu_on_ret[cpu] = PSCI_RET_ALREADY_ON;

I'm not sure this is better than just skipping cpu1 in the check loop, as
is done now, but OK. 

>  	cpumask_set_cpu(cpu, &cpu_on_done);
>  }
>  
> @@ -87,33 +98,53 @@ static bool psci_cpu_on_test(void)
>  {
>  	bool failed = false;
>  	int ret_success = 0;
> -	int cpu;
> -
> -	cpumask_set_cpu(1, &cpu_on_ready);
> -	cpumask_set_cpu(1, &cpu_on_done);
> +	int i, cpu;
>  
>  	for_each_present_cpu(cpu) {
>  		if (cpu < 2)
>  			continue;
> -		smp_boot_secondary(cpu, cpu_on_secondary_entry);
> +		smp_boot_secondary(cpu, cpu_on_do_wake_target);
>  	}
>  
>  	cpumask_set_cpu(0, &cpu_on_ready);
> +	cpumask_set_cpu(1, &cpu_on_ready);
>  	while (!cpumask_full(&cpu_on_ready))
>  		cpu_relax();
>  
> +	/*
> +	 * Wait for all other CPUs to be online before configuring the thread
> +	 * for the target CPU, as all secondaries are set up using the same
> +	 * global variable.
> +	 */

This comment says "Wait", so I'd move the while loop under it.

> +	smp_prepare_secondary(1, cpu_on_target);

This new smp_prepare_secondary() function should be local to this unit
test, please see my justification below.

> +
>  	cpu_on_start = 1;
>  	smp_mb();
>  
> -	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(halt));
> +	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(secondary_entry));
>  	cpumask_set_cpu(0, &cpu_on_done);
>  
> -	while (!cpumask_full(&cpu_on_done))
> -		cpu_relax();
> +	report_info("waiting for CPU1 to come online...");
> +	for (i = 0; i < 10; i++) {
> +		mdelay(100);
> +		if (cpumask_full(&cpu_on_done))
> +			break;
> +	}
> +
> +	if (!cpumask_full(&cpu_on_done)) {
> +		for_each_present_cpu(cpu) {
> +			if (!cpumask_test_cpu(cpu, &cpu_on_done)) {
> +				if (cpu == 1)
> +					report_info("CPU1 failed to come online");
> +				else
> +					report_info("CPU%d failed to online CPU1", cpu);
> +			}
> +		}
> +		failed = true;
> +		goto out;

We could still run the other checks below, perhaps guarded with
cpumask_test_cpu(cpu, &cpu_on_done), rather than bail early. I'm
also OK with bailing early though. But, for that, I'd just return
false right here rather than introduce the goto.

> +	}
>  
>  	for_each_present_cpu(cpu) {
> -		if (cpu == 1)
> -			continue;
>  		if (cpu_on_ret[cpu] == PSCI_RET_SUCCESS) {
>  			ret_success++;
>  		} else if (cpu_on_ret[cpu] != PSCI_RET_ALREADY_ON) {
> @@ -127,6 +158,7 @@ static bool psci_cpu_on_test(void)
>  		failed = true;
>  	}
>  
> +out:
>  	return !failed;
>  }
>  
> diff --git a/lib/arm/asm/smp.h b/lib/arm/asm/smp.h
> index 077afde85520..ff2ef8f88247 100644
> --- a/lib/arm/asm/smp.h
> +++ b/lib/arm/asm/smp.h
> @@ -49,6 +49,7 @@ static inline void set_cpu_idle(int cpu, bool idle)
>  }
>  
>  typedef void (*secondary_entry_fn)(void);
> +extern void smp_prepare_secondary(int cpu, secondary_entry_fn entry);
>  extern void smp_boot_secondary(int cpu, secondary_entry_fn entry);
>  extern void on_cpu_async(int cpu, void (*func)(void *data), void *data);
>  extern void on_cpu(int cpu, void (*func)(void *data), void *data);
> diff --git a/lib/arm/smp.c b/lib/arm/smp.c
> index 98a5054e039b..947f417f4aea 100644
> --- a/lib/arm/smp.c
> +++ b/lib/arm/smp.c
> @@ -58,13 +58,19 @@ secondary_entry_fn secondary_cinit(void)
>  	return entry;
>  }
>  
> -static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
> +void smp_prepare_secondary(int cpu, secondary_entry_fn entry)

I'd rather not create an unsafe library function, especially one named
without the leading underscores. It's OK for a unit test to duplicate
__smp_boot_secondary() (secondary_data is available), but then the unit
test also needs to ensure it does its own synchronization, as you do
with the wait on cpu_on_ready already.

>  {
> -	int ret;
> -
>  	secondary_data.stack = thread_stack_alloc();
>  	secondary_data.entry = entry;
>  	mmu_mark_disabled(cpu);
> +}
> +
> +static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
> +{
> +	int ret;
> +
> +	smp_prepare_secondary(cpu, entry);
> +
>  	ret = cpu_psci_cpu_boot(cpu);
>  	assert(ret == 0);
>  
> diff --git a/lib/errata.h b/lib/errata.h
> index 5af0eb3bf8e2..de8205d8b370 100644
> --- a/lib/errata.h
> +++ b/lib/errata.h
> @@ -6,6 +6,8 @@
>   */
>  #ifndef _ERRATA_H_
>  #define _ERRATA_H_
> +#include <libcflat.h>
> +
>  #include "config.h"
>  
>  #ifndef CONFIG_ERRATA_FORCE
> -- 
> 2.25.1
>

Thanks,
drew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case
  2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case Alexandru Elisei
@ 2023-01-18 18:48   ` Andrew Jones
  2023-01-19 10:35     ` Alexandru Elisei
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Jones @ 2023-01-18 18:48 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, kvmarm

On Wed, Jan 18, 2023 at 02:49:12PM +0000, Alexandru Elisei wrote:
> From: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
> 
> The test uses the following method.
> 
> The primary CPU brings up all the secondary CPUs, which are held in a wait
> loop. Once the primary releases the CPUs, each of the secondary CPUs
> proceed to issue PSCI_CPU_OFF. This is indicated by a cpumask and also the
> status of the call is updated by the secondary CPU in cpu_off_done[].
> 
> The primary CPU waits for all the secondary CPUs to update the cpumask and
> then proceeds to check for the status of the individual CPU CPU_OFF
> request. There is a chance that some CPUs might fail at the CPU_OFF request
> and come back and update the status once the primary CPU has finished the
> scan. There is no fool proof method to handle this. As of now, we add a
> 1sec delay between the cpumask check and the scan for the status.
> 
> Signed-off-by: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
> [ Alex E: Skip CPU_OFF test if CPU_ON failed ]
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arm/psci.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 66 insertions(+), 5 deletions(-)
> 
> diff --git a/arm/psci.c b/arm/psci.c
> index e96be941953b..d045616bfcd4 100644
> --- a/arm/psci.c
> +++ b/arm/psci.c
> @@ -15,6 +15,8 @@
>  #include <asm/psci.h>
>  #include <asm/smp.h>
>  
> +#define CPU_OFF_TEST_WAIT_TIME 1000
> +
>  static bool invalid_function_exception;
>  
>  #ifdef __arm__
> @@ -71,8 +73,10 @@ static bool psci_affinity_info_off(void)
>  }
>  
>  static int cpu_on_ret[NR_CPUS];
> -static cpumask_t cpu_on_ready, cpu_on_done;
> +static bool cpu_off_success[NR_CPUS];
> +static cpumask_t cpu_on_ready, cpu_on_done, cpu_off_done;
>  static volatile int cpu_on_start;
> +static volatile int cpu_off_start;
>  
>  extern void secondary_entry(void);
>  static void cpu_on_do_wake_target(void)
> @@ -94,6 +98,20 @@ static void cpu_on_target(void)
>  	cpumask_set_cpu(cpu, &cpu_on_done);
>  }
>  
> +static void cpu_off_secondary_entry(void *data)
> +{
> +	int cpu = smp_processor_id();
> +
> +	while (!cpu_off_start)
> +		cpu_relax();
> +	/* On to the CPU off test */
> +	cpu_off_success[cpu] = true;
> +	cpumask_set_cpu(cpu, &cpu_off_done);
> +	cpu_psci_cpu_die();
> +	/* The CPU shouldn't execute the next steps. */
> +	cpu_off_success[cpu] = false;
> +}
> +
>  static bool psci_cpu_on_test(void)
>  {
>  	bool failed = false;
> @@ -162,9 +180,45 @@ out:
>  	return !failed;
>  }
>  
> -int main(void)
> +static bool psci_cpu_off_test(void)
> +{
> +	bool failed = false;
> +	int cpu;
> +
> +	for_each_present_cpu(cpu) {
> +		if (cpu == 0)
> +			continue;
> +		on_cpu_async(cpu, cpu_off_secondary_entry, NULL);
> +	}
> +
> +	cpumask_set_cpu(0, &cpu_off_done);

Since we're setting cpu_off_done for cpu0, then we could also set
cpu_off_success[0] = true and not have to skip it in the check loop
below.

> +
> +	report_info("starting CPU_OFF test...");
> +
> +	cpu_off_start = 1;
> +	while (!cpumask_full(&cpu_off_done))
> +		cpu_relax();
> +
> +	/* Allow all the other CPUs to complete the operation */
> +	mdelay(CPU_OFF_TEST_WAIT_TIME);

Don't really need the define, just the numbers work for stuff
like this, but OK.

> +
> +	for_each_present_cpu(cpu) {
> +		if (cpu == 0)
> +			continue;
> +
> +		if (!cpu_off_success[cpu]) {
> +			report_info("CPU%d could not be turned off", cpu);
> +			failed = true;
> +		}
> +	}
> +
> +	return !failed;
> +}
> +
> +int main(int argc, char **argv)
>  {
>  	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> +	bool cpu_on_success = true;
>  
>  	report_prefix_push("psci");
>  
> @@ -179,10 +233,17 @@ int main(void)
>  	report(psci_affinity_info_on(), "affinity-info-on");
>  	report(psci_affinity_info_off(), "affinity-info-off");
>  
> -	if (ERRATA(6c7a5dce22b3))
> -		report(psci_cpu_on_test(), "cpu-on");
> -	else
> +	if (ERRATA(6c7a5dce22b3)) {
> +		cpu_on_success = psci_cpu_on_test();
> +		report(cpu_on_success, "cpu-on");
> +	} else {
>  		report_skip("Skipping unsafe cpu-on test. Set ERRATA_6c7a5dce22b3=y to enable.");
> +	}
> +
> +	if (!cpu_on_success)
> +		report_skip("Skipping cpu-off test because the cpu-on test failed");
> +	else
> +		report(psci_cpu_off_test(), "cpu-off");
>  
>  done:
>  #if 0
> -- 
> 2.25.1
> 

Besides the nits,

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

Thanks,
drew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online
  2023-01-18 18:35   ` Andrew Jones
@ 2023-01-19 10:32     ` Alexandru Elisei
  2023-01-19 11:01       ` Andrew Jones
  0 siblings, 1 reply; 11+ messages in thread
From: Alexandru Elisei @ 2023-01-19 10:32 UTC (permalink / raw)
  To: Andrew Jones; +Cc: kvm, kvmarm

Hi Drew,

Thanks for the quick review!

On Wed, Jan 18, 2023 at 07:35:12PM +0100, Andrew Jones wrote:
> On Wed, Jan 18, 2023 at 02:49:11PM +0000, Alexandru Elisei wrote:
> > For the PSCI CPU_ON function test, all other CPUs perform a CPU_ON call
> > that target CPU 1. The test is considered a success if CPU_ON returns PSCI
> > SUCCESS exactly once, and for the rest of the calls PSCI ALREADY_ON.
> > 
> > Enhance the test by checking that CPU 1 is actually online and able to
> > execute code. Also make the test more robust by checking that the CPU_ON
> > call returns, instead of assuming that it will always succeed and
> > hanging indefinitely if it doesn't.
> > 
> > Since the CPU 1 thread is now being set up properly by kvm-unit-tests
> > when being brought online, it becomes possible to add other tests in the
> > future that require all CPUs.
> > 
> > The include header order in arm/psci.c has been changed to be in
> > alphabetic order. This means moving the errata.h include before
> > libcflat.h, which causes compilation to fail because of missing includes
> > in errata.h. Fix that also by including the needed header in errata.h.
> > 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  arm/psci.c        | 60 ++++++++++++++++++++++++++++++++++++-----------
> >  lib/arm/asm/smp.h |  1 +
> >  lib/arm/smp.c     | 12 +++++++---
> >  lib/errata.h      |  2 ++
> >  4 files changed, 58 insertions(+), 17 deletions(-)
> > 
> > diff --git a/arm/psci.c b/arm/psci.c
> > index efa0722c0566..e96be941953b 100644
> > --- a/arm/psci.c
> > +++ b/arm/psci.c
> > @@ -7,11 +7,13 @@
> >   *
> >   * This work is licensed under the terms of the GNU LGPL, version 2.
> >   */
> > -#include <libcflat.h>
> >  #include <errata.h>
> > +#include <libcflat.h>
> > +
> > +#include <asm/delay.h>
> >  #include <asm/processor.h>
> > -#include <asm/smp.h>
> >  #include <asm/psci.h>
> > +#include <asm/smp.h>
> >  
> >  static bool invalid_function_exception;
> >  
> > @@ -72,14 +74,23 @@ static int cpu_on_ret[NR_CPUS];
> >  static cpumask_t cpu_on_ready, cpu_on_done;
> >  static volatile int cpu_on_start;
> >  
> > -static void cpu_on_secondary_entry(void)
> > +extern void secondary_entry(void);
> > +static void cpu_on_do_wake_target(void)
> >  {
> >  	int cpu = smp_processor_id();
> >  
> >  	cpumask_set_cpu(cpu, &cpu_on_ready);
> >  	while (!cpu_on_start)
> >  		cpu_relax();
> > -	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(halt));
> > +	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(secondary_entry));
> > +	cpumask_set_cpu(cpu, &cpu_on_done);
> > +}
> > +
> > +static void cpu_on_target(void)
> > +{
> > +	int cpu = smp_processor_id();
> > +
> > +	cpu_on_ret[cpu] = PSCI_RET_ALREADY_ON;
> 
> I'm not sure this is better than just skipping cpu1 in the check loop, as
> is done now, but OK. 

I think that's a good idea, I'll remove this and skip CPU 1 in the check
loop, because CPU 1 never invokes psci_cpu_on for itself.

> 
> >  	cpumask_set_cpu(cpu, &cpu_on_done);
> >  }
> >  
> > @@ -87,33 +98,53 @@ static bool psci_cpu_on_test(void)
> >  {
> >  	bool failed = false;
> >  	int ret_success = 0;
> > -	int cpu;
> > -
> > -	cpumask_set_cpu(1, &cpu_on_ready);
> > -	cpumask_set_cpu(1, &cpu_on_done);
> > +	int i, cpu;
> >  
> >  	for_each_present_cpu(cpu) {
> >  		if (cpu < 2)
> >  			continue;
> > -		smp_boot_secondary(cpu, cpu_on_secondary_entry);
> > +		smp_boot_secondary(cpu, cpu_on_do_wake_target);
> >  	}
> >  
> >  	cpumask_set_cpu(0, &cpu_on_ready);
> > +	cpumask_set_cpu(1, &cpu_on_ready);
> >  	while (!cpumask_full(&cpu_on_ready))
> >  		cpu_relax();
> >  
> > +	/*
> > +	 * Wait for all other CPUs to be online before configuring the thread
> > +	 * for the target CPU, as all secondaries are set up using the same
> > +	 * global variable.
> > +	 */
> 
> This comment says "Wait", so I'd move the while loop under it.

Can I reword it to keep it here?

	/*
	 * Configure CPU 1 after all the secondaries are online to avoid
	 * secondary_data being overwritten.
	 */

What do you think?

> 
> > +	smp_prepare_secondary(1, cpu_on_target);
> 
> This new smp_prepare_secondary() function should be local to this unit
> test, please see my justification below.

Reply below.

> 
> > +
> >  	cpu_on_start = 1;
> >  	smp_mb();
> >  
> > -	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(halt));
> > +	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(secondary_entry));
> >  	cpumask_set_cpu(0, &cpu_on_done);
> >  
> > -	while (!cpumask_full(&cpu_on_done))
> > -		cpu_relax();
> > +	report_info("waiting for CPU1 to come online...");
> > +	for (i = 0; i < 10; i++) {
> > +		mdelay(100);
> > +		if (cpumask_full(&cpu_on_done))
> > +			break;
> > +	}
> > +
> > +	if (!cpumask_full(&cpu_on_done)) {
> > +		for_each_present_cpu(cpu) {
> > +			if (!cpumask_test_cpu(cpu, &cpu_on_done)) {
> > +				if (cpu == 1)
> > +					report_info("CPU1 failed to come online");
> > +				else
> > +					report_info("CPU%d failed to online CPU1", cpu);
> > +			}
> > +		}
> > +		failed = true;
> > +		goto out;
> 
> We could still run the other checks below, perhaps guarded with
> cpumask_test_cpu(cpu, &cpu_on_done), rather than bail early. I'm
> also OK with bailing early though. But, for that, I'd just return
> false right here rather than introduce the goto.

I like the idea of checking the return value for the CPU_ON calls that have
returned (I'll use for_each_cpu(cpu, &cpu_on _done) in the for loop below).

> 
> > +	}
> >  
> >  	for_each_present_cpu(cpu) {
> > -		if (cpu == 1)
> > -			continue;
> >  		if (cpu_on_ret[cpu] == PSCI_RET_SUCCESS) {
> >  			ret_success++;
> >  		} else if (cpu_on_ret[cpu] != PSCI_RET_ALREADY_ON) {
> > @@ -127,6 +158,7 @@ static bool psci_cpu_on_test(void)
> >  		failed = true;
> >  	}
> >  
> > +out:
> >  	return !failed;
> >  }
> >  
> > diff --git a/lib/arm/asm/smp.h b/lib/arm/asm/smp.h
> > index 077afde85520..ff2ef8f88247 100644
> > --- a/lib/arm/asm/smp.h
> > +++ b/lib/arm/asm/smp.h
> > @@ -49,6 +49,7 @@ static inline void set_cpu_idle(int cpu, bool idle)
> >  }
> >  
> >  typedef void (*secondary_entry_fn)(void);
> > +extern void smp_prepare_secondary(int cpu, secondary_entry_fn entry);
> >  extern void smp_boot_secondary(int cpu, secondary_entry_fn entry);
> >  extern void on_cpu_async(int cpu, void (*func)(void *data), void *data);
> >  extern void on_cpu(int cpu, void (*func)(void *data), void *data);
> > diff --git a/lib/arm/smp.c b/lib/arm/smp.c
> > index 98a5054e039b..947f417f4aea 100644
> > --- a/lib/arm/smp.c
> > +++ b/lib/arm/smp.c
> > @@ -58,13 +58,19 @@ secondary_entry_fn secondary_cinit(void)
> >  	return entry;
> >  }
> >  
> > -static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
> > +void smp_prepare_secondary(int cpu, secondary_entry_fn entry)
> 
> I'd rather not create an unsafe library function, especially one named
> without the leading underscores. It's OK for a unit test to duplicate
> __smp_boot_secondary() (secondary_data is available), but then the unit
> test also needs to ensure it does its own synchronization, as you do
> with the wait on cpu_on_ready already.

I created the function to share the code between __smp_boot_secondary and
the PSCI test. Cache maintenance would have to be added here (one CPU
writes to secondary_data, another CPU reads from it, possible with the MMU
off), and I would prefer to keep it in one place, in the library code,
rather than expose the boot cache maintenance requirements to tests.

If I rename it to __smp_boot_secondary(), would you find that acceptable?

Thanks,
Alex

> 
> >  {
> > -	int ret;
> > -
> >  	secondary_data.stack = thread_stack_alloc();
> >  	secondary_data.entry = entry;
> >  	mmu_mark_disabled(cpu);
> > +}
> > +
> > +static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
> > +{
> > +	int ret;
> > +
> > +	smp_prepare_secondary(cpu, entry);
> > +
> >  	ret = cpu_psci_cpu_boot(cpu);
> >  	assert(ret == 0);
> >  
> > diff --git a/lib/errata.h b/lib/errata.h
> > index 5af0eb3bf8e2..de8205d8b370 100644
> > --- a/lib/errata.h
> > +++ b/lib/errata.h
> > @@ -6,6 +6,8 @@
> >   */
> >  #ifndef _ERRATA_H_
> >  #define _ERRATA_H_
> > +#include <libcflat.h>
> > +
> >  #include "config.h"
> >  
> >  #ifndef CONFIG_ERRATA_FORCE
> > -- 
> > 2.25.1
> >
> 
> Thanks,
> drew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case
  2023-01-18 18:48   ` Andrew Jones
@ 2023-01-19 10:35     ` Alexandru Elisei
  2023-01-19 11:36       ` Andrew Jones
  0 siblings, 1 reply; 11+ messages in thread
From: Alexandru Elisei @ 2023-01-19 10:35 UTC (permalink / raw)
  To: Andrew Jones; +Cc: kvm, kvmarm

Hi Drew,

On Wed, Jan 18, 2023 at 07:48:21PM +0100, Andrew Jones wrote:
> On Wed, Jan 18, 2023 at 02:49:12PM +0000, Alexandru Elisei wrote:
> > From: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
> > 
> > The test uses the following method.
> > 
> > The primary CPU brings up all the secondary CPUs, which are held in a wait
> > loop. Once the primary releases the CPUs, each of the secondary CPUs
> > proceed to issue PSCI_CPU_OFF. This is indicated by a cpumask and also the
> > status of the call is updated by the secondary CPU in cpu_off_done[].
> > 
> > The primary CPU waits for all the secondary CPUs to update the cpumask and
> > then proceeds to check for the status of the individual CPU CPU_OFF
> > request. There is a chance that some CPUs might fail at the CPU_OFF request
> > and come back and update the status once the primary CPU has finished the
> > scan. There is no fool proof method to handle this. As of now, we add a
> > 1sec delay between the cpumask check and the scan for the status.
> > 
> > Signed-off-by: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
> > [ Alex E: Skip CPU_OFF test if CPU_ON failed ]
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  arm/psci.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 66 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arm/psci.c b/arm/psci.c
> > index e96be941953b..d045616bfcd4 100644
> > --- a/arm/psci.c
> > +++ b/arm/psci.c
> > @@ -15,6 +15,8 @@
> >  #include <asm/psci.h>
> >  #include <asm/smp.h>
> >  
> > +#define CPU_OFF_TEST_WAIT_TIME 1000
> > +
> >  static bool invalid_function_exception;
> >  
> >  #ifdef __arm__
> > @@ -71,8 +73,10 @@ static bool psci_affinity_info_off(void)
> >  }
> >  
> >  static int cpu_on_ret[NR_CPUS];
> > -static cpumask_t cpu_on_ready, cpu_on_done;
> > +static bool cpu_off_success[NR_CPUS];
> > +static cpumask_t cpu_on_ready, cpu_on_done, cpu_off_done;
> >  static volatile int cpu_on_start;
> > +static volatile int cpu_off_start;
> >  
> >  extern void secondary_entry(void);
> >  static void cpu_on_do_wake_target(void)
> > @@ -94,6 +98,20 @@ static void cpu_on_target(void)
> >  	cpumask_set_cpu(cpu, &cpu_on_done);
> >  }
> >  
> > +static void cpu_off_secondary_entry(void *data)
> > +{
> > +	int cpu = smp_processor_id();
> > +
> > +	while (!cpu_off_start)
> > +		cpu_relax();
> > +	/* On to the CPU off test */
> > +	cpu_off_success[cpu] = true;
> > +	cpumask_set_cpu(cpu, &cpu_off_done);
> > +	cpu_psci_cpu_die();
> > +	/* The CPU shouldn't execute the next steps. */
> > +	cpu_off_success[cpu] = false;
> > +}
> > +
> >  static bool psci_cpu_on_test(void)
> >  {
> >  	bool failed = false;
> > @@ -162,9 +180,45 @@ out:
> >  	return !failed;
> >  }
> >  
> > -int main(void)
> > +static bool psci_cpu_off_test(void)
> > +{
> > +	bool failed = false;
> > +	int cpu;
> > +
> > +	for_each_present_cpu(cpu) {
> > +		if (cpu == 0)
> > +			continue;
> > +		on_cpu_async(cpu, cpu_off_secondary_entry, NULL);
> > +	}
> > +
> > +	cpumask_set_cpu(0, &cpu_off_done);
> 
> Since we're setting cpu_off_done for cpu0, then we could also set
> cpu_off_success[0] = true and not have to skip it in the check loop
> below.

I would prefer not to, since CPU 0 never invokes CPU_OFF for itself and setting
cpu_off_success to true for CPU 0 might get confusing. Unless you insist :)

> 
> > +
> > +	report_info("starting CPU_OFF test...");
> > +
> > +	cpu_off_start = 1;
> > +	while (!cpumask_full(&cpu_off_done))
> > +		cpu_relax();
> > +
> > +	/* Allow all the other CPUs to complete the operation */
> > +	mdelay(CPU_OFF_TEST_WAIT_TIME);
> 
> Don't really need the define, just the numbers work for stuff
> like this, but OK.

I'll get rid of the define.

You ok with waiting for 1 second for each test run? I remember you had
objections when I added a similar delay to the timer tests.

> 
> > +
> > +	for_each_present_cpu(cpu) {
> > +		if (cpu == 0)
> > +			continue;
> > +
> > +		if (!cpu_off_success[cpu]) {
> > +			report_info("CPU%d could not be turned off", cpu);
> > +			failed = true;
> > +		}
> > +	}
> > +
> > +	return !failed;
> > +}
> > +
> > +int main(int argc, char **argv)
> >  {
> >  	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> > +	bool cpu_on_success = true;
> >  
> >  	report_prefix_push("psci");
> >  
> > @@ -179,10 +233,17 @@ int main(void)
> >  	report(psci_affinity_info_on(), "affinity-info-on");
> >  	report(psci_affinity_info_off(), "affinity-info-off");
> >  
> > -	if (ERRATA(6c7a5dce22b3))
> > -		report(psci_cpu_on_test(), "cpu-on");
> > -	else
> > +	if (ERRATA(6c7a5dce22b3)) {
> > +		cpu_on_success = psci_cpu_on_test();
> > +		report(cpu_on_success, "cpu-on");
> > +	} else {
> >  		report_skip("Skipping unsafe cpu-on test. Set ERRATA_6c7a5dce22b3=y to enable.");
> > +	}
> > +
> > +	if (!cpu_on_success)
> > +		report_skip("Skipping cpu-off test because the cpu-on test failed");
> > +	else
> > +		report(psci_cpu_off_test(), "cpu-off");
> >  
> >  done:
> >  #if 0
> > -- 
> > 2.25.1
> > 
> 
> Besides the nits,
> 
> Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

Thanks!

Alex

> 
> Thanks,
> drew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online
  2023-01-19 10:32     ` Alexandru Elisei
@ 2023-01-19 11:01       ` Andrew Jones
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Jones @ 2023-01-19 11:01 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, kvmarm

On Thu, Jan 19, 2023 at 10:32:07AM +0000, Alexandru Elisei wrote:
> Hi Drew,
> 
> Thanks for the quick review!
> 
> On Wed, Jan 18, 2023 at 07:35:12PM +0100, Andrew Jones wrote:
> > On Wed, Jan 18, 2023 at 02:49:11PM +0000, Alexandru Elisei wrote:
> > > For the PSCI CPU_ON function test, all other CPUs perform a CPU_ON call
> > > that target CPU 1. The test is considered a success if CPU_ON returns PSCI
> > > SUCCESS exactly once, and for the rest of the calls PSCI ALREADY_ON.
> > > 
> > > Enhance the test by checking that CPU 1 is actually online and able to
> > > execute code. Also make the test more robust by checking that the CPU_ON
> > > call returns, instead of assuming that it will always succeed and
> > > hanging indefinitely if it doesn't.
> > > 
> > > Since the CPU 1 thread is now being set up properly by kvm-unit-tests
> > > when being brought online, it becomes possible to add other tests in the
> > > future that require all CPUs.
> > > 
> > > The include header order in arm/psci.c has been changed to be in
> > > alphabetic order. This means moving the errata.h include before
> > > libcflat.h, which causes compilation to fail because of missing includes
> > > in errata.h. Fix that also by including the needed header in errata.h.
> > > 
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > >  arm/psci.c        | 60 ++++++++++++++++++++++++++++++++++++-----------
> > >  lib/arm/asm/smp.h |  1 +
> > >  lib/arm/smp.c     | 12 +++++++---
> > >  lib/errata.h      |  2 ++
> > >  4 files changed, 58 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/arm/psci.c b/arm/psci.c
> > > index efa0722c0566..e96be941953b 100644
> > > --- a/arm/psci.c
> > > +++ b/arm/psci.c
> > > @@ -7,11 +7,13 @@
> > >   *
> > >   * This work is licensed under the terms of the GNU LGPL, version 2.
> > >   */
> > > -#include <libcflat.h>
> > >  #include <errata.h>
> > > +#include <libcflat.h>
> > > +
> > > +#include <asm/delay.h>
> > >  #include <asm/processor.h>
> > > -#include <asm/smp.h>
> > >  #include <asm/psci.h>
> > > +#include <asm/smp.h>
> > >  
> > >  static bool invalid_function_exception;
> > >  
> > > @@ -72,14 +74,23 @@ static int cpu_on_ret[NR_CPUS];
> > >  static cpumask_t cpu_on_ready, cpu_on_done;
> > >  static volatile int cpu_on_start;
> > >  
> > > -static void cpu_on_secondary_entry(void)
> > > +extern void secondary_entry(void);
> > > +static void cpu_on_do_wake_target(void)
> > >  {
> > >  	int cpu = smp_processor_id();
> > >  
> > >  	cpumask_set_cpu(cpu, &cpu_on_ready);
> > >  	while (!cpu_on_start)
> > >  		cpu_relax();
> > > -	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(halt));
> > > +	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(secondary_entry));
> > > +	cpumask_set_cpu(cpu, &cpu_on_done);
> > > +}
> > > +
> > > +static void cpu_on_target(void)
> > > +{
> > > +	int cpu = smp_processor_id();
> > > +
> > > +	cpu_on_ret[cpu] = PSCI_RET_ALREADY_ON;
> > 
> > I'm not sure this is better than just skipping cpu1 in the check loop, as
> > is done now, but OK. 
> 
> I think that's a good idea, I'll remove this and skip CPU 1 in the check
> loop, because CPU 1 never invokes psci_cpu_on for itself.
> 
> > 
> > >  	cpumask_set_cpu(cpu, &cpu_on_done);
> > >  }
> > >  
> > > @@ -87,33 +98,53 @@ static bool psci_cpu_on_test(void)
> > >  {
> > >  	bool failed = false;
> > >  	int ret_success = 0;
> > > -	int cpu;
> > > -
> > > -	cpumask_set_cpu(1, &cpu_on_ready);
> > > -	cpumask_set_cpu(1, &cpu_on_done);
> > > +	int i, cpu;
> > >  
> > >  	for_each_present_cpu(cpu) {
> > >  		if (cpu < 2)
> > >  			continue;
> > > -		smp_boot_secondary(cpu, cpu_on_secondary_entry);
> > > +		smp_boot_secondary(cpu, cpu_on_do_wake_target);
> > >  	}
> > >  
> > >  	cpumask_set_cpu(0, &cpu_on_ready);
> > > +	cpumask_set_cpu(1, &cpu_on_ready);
> > >  	while (!cpumask_full(&cpu_on_ready))
> > >  		cpu_relax();
> > >  
> > > +	/*
> > > +	 * Wait for all other CPUs to be online before configuring the thread
> > > +	 * for the target CPU, as all secondaries are set up using the same
> > > +	 * global variable.
> > > +	 */
> > 
> > This comment says "Wait", so I'd move the while loop under it.
> 
> Can I reword it to keep it here?
> 
> 	/*
> 	 * Configure CPU 1 after all the secondaries are online to avoid
> 	 * secondary_data being overwritten.
> 	 */
> 
> What do you think?

Works for me.

> 
> > 
> > > +	smp_prepare_secondary(1, cpu_on_target);
> > 
> > This new smp_prepare_secondary() function should be local to this unit
> > test, please see my justification below.
> 
> Reply below.
> 
> > 
> > > +
> > >  	cpu_on_start = 1;
> > >  	smp_mb();
> > >  
> > > -	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(halt));
> > > +	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(secondary_entry));
> > >  	cpumask_set_cpu(0, &cpu_on_done);
> > >  
> > > -	while (!cpumask_full(&cpu_on_done))
> > > -		cpu_relax();
> > > +	report_info("waiting for CPU1 to come online...");
> > > +	for (i = 0; i < 10; i++) {
> > > +		mdelay(100);
> > > +		if (cpumask_full(&cpu_on_done))
> > > +			break;
> > > +	}
> > > +
> > > +	if (!cpumask_full(&cpu_on_done)) {
> > > +		for_each_present_cpu(cpu) {
> > > +			if (!cpumask_test_cpu(cpu, &cpu_on_done)) {
> > > +				if (cpu == 1)
> > > +					report_info("CPU1 failed to come online");
> > > +				else
> > > +					report_info("CPU%d failed to online CPU1", cpu);
> > > +			}
> > > +		}
> > > +		failed = true;
> > > +		goto out;
> > 
> > We could still run the other checks below, perhaps guarded with
> > cpumask_test_cpu(cpu, &cpu_on_done), rather than bail early. I'm
> > also OK with bailing early though. But, for that, I'd just return
> > false right here rather than introduce the goto.
> 
> I like the idea of checking the return value for the CPU_ON calls that have
> returned (I'll use for_each_cpu(cpu, &cpu_on _done) in the for loop below).
> 
> > 
> > > +	}
> > >  
> > >  	for_each_present_cpu(cpu) {
> > > -		if (cpu == 1)
> > > -			continue;
> > >  		if (cpu_on_ret[cpu] == PSCI_RET_SUCCESS) {
> > >  			ret_success++;
> > >  		} else if (cpu_on_ret[cpu] != PSCI_RET_ALREADY_ON) {
> > > @@ -127,6 +158,7 @@ static bool psci_cpu_on_test(void)
> > >  		failed = true;
> > >  	}
> > >  
> > > +out:
> > >  	return !failed;
> > >  }
> > >  
> > > diff --git a/lib/arm/asm/smp.h b/lib/arm/asm/smp.h
> > > index 077afde85520..ff2ef8f88247 100644
> > > --- a/lib/arm/asm/smp.h
> > > +++ b/lib/arm/asm/smp.h
> > > @@ -49,6 +49,7 @@ static inline void set_cpu_idle(int cpu, bool idle)
> > >  }
> > >  
> > >  typedef void (*secondary_entry_fn)(void);
> > > +extern void smp_prepare_secondary(int cpu, secondary_entry_fn entry);
> > >  extern void smp_boot_secondary(int cpu, secondary_entry_fn entry);
> > >  extern void on_cpu_async(int cpu, void (*func)(void *data), void *data);
> > >  extern void on_cpu(int cpu, void (*func)(void *data), void *data);
> > > diff --git a/lib/arm/smp.c b/lib/arm/smp.c
> > > index 98a5054e039b..947f417f4aea 100644
> > > --- a/lib/arm/smp.c
> > > +++ b/lib/arm/smp.c
> > > @@ -58,13 +58,19 @@ secondary_entry_fn secondary_cinit(void)
> > >  	return entry;
> > >  }
> > >  
> > > -static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
> > > +void smp_prepare_secondary(int cpu, secondary_entry_fn entry)
> > 
> > I'd rather not create an unsafe library function, especially one named
> > without the leading underscores. It's OK for a unit test to duplicate
> > __smp_boot_secondary() (secondary_data is available), but then the unit
> > test also needs to ensure it does its own synchronization, as you do
> > with the wait on cpu_on_ready already.
> 
> I created the function to share the code between __smp_boot_secondary and
> the PSCI test. Cache maintenance would have to be added here (one CPU
> writes to secondary_data, another CPU reads from it, possible with the MMU
> off), and I would prefer to keep it in one place, in the library code,
> rather than expose the boot cache maintenance requirements to tests.
> 
> If I rename it to __smp_boot_secondary(), would you find that acceptable?

I guess you mean __smp_prepare_secondary(). I suppose, but at this time
we're only avoiding duplicating 3 lines, so it looks like premature
refactoring. We can do it now though, since I hope to see the cache
maint patches soon. Please add a comment above the new __ function warning
users about its risks.

Another option is to make an smp_prepare_secondary() that is safe to
use without synchronization. We'd just need to make secondary_data per
cpu.

Thanks,
drew


> 
> Thanks,
> Alex
> 
> > 
> > >  {
> > > -	int ret;
> > > -
> > >  	secondary_data.stack = thread_stack_alloc();
> > >  	secondary_data.entry = entry;
> > >  	mmu_mark_disabled(cpu);
> > > +}
> > > +
> > > +static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
> > > +{
> > > +	int ret;
> > > +
> > > +	smp_prepare_secondary(cpu, entry);
> > > +
> > >  	ret = cpu_psci_cpu_boot(cpu);
> > >  	assert(ret == 0);
> > >  
> > > diff --git a/lib/errata.h b/lib/errata.h
> > > index 5af0eb3bf8e2..de8205d8b370 100644
> > > --- a/lib/errata.h
> > > +++ b/lib/errata.h
> > > @@ -6,6 +6,8 @@
> > >   */
> > >  #ifndef _ERRATA_H_
> > >  #define _ERRATA_H_
> > > +#include <libcflat.h>
> > > +
> > >  #include "config.h"
> > >  
> > >  #ifndef CONFIG_ERRATA_FORCE
> > > -- 
> > > 2.25.1
> > >
> > 
> > Thanks,
> > drew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case
  2023-01-19 10:35     ` Alexandru Elisei
@ 2023-01-19 11:36       ` Andrew Jones
  2023-01-19 11:45         ` Andrew Jones
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Jones @ 2023-01-19 11:36 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, kvmarm

On Thu, Jan 19, 2023 at 10:35:35AM +0000, Alexandru Elisei wrote:
> Hi Drew,
> 
> On Wed, Jan 18, 2023 at 07:48:21PM +0100, Andrew Jones wrote:
> > On Wed, Jan 18, 2023 at 02:49:12PM +0000, Alexandru Elisei wrote:
> > > From: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
> > > 
> > > The test uses the following method.
> > > 
> > > The primary CPU brings up all the secondary CPUs, which are held in a wait
> > > loop. Once the primary releases the CPUs, each of the secondary CPUs
> > > proceed to issue PSCI_CPU_OFF. This is indicated by a cpumask and also the
> > > status of the call is updated by the secondary CPU in cpu_off_done[].
> > > 
> > > The primary CPU waits for all the secondary CPUs to update the cpumask and
> > > then proceeds to check for the status of the individual CPU CPU_OFF
> > > request. There is a chance that some CPUs might fail at the CPU_OFF request
> > > and come back and update the status once the primary CPU has finished the
> > > scan. There is no fool proof method to handle this. As of now, we add a
> > > 1sec delay between the cpumask check and the scan for the status.
> > > 
> > > Signed-off-by: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
> > > [ Alex E: Skip CPU_OFF test if CPU_ON failed ]
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > >  arm/psci.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++----
> > >  1 file changed, 66 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arm/psci.c b/arm/psci.c
> > > index e96be941953b..d045616bfcd4 100644
> > > --- a/arm/psci.c
> > > +++ b/arm/psci.c
> > > @@ -15,6 +15,8 @@
> > >  #include <asm/psci.h>
> > >  #include <asm/smp.h>
> > >  
> > > +#define CPU_OFF_TEST_WAIT_TIME 1000
> > > +
> > >  static bool invalid_function_exception;
> > >  
> > >  #ifdef __arm__
> > > @@ -71,8 +73,10 @@ static bool psci_affinity_info_off(void)
> > >  }
> > >  
> > >  static int cpu_on_ret[NR_CPUS];
> > > -static cpumask_t cpu_on_ready, cpu_on_done;
> > > +static bool cpu_off_success[NR_CPUS];
> > > +static cpumask_t cpu_on_ready, cpu_on_done, cpu_off_done;
> > >  static volatile int cpu_on_start;
> > > +static volatile int cpu_off_start;
> > >  
> > >  extern void secondary_entry(void);
> > >  static void cpu_on_do_wake_target(void)
> > > @@ -94,6 +98,20 @@ static void cpu_on_target(void)
> > >  	cpumask_set_cpu(cpu, &cpu_on_done);
> > >  }
> > >  
> > > +static void cpu_off_secondary_entry(void *data)
> > > +{
> > > +	int cpu = smp_processor_id();
> > > +
> > > +	while (!cpu_off_start)
> > > +		cpu_relax();
> > > +	/* On to the CPU off test */
> > > +	cpu_off_success[cpu] = true;
> > > +	cpumask_set_cpu(cpu, &cpu_off_done);
> > > +	cpu_psci_cpu_die();
> > > +	/* The CPU shouldn't execute the next steps. */
> > > +	cpu_off_success[cpu] = false;
> > > +}
> > > +
> > >  static bool psci_cpu_on_test(void)
> > >  {
> > >  	bool failed = false;
> > > @@ -162,9 +180,45 @@ out:
> > >  	return !failed;
> > >  }
> > >  
> > > -int main(void)
> > > +static bool psci_cpu_off_test(void)
> > > +{
> > > +	bool failed = false;
> > > +	int cpu;
> > > +
> > > +	for_each_present_cpu(cpu) {
> > > +		if (cpu == 0)
> > > +			continue;
> > > +		on_cpu_async(cpu, cpu_off_secondary_entry, NULL);
> > > +	}
> > > +
> > > +	cpumask_set_cpu(0, &cpu_off_done);
> > 
> > Since we're setting cpu_off_done for cpu0, then we could also set
> > cpu_off_success[0] = true and not have to skip it in the check loop
> > below.
> 
> I would prefer not to, since CPU 0 never invokes CPU_OFF for itself and setting
> cpu_off_success to true for CPU 0 might get confusing. Unless you insist :)

I don't insist, I'm just looking for consistency with the other tests. The
last one was getting changed to the opposite, but now I see you plan to
change it back.

> 
> > 
> > > +
> > > +	report_info("starting CPU_OFF test...");
> > > +
> > > +	cpu_off_start = 1;
> > > +	while (!cpumask_full(&cpu_off_done))
> > > +		cpu_relax();
> > > +
> > > +	/* Allow all the other CPUs to complete the operation */
> > > +	mdelay(CPU_OFF_TEST_WAIT_TIME);
> > 
> > Don't really need the define, just the numbers work for stuff
> > like this, but OK.
> 
> I'll get rid of the define.
> 
> You ok with waiting for 1 second for each test run? I remember you had
> objections when I added a similar delay to the timer tests.

Thanks for reminding me :-) Indeed, I'd rather not have tests waiting for
no reason. We want these tests to execute as fast as possible to ensure
they get run by impatient developers like me.

How about something like this that waits *up to* one second.

static void cpu_off_secondary_entry(void *data)
{
	int cpu = smp_processor_id();

	while (!cpu_off_start)
		cpu_relax();

	cpumask_set_cpu(cpu, &cpu_off_done);
	cpu_psci_cpu_die();
}

static bool psci_cpu_off_test(void)
{
	int count;

	...

	while (!cpumask_full(&cpu_off_done))
		cpu_relax();

	for (i = 0; i < 100; ++i) {
		mdelay(10);

		count = 0;

		for_each_present_cpu(cpu) {
			if (cpu == 0)
				continue;
			if (psci_affinity_info(cpu, 0) != PSCI_0_2_AFFINITY_LEVEL_OFF)
				++count;
		}
		if (!count)
			break;
	}

	if (count) {
		...
	}
	...
}


Thanks,
drew
                      
> 
> > 
> > > +
> > > +	for_each_present_cpu(cpu) {
> > > +		if (cpu == 0)
> > > +			continue;
> > > +
> > > +		if (!cpu_off_success[cpu]) {
> > > +			report_info("CPU%d could not be turned off", cpu);
> > > +			failed = true;
> > > +		}
> > > +	}
> > > +
> > > +	return !failed;
> > > +}
> > > +
> > > +int main(int argc, char **argv)
> > >  {
> > >  	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> > > +	bool cpu_on_success = true;
> > >  
> > >  	report_prefix_push("psci");
> > >  
> > > @@ -179,10 +233,17 @@ int main(void)
> > >  	report(psci_affinity_info_on(), "affinity-info-on");
> > >  	report(psci_affinity_info_off(), "affinity-info-off");
> > >  
> > > -	if (ERRATA(6c7a5dce22b3))
> > > -		report(psci_cpu_on_test(), "cpu-on");
> > > -	else
> > > +	if (ERRATA(6c7a5dce22b3)) {
> > > +		cpu_on_success = psci_cpu_on_test();
> > > +		report(cpu_on_success, "cpu-on");
> > > +	} else {
> > >  		report_skip("Skipping unsafe cpu-on test. Set ERRATA_6c7a5dce22b3=y to enable.");
> > > +	}
> > > +
> > > +	if (!cpu_on_success)
> > > +		report_skip("Skipping cpu-off test because the cpu-on test failed");
> > > +	else
> > > +		report(psci_cpu_off_test(), "cpu-off");
> > >  
> > >  done:
> > >  #if 0
> > > -- 
> > > 2.25.1
> > > 
> > 
> > Besides the nits,
> > 
> > Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
> 
> Thanks!
> 
> Alex
> 
> > 
> > Thanks,
> > drew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case
  2023-01-19 11:36       ` Andrew Jones
@ 2023-01-19 11:45         ` Andrew Jones
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Jones @ 2023-01-19 11:45 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, kvmarm

On Thu, Jan 19, 2023 at 12:36:39PM +0100, Andrew Jones wrote:
> On Thu, Jan 19, 2023 at 10:35:35AM +0000, Alexandru Elisei wrote:
> > Hi Drew,
> > 
> > On Wed, Jan 18, 2023 at 07:48:21PM +0100, Andrew Jones wrote:
> > > On Wed, Jan 18, 2023 at 02:49:12PM +0000, Alexandru Elisei wrote:
> > > > From: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
> > > > 
> > > > The test uses the following method.
> > > > 
> > > > The primary CPU brings up all the secondary CPUs, which are held in a wait
> > > > loop. Once the primary releases the CPUs, each of the secondary CPUs
> > > > proceed to issue PSCI_CPU_OFF. This is indicated by a cpumask and also the
> > > > status of the call is updated by the secondary CPU in cpu_off_done[].
> > > > 
> > > > The primary CPU waits for all the secondary CPUs to update the cpumask and
> > > > then proceeds to check for the status of the individual CPU CPU_OFF
> > > > request. There is a chance that some CPUs might fail at the CPU_OFF request
> > > > and come back and update the status once the primary CPU has finished the
> > > > scan. There is no fool proof method to handle this. As of now, we add a
> > > > 1sec delay between the cpumask check and the scan for the status.
> > > > 
> > > > Signed-off-by: Nikita Venkatesh <Nikita.Venkatesh@arm.com>
> > > > [ Alex E: Skip CPU_OFF test if CPU_ON failed ]
> > > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > ---
> > > >  arm/psci.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++----
> > > >  1 file changed, 66 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/arm/psci.c b/arm/psci.c
> > > > index e96be941953b..d045616bfcd4 100644
> > > > --- a/arm/psci.c
> > > > +++ b/arm/psci.c
> > > > @@ -15,6 +15,8 @@
> > > >  #include <asm/psci.h>
> > > >  #include <asm/smp.h>
> > > >  
> > > > +#define CPU_OFF_TEST_WAIT_TIME 1000
> > > > +
> > > >  static bool invalid_function_exception;
> > > >  
> > > >  #ifdef __arm__
> > > > @@ -71,8 +73,10 @@ static bool psci_affinity_info_off(void)
> > > >  }
> > > >  
> > > >  static int cpu_on_ret[NR_CPUS];
> > > > -static cpumask_t cpu_on_ready, cpu_on_done;
> > > > +static bool cpu_off_success[NR_CPUS];
> > > > +static cpumask_t cpu_on_ready, cpu_on_done, cpu_off_done;
> > > >  static volatile int cpu_on_start;
> > > > +static volatile int cpu_off_start;
> > > >  
> > > >  extern void secondary_entry(void);
> > > >  static void cpu_on_do_wake_target(void)
> > > > @@ -94,6 +98,20 @@ static void cpu_on_target(void)
> > > >  	cpumask_set_cpu(cpu, &cpu_on_done);
> > > >  }
> > > >  
> > > > +static void cpu_off_secondary_entry(void *data)
> > > > +{
> > > > +	int cpu = smp_processor_id();
> > > > +
> > > > +	while (!cpu_off_start)
> > > > +		cpu_relax();
> > > > +	/* On to the CPU off test */
> > > > +	cpu_off_success[cpu] = true;
> > > > +	cpumask_set_cpu(cpu, &cpu_off_done);
> > > > +	cpu_psci_cpu_die();
> > > > +	/* The CPU shouldn't execute the next steps. */
> > > > +	cpu_off_success[cpu] = false;
> > > > +}
> > > > +
> > > >  static bool psci_cpu_on_test(void)
> > > >  {
> > > >  	bool failed = false;
> > > > @@ -162,9 +180,45 @@ out:
> > > >  	return !failed;
> > > >  }
> > > >  
> > > > -int main(void)
> > > > +static bool psci_cpu_off_test(void)
> > > > +{
> > > > +	bool failed = false;
> > > > +	int cpu;
> > > > +
> > > > +	for_each_present_cpu(cpu) {
> > > > +		if (cpu == 0)
> > > > +			continue;
> > > > +		on_cpu_async(cpu, cpu_off_secondary_entry, NULL);
> > > > +	}
> > > > +
> > > > +	cpumask_set_cpu(0, &cpu_off_done);
> > > 
> > > Since we're setting cpu_off_done for cpu0, then we could also set
> > > cpu_off_success[0] = true and not have to skip it in the check loop
> > > below.
> > 
> > I would prefer not to, since CPU 0 never invokes CPU_OFF for itself and setting
> > cpu_off_success to true for CPU 0 might get confusing. Unless you insist :)
> 
> I don't insist, I'm just looking for consistency with the other tests. The
> last one was getting changed to the opposite, but now I see you plan to
> change it back.
> 
> > 
> > > 
> > > > +
> > > > +	report_info("starting CPU_OFF test...");
> > > > +
> > > > +	cpu_off_start = 1;
> > > > +	while (!cpumask_full(&cpu_off_done))
> > > > +		cpu_relax();
> > > > +
> > > > +	/* Allow all the other CPUs to complete the operation */
> > > > +	mdelay(CPU_OFF_TEST_WAIT_TIME);
> > > 
> > > Don't really need the define, just the numbers work for stuff
> > > like this, but OK.
> > 
> > I'll get rid of the define.
> > 
> > You ok with waiting for 1 second for each test run? I remember you had
> > objections when I added a similar delay to the timer tests.
> 
> Thanks for reminding me :-) Indeed, I'd rather not have tests waiting for
> no reason. We want these tests to execute as fast as possible to ensure
> they get run by impatient developers like me.
> 
> How about something like this that waits *up to* one second.
> 
> static void cpu_off_secondary_entry(void *data)
> {
> 	int cpu = smp_processor_id();
> 
> 	while (!cpu_off_start)
> 		cpu_relax();
> 
> 	cpumask_set_cpu(cpu, &cpu_off_done);
> 	cpu_psci_cpu_die();
> }
> 
> static bool psci_cpu_off_test(void)
> {
> 	int count;
> 
> 	...
> 
> 	while (!cpumask_full(&cpu_off_done))
> 		cpu_relax();
> 
> 	for (i = 0; i < 100; ++i) {
> 		mdelay(10);
> 
> 		count = 0;
> 
> 		for_each_present_cpu(cpu) {
> 			if (cpu == 0)
> 				continue;
> 			if (psci_affinity_info(cpu, 0) != PSCI_0_2_AFFINITY_LEVEL_OFF)
> 				++count;
> 		}
> 		if (!count)
> 			break;
> 	}
> 
> 	if (count) {
> 		...
> 	}
> 	...

And here an additional sanity check that no cpus are in idle, which would
mean they came back from cpu_psci_cpu_die() and psci_affinity_info() is
lying.

  if (!cpumask_empty(&cpu_idle_mask)) {
      ...
  }

Could also create a new cpu_off_failed mask that cpu_off_secondary_entry()
would set on return from cpu_psci_cpu_die, but checking idle is pretty
much the same.

Thanks,
drew

> }
> 
> 
> Thanks,
> drew
>                       
> > 
> > > 
> > > > +
> > > > +	for_each_present_cpu(cpu) {
> > > > +		if (cpu == 0)
> > > > +			continue;
> > > > +
> > > > +		if (!cpu_off_success[cpu]) {
> > > > +			report_info("CPU%d could not be turned off", cpu);
> > > > +			failed = true;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	return !failed;
> > > > +}
> > > > +
> > > > +int main(int argc, char **argv)
> > > >  {
> > > >  	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> > > > +	bool cpu_on_success = true;
> > > >  
> > > >  	report_prefix_push("psci");
> > > >  
> > > > @@ -179,10 +233,17 @@ int main(void)
> > > >  	report(psci_affinity_info_on(), "affinity-info-on");
> > > >  	report(psci_affinity_info_off(), "affinity-info-off");
> > > >  
> > > > -	if (ERRATA(6c7a5dce22b3))
> > > > -		report(psci_cpu_on_test(), "cpu-on");
> > > > -	else
> > > > +	if (ERRATA(6c7a5dce22b3)) {
> > > > +		cpu_on_success = psci_cpu_on_test();
> > > > +		report(cpu_on_success, "cpu-on");
> > > > +	} else {
> > > >  		report_skip("Skipping unsafe cpu-on test. Set ERRATA_6c7a5dce22b3=y to enable.");
> > > > +	}
> > > > +
> > > > +	if (!cpu_on_success)
> > > > +		report_skip("Skipping cpu-off test because the cpu-on test failed");
> > > > +	else
> > > > +		report(psci_cpu_off_test(), "cpu-off");
> > > >  
> > > >  done:
> > > >  #if 0
> > > > -- 
> > > > 2.25.1
> > > > 
> > > 
> > > Besides the nits,
> > > 
> > > Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
> > 
> > Thanks!
> > 
> > Alex
> > 
> > > 
> > > Thanks,
> > > drew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online
  2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online Alexandru Elisei
  2023-01-18 18:35   ` Andrew Jones
@ 2023-01-19 11:46   ` Andrew Jones
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Jones @ 2023-01-19 11:46 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, kvmarm

On Wed, Jan 18, 2023 at 02:49:11PM +0000, Alexandru Elisei wrote:
> For the PSCI CPU_ON function test, all other CPUs perform a CPU_ON call
> that target CPU 1. The test is considered a success if CPU_ON returns PSCI
> SUCCESS exactly once, and for the rest of the calls PSCI ALREADY_ON.
> 
> Enhance the test by checking that CPU 1 is actually online and able to
> execute code. Also make the test more robust by checking that the CPU_ON
> call returns, instead of assuming that it will always succeed and
> hanging indefinitely if it doesn't.
> 
> Since the CPU 1 thread is now being set up properly by kvm-unit-tests
> when being brought online, it becomes possible to add other tests in the
> future that require all CPUs.
> 
> The include header order in arm/psci.c has been changed to be in
> alphabetic order. This means moving the errata.h include before
> libcflat.h, which causes compilation to fail because of missing includes
> in errata.h. Fix that also by including the needed header in errata.h.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arm/psci.c        | 60 ++++++++++++++++++++++++++++++++++++-----------
>  lib/arm/asm/smp.h |  1 +
>  lib/arm/smp.c     | 12 +++++++---
>  lib/errata.h      |  2 ++
>  4 files changed, 58 insertions(+), 17 deletions(-)
> 
> diff --git a/arm/psci.c b/arm/psci.c
> index efa0722c0566..e96be941953b 100644
> --- a/arm/psci.c
> +++ b/arm/psci.c
> @@ -7,11 +7,13 @@
>   *
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
> -#include <libcflat.h>
>  #include <errata.h>
> +#include <libcflat.h>
> +
> +#include <asm/delay.h>
>  #include <asm/processor.h>
> -#include <asm/smp.h>
>  #include <asm/psci.h>
> +#include <asm/smp.h>
>  
>  static bool invalid_function_exception;
>  
> @@ -72,14 +74,23 @@ static int cpu_on_ret[NR_CPUS];
>  static cpumask_t cpu_on_ready, cpu_on_done;
>  static volatile int cpu_on_start;
>  
> -static void cpu_on_secondary_entry(void)
> +extern void secondary_entry(void);
> +static void cpu_on_do_wake_target(void)
>  {
>  	int cpu = smp_processor_id();
>  
>  	cpumask_set_cpu(cpu, &cpu_on_ready);
>  	while (!cpu_on_start)
>  		cpu_relax();
> -	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(halt));
> +	cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(secondary_entry));
> +	cpumask_set_cpu(cpu, &cpu_on_done);
> +}
> +
> +static void cpu_on_target(void)
> +{
> +	int cpu = smp_processor_id();
> +
> +	cpu_on_ret[cpu] = PSCI_RET_ALREADY_ON;
>  	cpumask_set_cpu(cpu, &cpu_on_done);
>  }
>  
> @@ -87,33 +98,53 @@ static bool psci_cpu_on_test(void)
>  {
>  	bool failed = false;
>  	int ret_success = 0;
> -	int cpu;
> -
> -	cpumask_set_cpu(1, &cpu_on_ready);
> -	cpumask_set_cpu(1, &cpu_on_done);
> +	int i, cpu;
>  
>  	for_each_present_cpu(cpu) {
>  		if (cpu < 2)
>  			continue;
> -		smp_boot_secondary(cpu, cpu_on_secondary_entry);
> +		smp_boot_secondary(cpu, cpu_on_do_wake_target);
>  	}
>  
>  	cpumask_set_cpu(0, &cpu_on_ready);
> +	cpumask_set_cpu(1, &cpu_on_ready);
>  	while (!cpumask_full(&cpu_on_ready))
>  		cpu_relax();
>  
> +	/*
> +	 * Wait for all other CPUs to be online before configuring the thread
> +	 * for the target CPU, as all secondaries are set up using the same
> +	 * global variable.
> +	 */
> +	smp_prepare_secondary(1, cpu_on_target);
> +
>  	cpu_on_start = 1;
>  	smp_mb();
>  
> -	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(halt));
> +	cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(secondary_entry));
>  	cpumask_set_cpu(0, &cpu_on_done);
>  
> -	while (!cpumask_full(&cpu_on_done))
> -		cpu_relax();
> +	report_info("waiting for CPU1 to come online...");
> +	for (i = 0; i < 10; i++) {
> +		mdelay(100);

Changing this to

     for (i = 0; i < 100; i++) {
             mdelay(10);

may speed it up a bit.

Thanks,
drew

> +		if (cpumask_full(&cpu_on_done))
> +			break;
> +	}
> +
> +	if (!cpumask_full(&cpu_on_done)) {
> +		for_each_present_cpu(cpu) {
> +			if (!cpumask_test_cpu(cpu, &cpu_on_done)) {
> +				if (cpu == 1)
> +					report_info("CPU1 failed to come online");
> +				else
> +					report_info("CPU%d failed to online CPU1", cpu);
> +			}
> +		}
> +		failed = true;
> +		goto out;
> +	}
>  
>  	for_each_present_cpu(cpu) {
> -		if (cpu == 1)
> -			continue;
>  		if (cpu_on_ret[cpu] == PSCI_RET_SUCCESS) {
>  			ret_success++;
>  		} else if (cpu_on_ret[cpu] != PSCI_RET_ALREADY_ON) {
> @@ -127,6 +158,7 @@ static bool psci_cpu_on_test(void)
>  		failed = true;
>  	}
>  
> +out:
>  	return !failed;
>  }
>  
> diff --git a/lib/arm/asm/smp.h b/lib/arm/asm/smp.h
> index 077afde85520..ff2ef8f88247 100644
> --- a/lib/arm/asm/smp.h
> +++ b/lib/arm/asm/smp.h
> @@ -49,6 +49,7 @@ static inline void set_cpu_idle(int cpu, bool idle)
>  }
>  
>  typedef void (*secondary_entry_fn)(void);
> +extern void smp_prepare_secondary(int cpu, secondary_entry_fn entry);
>  extern void smp_boot_secondary(int cpu, secondary_entry_fn entry);
>  extern void on_cpu_async(int cpu, void (*func)(void *data), void *data);
>  extern void on_cpu(int cpu, void (*func)(void *data), void *data);
> diff --git a/lib/arm/smp.c b/lib/arm/smp.c
> index 98a5054e039b..947f417f4aea 100644
> --- a/lib/arm/smp.c
> +++ b/lib/arm/smp.c
> @@ -58,13 +58,19 @@ secondary_entry_fn secondary_cinit(void)
>  	return entry;
>  }
>  
> -static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
> +void smp_prepare_secondary(int cpu, secondary_entry_fn entry)
>  {
> -	int ret;
> -
>  	secondary_data.stack = thread_stack_alloc();
>  	secondary_data.entry = entry;
>  	mmu_mark_disabled(cpu);
> +}
> +
> +static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
> +{
> +	int ret;
> +
> +	smp_prepare_secondary(cpu, entry);
> +
>  	ret = cpu_psci_cpu_boot(cpu);
>  	assert(ret == 0);
>  
> diff --git a/lib/errata.h b/lib/errata.h
> index 5af0eb3bf8e2..de8205d8b370 100644
> --- a/lib/errata.h
> +++ b/lib/errata.h
> @@ -6,6 +6,8 @@
>   */
>  #ifndef _ERRATA_H_
>  #define _ERRATA_H_
> +#include <libcflat.h>
> +
>  #include "config.h"
>  
>  #ifndef CONFIG_ERRATA_FORCE
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-01-19 11:47 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-18 14:49 [kvm-unit-tests PATCH v4 0/2] arm: Add PSCI CPU_OFF test Alexandru Elisei
2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 1/2] arm/psci: Test that CPU 1 has been successfully brought online Alexandru Elisei
2023-01-18 18:35   ` Andrew Jones
2023-01-19 10:32     ` Alexandru Elisei
2023-01-19 11:01       ` Andrew Jones
2023-01-19 11:46   ` Andrew Jones
2023-01-18 14:49 ` [kvm-unit-tests PATCH v4 2/2] arm/psci: Add PSCI CPU_OFF test case Alexandru Elisei
2023-01-18 18:48   ` Andrew Jones
2023-01-19 10:35     ` Alexandru Elisei
2023-01-19 11:36       ` Andrew Jones
2023-01-19 11:45         ` Andrew Jones

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.