All of lore.kernel.org
 help / color / mirror / Atom feed
* kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-02 10:56 ` Zenghui Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-02 10:56 UTC (permalink / raw)
  To: Marc Zyngier, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: kvmarm, kvm, linux-arm-kernel, Wanghaibin (D)

Hi folks,

Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
the following fail info:

	[...]
	FAIL psci (4 tests, 1 unexpected failures)
	[...]
and
	[...]
	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
	FAIL: cpu-on
	SUMMARY: 4 tests, 1 unexpected failures


I think this is an issue had been fixed once by commit 6c7a5dce22b3
("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
does reset on the MPIDR register whilst another reads it.

But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
itself") later moves the reset work into check_vcpu_requests(), by
making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
has not been protected by kvm->lock mutex anymore, and the race shows up
again...

Do we need a fix for this issue? At least achieve a mutex execution
between the reset of MPIDR and kvm_mpidr_to_vcpu()?


Thanks,
zenghui


^ permalink raw reply	[flat|nested] 18+ messages in thread

* kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-02 10:56 ` Zenghui Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-02 10:56 UTC (permalink / raw)
  To: Marc Zyngier, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: linux-arm-kernel, kvmarm, kvm

Hi folks,

Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
the following fail info:

	[...]
	FAIL psci (4 tests, 1 unexpected failures)
	[...]
and
	[...]
	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
	FAIL: cpu-on
	SUMMARY: 4 tests, 1 unexpected failures


I think this is an issue had been fixed once by commit 6c7a5dce22b3
("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
does reset on the MPIDR register whilst another reads it.

But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
itself") later moves the reset work into check_vcpu_requests(), by
making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
has not been protected by kvm->lock mutex anymore, and the race shows up
again...

Do we need a fix for this issue? At least achieve a mutex execution
between the reset of MPIDR and kvm_mpidr_to_vcpu()?


Thanks,
zenghui

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 18+ messages in thread

* kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-02 10:56 ` Zenghui Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-02 10:56 UTC (permalink / raw)
  To: Marc Zyngier, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: linux-arm-kernel, Wanghaibin (D), kvmarm, kvm

Hi folks,

Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
the following fail info:

	[...]
	FAIL psci (4 tests, 1 unexpected failures)
	[...]
and
	[...]
	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
	FAIL: cpu-on
	SUMMARY: 4 tests, 1 unexpected failures


I think this is an issue had been fixed once by commit 6c7a5dce22b3
("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
does reset on the MPIDR register whilst another reads it.

But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
itself") later moves the reset work into check_vcpu_requests(), by
making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
has not been protected by kvm->lock mutex anymore, and the race shows up
again...

Do we need a fix for this issue? At least achieve a mutex execution
between the reset of MPIDR and kvm_mpidr_to_vcpu()?


Thanks,
zenghui


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
  2019-08-02 10:56 ` Zenghui Yu
  (?)
@ 2019-08-02 11:32   ` Andrew Jones
  -1 siblings, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2019-08-02 11:32 UTC (permalink / raw)
  To: Zenghui Yu
  Cc: Marc Zyngier, James Morse, julien.thierry.kdev, suzuki.poulose,
	kvmarm, kvm, linux-arm-kernel, Wanghaibin (D)

On Fri, Aug 02, 2019 at 06:56:51PM +0800, Zenghui Yu wrote:
> Hi folks,
> 
> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> the following fail info:
> 
> 	[...]
> 	FAIL psci (4 tests, 1 unexpected failures)
> 	[...]
> and
> 	[...]
> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> 	FAIL: cpu-on
> 	SUMMARY: 4 tests, 1 unexpected failures
> 
> 
> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> does reset on the MPIDR register whilst another reads it.
> 
> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> itself") later moves the reset work into check_vcpu_requests(), by
> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> has not been protected by kvm->lock mutex anymore, and the race shows up
> again...
> 
> Do we need a fix for this issue? At least achieve a mutex execution
> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
> 
>

I noticed this too, but I put it pretty low on my TODO because it's a
safe failure (no host crash, just an unexpected PSCI_RET_INVALID_PARAMS
gets returned because the valid MPIDR doesn't look valid for a moment.)
Also, the test is quite pathological, especially when the host has many
CPUs, so I wouldn't expect this to show up on a sane guest. I agree
it would be nice to get it fixed eventually though.

Thanks,
drew

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-02 11:32   ` Andrew Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2019-08-02 11:32 UTC (permalink / raw)
  To: Zenghui Yu; +Cc: kvm, Marc Zyngier, linux-arm-kernel, kvmarm

On Fri, Aug 02, 2019 at 06:56:51PM +0800, Zenghui Yu wrote:
> Hi folks,
> 
> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> the following fail info:
> 
> 	[...]
> 	FAIL psci (4 tests, 1 unexpected failures)
> 	[...]
> and
> 	[...]
> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> 	FAIL: cpu-on
> 	SUMMARY: 4 tests, 1 unexpected failures
> 
> 
> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> does reset on the MPIDR register whilst another reads it.
> 
> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> itself") later moves the reset work into check_vcpu_requests(), by
> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> has not been protected by kvm->lock mutex anymore, and the race shows up
> again...
> 
> Do we need a fix for this issue? At least achieve a mutex execution
> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
> 
>

I noticed this too, but I put it pretty low on my TODO because it's a
safe failure (no host crash, just an unexpected PSCI_RET_INVALID_PARAMS
gets returned because the valid MPIDR doesn't look valid for a moment.)
Also, the test is quite pathological, especially when the host has many
CPUs, so I wouldn't expect this to show up on a sane guest. I agree
it would be nice to get it fixed eventually though.

Thanks,
drew
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-02 11:32   ` Andrew Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Jones @ 2019-08-02 11:32 UTC (permalink / raw)
  To: Zenghui Yu
  Cc: kvm, suzuki.poulose, Marc Zyngier, James Morse, linux-arm-kernel,
	Wanghaibin (D),
	kvmarm, julien.thierry.kdev

On Fri, Aug 02, 2019 at 06:56:51PM +0800, Zenghui Yu wrote:
> Hi folks,
> 
> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> the following fail info:
> 
> 	[...]
> 	FAIL psci (4 tests, 1 unexpected failures)
> 	[...]
> and
> 	[...]
> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> 	FAIL: cpu-on
> 	SUMMARY: 4 tests, 1 unexpected failures
> 
> 
> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> does reset on the MPIDR register whilst another reads it.
> 
> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> itself") later moves the reset work into check_vcpu_requests(), by
> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> has not been protected by kvm->lock mutex anymore, and the race shows up
> again...
> 
> Do we need a fix for this issue? At least achieve a mutex execution
> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
> 
>

I noticed this too, but I put it pretty low on my TODO because it's a
safe failure (no host crash, just an unexpected PSCI_RET_INVALID_PARAMS
gets returned because the valid MPIDR doesn't look valid for a moment.)
Also, the test is quite pathological, especially when the host has many
CPUs, so I wouldn't expect this to show up on a sane guest. I agree
it would be nice to get it fixed eventually though.

Thanks,
drew

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
  2019-08-02 10:56 ` Zenghui Yu
  (?)
@ 2019-08-02 15:56   ` Marc Zyngier
  -1 siblings, 0 replies; 18+ messages in thread
From: Marc Zyngier @ 2019-08-02 15:56 UTC (permalink / raw)
  To: Zenghui Yu, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: kvmarm, kvm, linux-arm-kernel, Wanghaibin (D)

On 02/08/2019 11:56, Zenghui Yu wrote:
> Hi folks,
> 
> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> the following fail info:
> 
> 	[...]
> 	FAIL psci (4 tests, 1 unexpected failures)
> 	[...]
> and
> 	[...]
> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> 	FAIL: cpu-on
> 	SUMMARY: 4 tests, 1 unexpected failures
> 
> 
> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> does reset on the MPIDR register whilst another reads it.
> 
> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> itself") later moves the reset work into check_vcpu_requests(), by
> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> has not been protected by kvm->lock mutex anymore, and the race shows up
> again...
> 
> Do we need a fix for this issue? At least achieve a mutex execution
> between the reset of MPIDR and kvm_mpidr_to_vcpu()?

The thing is that the way we reset registers is marginally insane.
Yes, it catches most reset bugs. It also introduces many more in
the rest of the paths.

The fun part is that there is hardly a need for resetting MPIDR.
It has already been set when we've created the vcpu. It is the
poisoning of the sysreg array that creates a situation where
the MPIDR is temporarily invalid.

So instead of poisoning the array, how about we just keep
track of the registers for which we've called a reset function?
It should be enough to track the most obvious bugs... I've
cobbled the following patch together, which seems to fix the
issue on my TX2 with 64 vcpus.

Thoughts?

	M.

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f26e181d881c..17f46ee7dc83 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 }
 
 static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
-			      const struct sys_reg_desc *table, size_t num)
+				const struct sys_reg_desc *table, size_t num,
+				unsigned long *bmap)
 {
 	unsigned long i;
 
 	for (i = 0; i < num; i++)
-		if (table[i].reset)
+		if (table[i].reset) {
 			table[i].reset(vcpu, &table[i]);
+			if (bmap)
+				set_bit(i, bmap);
+		}
 }
 
 /**
@@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
  */
 void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
 {
+	unsigned long *bmap;
 	size_t num;
 	const struct sys_reg_desc *table;
 
-	/* Catch someone adding a register without putting in reset entry. */
-	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
+	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);
 
 	/* Generic chip reset first (so target could override). */
-	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
 
 	table = get_target_table(vcpu->arch.target, true, &num);
-	reset_sys_reg_descs(vcpu, table, num);
+	reset_sys_reg_descs(vcpu, table, num, bmap);
 
 	for (num = 1; num < NR_SYS_REGS; num++) {
-		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
+		if (WARN(bmap && !test_bit(num, bmap),
 			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
 			break;
 	}
+
+	kfree(bmap);
 }


-- 
Jazz is not dead, it just smells funny...

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-02 15:56   ` Marc Zyngier
  0 siblings, 0 replies; 18+ messages in thread
From: Marc Zyngier @ 2019-08-02 15:56 UTC (permalink / raw)
  To: Zenghui Yu, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: linux-arm-kernel, kvmarm, kvm

On 02/08/2019 11:56, Zenghui Yu wrote:
> Hi folks,
> 
> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> the following fail info:
> 
> 	[...]
> 	FAIL psci (4 tests, 1 unexpected failures)
> 	[...]
> and
> 	[...]
> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> 	FAIL: cpu-on
> 	SUMMARY: 4 tests, 1 unexpected failures
> 
> 
> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> does reset on the MPIDR register whilst another reads it.
> 
> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> itself") later moves the reset work into check_vcpu_requests(), by
> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> has not been protected by kvm->lock mutex anymore, and the race shows up
> again...
> 
> Do we need a fix for this issue? At least achieve a mutex execution
> between the reset of MPIDR and kvm_mpidr_to_vcpu()?

The thing is that the way we reset registers is marginally insane.
Yes, it catches most reset bugs. It also introduces many more in
the rest of the paths.

The fun part is that there is hardly a need for resetting MPIDR.
It has already been set when we've created the vcpu. It is the
poisoning of the sysreg array that creates a situation where
the MPIDR is temporarily invalid.

So instead of poisoning the array, how about we just keep
track of the registers for which we've called a reset function?
It should be enough to track the most obvious bugs... I've
cobbled the following patch together, which seems to fix the
issue on my TX2 with 64 vcpus.

Thoughts?

	M.

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f26e181d881c..17f46ee7dc83 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 }
 
 static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
-			      const struct sys_reg_desc *table, size_t num)
+				const struct sys_reg_desc *table, size_t num,
+				unsigned long *bmap)
 {
 	unsigned long i;
 
 	for (i = 0; i < num; i++)
-		if (table[i].reset)
+		if (table[i].reset) {
 			table[i].reset(vcpu, &table[i]);
+			if (bmap)
+				set_bit(i, bmap);
+		}
 }
 
 /**
@@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
  */
 void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
 {
+	unsigned long *bmap;
 	size_t num;
 	const struct sys_reg_desc *table;
 
-	/* Catch someone adding a register without putting in reset entry. */
-	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
+	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);
 
 	/* Generic chip reset first (so target could override). */
-	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
 
 	table = get_target_table(vcpu->arch.target, true, &num);
-	reset_sys_reg_descs(vcpu, table, num);
+	reset_sys_reg_descs(vcpu, table, num, bmap);
 
 	for (num = 1; num < NR_SYS_REGS; num++) {
-		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
+		if (WARN(bmap && !test_bit(num, bmap),
 			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
 			break;
 	}
+
+	kfree(bmap);
 }


-- 
Jazz is not dead, it just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-02 15:56   ` Marc Zyngier
  0 siblings, 0 replies; 18+ messages in thread
From: Marc Zyngier @ 2019-08-02 15:56 UTC (permalink / raw)
  To: Zenghui Yu, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: linux-arm-kernel, Wanghaibin (D), kvmarm, kvm

On 02/08/2019 11:56, Zenghui Yu wrote:
> Hi folks,
> 
> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> the following fail info:
> 
> 	[...]
> 	FAIL psci (4 tests, 1 unexpected failures)
> 	[...]
> and
> 	[...]
> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> 	FAIL: cpu-on
> 	SUMMARY: 4 tests, 1 unexpected failures
> 
> 
> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> does reset on the MPIDR register whilst another reads it.
> 
> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> itself") later moves the reset work into check_vcpu_requests(), by
> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> has not been protected by kvm->lock mutex anymore, and the race shows up
> again...
> 
> Do we need a fix for this issue? At least achieve a mutex execution
> between the reset of MPIDR and kvm_mpidr_to_vcpu()?

The thing is that the way we reset registers is marginally insane.
Yes, it catches most reset bugs. It also introduces many more in
the rest of the paths.

The fun part is that there is hardly a need for resetting MPIDR.
It has already been set when we've created the vcpu. It is the
poisoning of the sysreg array that creates a situation where
the MPIDR is temporarily invalid.

So instead of poisoning the array, how about we just keep
track of the registers for which we've called a reset function?
It should be enough to track the most obvious bugs... I've
cobbled the following patch together, which seems to fix the
issue on my TX2 with 64 vcpus.

Thoughts?

	M.

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f26e181d881c..17f46ee7dc83 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 }
 
 static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
-			      const struct sys_reg_desc *table, size_t num)
+				const struct sys_reg_desc *table, size_t num,
+				unsigned long *bmap)
 {
 	unsigned long i;
 
 	for (i = 0; i < num; i++)
-		if (table[i].reset)
+		if (table[i].reset) {
 			table[i].reset(vcpu, &table[i]);
+			if (bmap)
+				set_bit(i, bmap);
+		}
 }
 
 /**
@@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
  */
 void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
 {
+	unsigned long *bmap;
 	size_t num;
 	const struct sys_reg_desc *table;
 
-	/* Catch someone adding a register without putting in reset entry. */
-	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
+	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);
 
 	/* Generic chip reset first (so target could override). */
-	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
 
 	table = get_target_table(vcpu->arch.target, true, &num);
-	reset_sys_reg_descs(vcpu, table, num);
+	reset_sys_reg_descs(vcpu, table, num, bmap);
 
 	for (num = 1; num < NR_SYS_REGS; num++) {
-		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
+		if (WARN(bmap && !test_bit(num, bmap),
 			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
 			break;
 	}
+
+	kfree(bmap);
 }


-- 
Jazz is not dead, it just smells funny...

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
  2019-08-02 15:56   ` Marc Zyngier
  (?)
@ 2019-08-03  9:27     ` Zenghui Yu
  -1 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-03  9:27 UTC (permalink / raw)
  To: Marc Zyngier, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: kvmarm, kvm, linux-arm-kernel, Wanghaibin (D)

Hi Marc,

On 2019/8/2 23:56, Marc Zyngier wrote:
> On 02/08/2019 11:56, Zenghui Yu wrote:
>> Hi folks,
>>
>> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
>> the following fail info:
>>
>> 	[...]
>> 	FAIL psci (4 tests, 1 unexpected failures)
>> 	[...]
>> and
>> 	[...]
>> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
>> 	FAIL: cpu-on
>> 	SUMMARY: 4 tests, 1 unexpected failures
>>
>>
>> I think this is an issue had been fixed once by commit 6c7a5dce22b3
>> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
>> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
>> does reset on the MPIDR register whilst another reads it.
>>
>> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
>> itself") later moves the reset work into check_vcpu_requests(), by
>> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
>> has not been protected by kvm->lock mutex anymore, and the race shows up
>> again...
>>
>> Do we need a fix for this issue? At least achieve a mutex execution
>> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
> 
> The thing is that the way we reset registers is marginally insane.
> Yes, it catches most reset bugs. It also introduces many more in
> the rest of the paths.
> 
> The fun part is that there is hardly a need for resetting MPIDR.
> It has already been set when we've created the vcpu. It is the

(That means we can let reset_mpidr() do nothing?)

> poisoning of the sysreg array that creates a situation where
> the MPIDR is temporarily invalid.
> 
> So instead of poisoning the array, how about we just keep
> track of the registers for which we've called a reset function?
> It should be enough to track the most obvious bugs... I've

The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
It may affect our judgment?

> cobbled the following patch together, which seems to fix the
> issue on my TX2 with 64 vcpus.
> 
> Thoughts?
> 
> 	M.
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index f26e181d881c..17f46ee7dc83 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>   }
>   
>   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
> -			      const struct sys_reg_desc *table, size_t num)
> +				const struct sys_reg_desc *table, size_t num,
> +				unsigned long *bmap)
>   {
>   	unsigned long i;
>   
>   	for (i = 0; i < num; i++)
> -		if (table[i].reset)
> +		if (table[i].reset) {
>   			table[i].reset(vcpu, &table[i]);
> +			if (bmap)
> +				set_bit(i, bmap);

I think this should be:
	set_bit(table[i].reg, bmap);

Am I wrong?

> +		}
>   }
>   
>   /**
> @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
>    */
>   void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
>   {
> +	unsigned long *bmap;
>   	size_t num;
>   	const struct sys_reg_desc *table;
>   
> -	/* Catch someone adding a register without putting in reset entry. */
> -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
> +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);

LOCKDEP kernel will be not happy with this bitmap_alloc:

" BUG: sleeping function called from invalid context at mm/slab.h:501
   in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "

>   
>   	/* Generic chip reset first (so target could override). */
> -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
>   
>   	table = get_target_table(vcpu->arch.target, true, &num);
> -	reset_sys_reg_descs(vcpu, table, num);
> +	reset_sys_reg_descs(vcpu, table, num, bmap);
>   
>   	for (num = 1; num < NR_SYS_REGS; num++) {
> -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
> +		if (WARN(bmap && !test_bit(num, bmap),
>   			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
>   			break;
>   	}
> +
> +	kfree(bmap);
>   }
> 
> 

Some other minor questions about the sys reg resetting:
1. Pointer Authentication Registers haven't have reset entry yet,
    do they need? The same for ACTLR_EL1.
2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?

I will test this patch with kvm-unit-tests next week!


Thanks,
zenghui


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-03  9:27     ` Zenghui Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-03  9:27 UTC (permalink / raw)
  To: Marc Zyngier, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: linux-arm-kernel, kvmarm, kvm

Hi Marc,

On 2019/8/2 23:56, Marc Zyngier wrote:
> On 02/08/2019 11:56, Zenghui Yu wrote:
>> Hi folks,
>>
>> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
>> the following fail info:
>>
>> 	[...]
>> 	FAIL psci (4 tests, 1 unexpected failures)
>> 	[...]
>> and
>> 	[...]
>> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
>> 	FAIL: cpu-on
>> 	SUMMARY: 4 tests, 1 unexpected failures
>>
>>
>> I think this is an issue had been fixed once by commit 6c7a5dce22b3
>> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
>> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
>> does reset on the MPIDR register whilst another reads it.
>>
>> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
>> itself") later moves the reset work into check_vcpu_requests(), by
>> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
>> has not been protected by kvm->lock mutex anymore, and the race shows up
>> again...
>>
>> Do we need a fix for this issue? At least achieve a mutex execution
>> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
> 
> The thing is that the way we reset registers is marginally insane.
> Yes, it catches most reset bugs. It also introduces many more in
> the rest of the paths.
> 
> The fun part is that there is hardly a need for resetting MPIDR.
> It has already been set when we've created the vcpu. It is the

(That means we can let reset_mpidr() do nothing?)

> poisoning of the sysreg array that creates a situation where
> the MPIDR is temporarily invalid.
> 
> So instead of poisoning the array, how about we just keep
> track of the registers for which we've called a reset function?
> It should be enough to track the most obvious bugs... I've

The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
It may affect our judgment?

> cobbled the following patch together, which seems to fix the
> issue on my TX2 with 64 vcpus.
> 
> Thoughts?
> 
> 	M.
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index f26e181d881c..17f46ee7dc83 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>   }
>   
>   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
> -			      const struct sys_reg_desc *table, size_t num)
> +				const struct sys_reg_desc *table, size_t num,
> +				unsigned long *bmap)
>   {
>   	unsigned long i;
>   
>   	for (i = 0; i < num; i++)
> -		if (table[i].reset)
> +		if (table[i].reset) {
>   			table[i].reset(vcpu, &table[i]);
> +			if (bmap)
> +				set_bit(i, bmap);

I think this should be:
	set_bit(table[i].reg, bmap);

Am I wrong?

> +		}
>   }
>   
>   /**
> @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
>    */
>   void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
>   {
> +	unsigned long *bmap;
>   	size_t num;
>   	const struct sys_reg_desc *table;
>   
> -	/* Catch someone adding a register without putting in reset entry. */
> -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
> +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);

LOCKDEP kernel will be not happy with this bitmap_alloc:

" BUG: sleeping function called from invalid context at mm/slab.h:501
   in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "

>   
>   	/* Generic chip reset first (so target could override). */
> -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
>   
>   	table = get_target_table(vcpu->arch.target, true, &num);
> -	reset_sys_reg_descs(vcpu, table, num);
> +	reset_sys_reg_descs(vcpu, table, num, bmap);
>   
>   	for (num = 1; num < NR_SYS_REGS; num++) {
> -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
> +		if (WARN(bmap && !test_bit(num, bmap),
>   			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
>   			break;
>   	}
> +
> +	kfree(bmap);
>   }
> 
> 

Some other minor questions about the sys reg resetting:
1. Pointer Authentication Registers haven't have reset entry yet,
    do they need? The same for ACTLR_EL1.
2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?

I will test this patch with kvm-unit-tests next week!


Thanks,
zenghui

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-03  9:27     ` Zenghui Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-03  9:27 UTC (permalink / raw)
  To: Marc Zyngier, drjones, James Morse, julien.thierry.kdev, suzuki.poulose
  Cc: linux-arm-kernel, Wanghaibin (D), kvmarm, kvm

Hi Marc,

On 2019/8/2 23:56, Marc Zyngier wrote:
> On 02/08/2019 11:56, Zenghui Yu wrote:
>> Hi folks,
>>
>> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
>> the following fail info:
>>
>> 	[...]
>> 	FAIL psci (4 tests, 1 unexpected failures)
>> 	[...]
>> and
>> 	[...]
>> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
>> 	FAIL: cpu-on
>> 	SUMMARY: 4 tests, 1 unexpected failures
>>
>>
>> I think this is an issue had been fixed once by commit 6c7a5dce22b3
>> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
>> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
>> does reset on the MPIDR register whilst another reads it.
>>
>> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
>> itself") later moves the reset work into check_vcpu_requests(), by
>> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
>> has not been protected by kvm->lock mutex anymore, and the race shows up
>> again...
>>
>> Do we need a fix for this issue? At least achieve a mutex execution
>> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
> 
> The thing is that the way we reset registers is marginally insane.
> Yes, it catches most reset bugs. It also introduces many more in
> the rest of the paths.
> 
> The fun part is that there is hardly a need for resetting MPIDR.
> It has already been set when we've created the vcpu. It is the

(That means we can let reset_mpidr() do nothing?)

> poisoning of the sysreg array that creates a situation where
> the MPIDR is temporarily invalid.
> 
> So instead of poisoning the array, how about we just keep
> track of the registers for which we've called a reset function?
> It should be enough to track the most obvious bugs... I've

The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
It may affect our judgment?

> cobbled the following patch together, which seems to fix the
> issue on my TX2 with 64 vcpus.
> 
> Thoughts?
> 
> 	M.
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index f26e181d881c..17f46ee7dc83 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>   }
>   
>   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
> -			      const struct sys_reg_desc *table, size_t num)
> +				const struct sys_reg_desc *table, size_t num,
> +				unsigned long *bmap)
>   {
>   	unsigned long i;
>   
>   	for (i = 0; i < num; i++)
> -		if (table[i].reset)
> +		if (table[i].reset) {
>   			table[i].reset(vcpu, &table[i]);
> +			if (bmap)
> +				set_bit(i, bmap);

I think this should be:
	set_bit(table[i].reg, bmap);

Am I wrong?

> +		}
>   }
>   
>   /**
> @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
>    */
>   void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
>   {
> +	unsigned long *bmap;
>   	size_t num;
>   	const struct sys_reg_desc *table;
>   
> -	/* Catch someone adding a register without putting in reset entry. */
> -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
> +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);

LOCKDEP kernel will be not happy with this bitmap_alloc:

" BUG: sleeping function called from invalid context at mm/slab.h:501
   in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "

>   
>   	/* Generic chip reset first (so target could override). */
> -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
>   
>   	table = get_target_table(vcpu->arch.target, true, &num);
> -	reset_sys_reg_descs(vcpu, table, num);
> +	reset_sys_reg_descs(vcpu, table, num, bmap);
>   
>   	for (num = 1; num < NR_SYS_REGS; num++) {
> -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
> +		if (WARN(bmap && !test_bit(num, bmap),
>   			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
>   			break;
>   	}
> +
> +	kfree(bmap);
>   }
> 
> 

Some other minor questions about the sys reg resetting:
1. Pointer Authentication Registers haven't have reset entry yet,
    do they need? The same for ACTLR_EL1.
2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?

I will test this patch with kvm-unit-tests next week!


Thanks,
zenghui


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
  2019-08-03  9:27     ` Zenghui Yu
  (?)
@ 2019-08-03 10:10       ` Marc Zyngier
  -1 siblings, 0 replies; 18+ messages in thread
From: Marc Zyngier @ 2019-08-03 10:10 UTC (permalink / raw)
  To: Zenghui Yu
  Cc: drjones, James Morse, julien.thierry.kdev, suzuki.poulose,
	kvmarm, kvm, linux-arm-kernel, Wanghaibin (D)

On Sat, 3 Aug 2019 17:27:41 +0800
Zenghui Yu <yuzenghui@huawei.com> wrote:

> Hi Marc,
> 
> On 2019/8/2 23:56, Marc Zyngier wrote:
> > On 02/08/2019 11:56, Zenghui Yu wrote:  
> >> Hi folks,
> >>
> >> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> >> the following fail info:
> >>
> >> 	[...]
> >> 	FAIL psci (4 tests, 1 unexpected failures)
> >> 	[...]
> >> and
> >> 	[...]
> >> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> >> 	FAIL: cpu-on
> >> 	SUMMARY: 4 tests, 1 unexpected failures
> >>
> >>
> >> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> >> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> >> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> >> does reset on the MPIDR register whilst another reads it.
> >>
> >> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> >> itself") later moves the reset work into check_vcpu_requests(), by
> >> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> >> has not been protected by kvm->lock mutex anymore, and the race shows up
> >> again...
> >>
> >> Do we need a fix for this issue? At least achieve a mutex execution
> >> between the reset of MPIDR and kvm_mpidr_to_vcpu()?  
> > 
> > The thing is that the way we reset registers is marginally insane.
> > Yes, it catches most reset bugs. It also introduces many more in
> > the rest of the paths.
> > 
> > The fun part is that there is hardly a need for resetting MPIDR.
> > It has already been set when we've created the vcpu. It is the  
> 
> (That means we can let reset_mpidr() do nothing?)

It should ever be only written once, as MPIDR is a constant from the
guest perspective. So it is not that it can do nothing. It is just that
there should never be any other value written to it.

> 
> > poisoning of the sysreg array that creates a situation where
> > the MPIDR is temporarily invalid.
> > 
> > So instead of poisoning the array, how about we just keep
> > track of the registers for which we've called a reset function?
> > It should be enough to track the most obvious bugs... I've  
> 
> The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
> It may affect our judgment?

How so?

> 
> > cobbled the following patch together, which seems to fix the
> > issue on my TX2 with 64 vcpus.
> > 
> > Thoughts?
> > 
> > 	M.
> > 
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index f26e181d881c..17f46ee7dc83 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> >   }  
> >   >   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,  
> > -			      const struct sys_reg_desc *table, size_t num)
> > +				const struct sys_reg_desc *table, size_t num,
> > +				unsigned long *bmap)
> >   {
> >   	unsigned long i;  
> >   >   	for (i = 0; i < num; i++)  
> > -		if (table[i].reset)
> > +		if (table[i].reset) {
> >   			table[i].reset(vcpu, &table[i]);
> > +			if (bmap)
> > +				set_bit(i, bmap);  
> 
> I think this should be:
> 	set_bit(table[i].reg, bmap);
> 
> Am I wrong?

No, you're absolutely right.

> 
> > +		}
> >   }  
> >   >   /**  
> > @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
> >    */
> >   void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
> >   {
> > +	unsigned long *bmap;
> >   	size_t num;
> >   	const struct sys_reg_desc *table;  
> >   > -	/* Catch someone adding a register without putting in reset entry. */  
> > -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
> > +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);  
> 
> LOCKDEP kernel will be not happy with this bitmap_alloc:
> 
> " BUG: sleeping function called from invalid context at mm/slab.h:501
>    in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "

Well spotted. I guess GFP_ATOMIC is in order.

> 
> >   >   	/* Generic chip reset first (so target could override). */  
> > -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> > +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);  
> >   >   	table = get_target_table(vcpu->arch.target, true, &num);  
> > -	reset_sys_reg_descs(vcpu, table, num);
> > +	reset_sys_reg_descs(vcpu, table, num, bmap);  
> >   >   	for (num = 1; num < NR_SYS_REGS; num++) {  
> > -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
> > +		if (WARN(bmap && !test_bit(num, bmap),
> >   			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
> >   			break;
> >   	}
> > +
> > +	kfree(bmap);
> >   }
> > 
> >   
> 
> Some other minor questions about the sys reg resetting:
> 1. Pointer Authentication Registers haven't have reset entry yet,
>     do they need? The same for ACTLR_EL1.

Pointer auth registers definitely have a reset function, set to
reset_unknown. So does ACTLR_EL1, which resets to the host's value.

> 2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?

This looks like a (very minor) bug. reset_pmcr writes directly to the
PMCR_EL0 shadow register without using r->reg as the register number.
But in the light of the reset tracking we want to add, this needs
fixing.

> I will test this patch with kvm-unit-tests next week!

Well, wait until I repost something a bit less buggy...

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-03 10:10       ` Marc Zyngier
  0 siblings, 0 replies; 18+ messages in thread
From: Marc Zyngier @ 2019-08-03 10:10 UTC (permalink / raw)
  To: Zenghui Yu; +Cc: kvm, linux-arm-kernel, kvmarm

On Sat, 3 Aug 2019 17:27:41 +0800
Zenghui Yu <yuzenghui@huawei.com> wrote:

> Hi Marc,
> 
> On 2019/8/2 23:56, Marc Zyngier wrote:
> > On 02/08/2019 11:56, Zenghui Yu wrote:  
> >> Hi folks,
> >>
> >> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> >> the following fail info:
> >>
> >> 	[...]
> >> 	FAIL psci (4 tests, 1 unexpected failures)
> >> 	[...]
> >> and
> >> 	[...]
> >> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> >> 	FAIL: cpu-on
> >> 	SUMMARY: 4 tests, 1 unexpected failures
> >>
> >>
> >> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> >> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> >> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> >> does reset on the MPIDR register whilst another reads it.
> >>
> >> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> >> itself") later moves the reset work into check_vcpu_requests(), by
> >> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> >> has not been protected by kvm->lock mutex anymore, and the race shows up
> >> again...
> >>
> >> Do we need a fix for this issue? At least achieve a mutex execution
> >> between the reset of MPIDR and kvm_mpidr_to_vcpu()?  
> > 
> > The thing is that the way we reset registers is marginally insane.
> > Yes, it catches most reset bugs. It also introduces many more in
> > the rest of the paths.
> > 
> > The fun part is that there is hardly a need for resetting MPIDR.
> > It has already been set when we've created the vcpu. It is the  
> 
> (That means we can let reset_mpidr() do nothing?)

It should ever be only written once, as MPIDR is a constant from the
guest perspective. So it is not that it can do nothing. It is just that
there should never be any other value written to it.

> 
> > poisoning of the sysreg array that creates a situation where
> > the MPIDR is temporarily invalid.
> > 
> > So instead of poisoning the array, how about we just keep
> > track of the registers for which we've called a reset function?
> > It should be enough to track the most obvious bugs... I've  
> 
> The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
> It may affect our judgment?

How so?

> 
> > cobbled the following patch together, which seems to fix the
> > issue on my TX2 with 64 vcpus.
> > 
> > Thoughts?
> > 
> > 	M.
> > 
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index f26e181d881c..17f46ee7dc83 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> >   }  
> >   >   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,  
> > -			      const struct sys_reg_desc *table, size_t num)
> > +				const struct sys_reg_desc *table, size_t num,
> > +				unsigned long *bmap)
> >   {
> >   	unsigned long i;  
> >   >   	for (i = 0; i < num; i++)  
> > -		if (table[i].reset)
> > +		if (table[i].reset) {
> >   			table[i].reset(vcpu, &table[i]);
> > +			if (bmap)
> > +				set_bit(i, bmap);  
> 
> I think this should be:
> 	set_bit(table[i].reg, bmap);
> 
> Am I wrong?

No, you're absolutely right.

> 
> > +		}
> >   }  
> >   >   /**  
> > @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
> >    */
> >   void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
> >   {
> > +	unsigned long *bmap;
> >   	size_t num;
> >   	const struct sys_reg_desc *table;  
> >   > -	/* Catch someone adding a register without putting in reset entry. */  
> > -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
> > +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);  
> 
> LOCKDEP kernel will be not happy with this bitmap_alloc:
> 
> " BUG: sleeping function called from invalid context at mm/slab.h:501
>    in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "

Well spotted. I guess GFP_ATOMIC is in order.

> 
> >   >   	/* Generic chip reset first (so target could override). */  
> > -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> > +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);  
> >   >   	table = get_target_table(vcpu->arch.target, true, &num);  
> > -	reset_sys_reg_descs(vcpu, table, num);
> > +	reset_sys_reg_descs(vcpu, table, num, bmap);  
> >   >   	for (num = 1; num < NR_SYS_REGS; num++) {  
> > -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
> > +		if (WARN(bmap && !test_bit(num, bmap),
> >   			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
> >   			break;
> >   	}
> > +
> > +	kfree(bmap);
> >   }
> > 
> >   
> 
> Some other minor questions about the sys reg resetting:
> 1. Pointer Authentication Registers haven't have reset entry yet,
>     do they need? The same for ACTLR_EL1.

Pointer auth registers definitely have a reset function, set to
reset_unknown. So does ACTLR_EL1, which resets to the host's value.

> 2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?

This looks like a (very minor) bug. reset_pmcr writes directly to the
PMCR_EL0 shadow register without using r->reg as the register number.
But in the light of the reset tracking we want to add, this needs
fixing.

> I will test this patch with kvm-unit-tests next week!

Well, wait until I repost something a bit less buggy...

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-03 10:10       ` Marc Zyngier
  0 siblings, 0 replies; 18+ messages in thread
From: Marc Zyngier @ 2019-08-03 10:10 UTC (permalink / raw)
  To: Zenghui Yu
  Cc: drjones, kvm, suzuki.poulose, James Morse, linux-arm-kernel,
	Wanghaibin (D),
	kvmarm, julien.thierry.kdev

On Sat, 3 Aug 2019 17:27:41 +0800
Zenghui Yu <yuzenghui@huawei.com> wrote:

> Hi Marc,
> 
> On 2019/8/2 23:56, Marc Zyngier wrote:
> > On 02/08/2019 11:56, Zenghui Yu wrote:  
> >> Hi folks,
> >>
> >> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> >> the following fail info:
> >>
> >> 	[...]
> >> 	FAIL psci (4 tests, 1 unexpected failures)
> >> 	[...]
> >> and
> >> 	[...]
> >> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> >> 	FAIL: cpu-on
> >> 	SUMMARY: 4 tests, 1 unexpected failures
> >>
> >>
> >> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> >> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> >> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> >> does reset on the MPIDR register whilst another reads it.
> >>
> >> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> >> itself") later moves the reset work into check_vcpu_requests(), by
> >> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> >> has not been protected by kvm->lock mutex anymore, and the race shows up
> >> again...
> >>
> >> Do we need a fix for this issue? At least achieve a mutex execution
> >> between the reset of MPIDR and kvm_mpidr_to_vcpu()?  
> > 
> > The thing is that the way we reset registers is marginally insane.
> > Yes, it catches most reset bugs. It also introduces many more in
> > the rest of the paths.
> > 
> > The fun part is that there is hardly a need for resetting MPIDR.
> > It has already been set when we've created the vcpu. It is the  
> 
> (That means we can let reset_mpidr() do nothing?)

It should ever be only written once, as MPIDR is a constant from the
guest perspective. So it is not that it can do nothing. It is just that
there should never be any other value written to it.

> 
> > poisoning of the sysreg array that creates a situation where
> > the MPIDR is temporarily invalid.
> > 
> > So instead of poisoning the array, how about we just keep
> > track of the registers for which we've called a reset function?
> > It should be enough to track the most obvious bugs... I've  
> 
> The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
> It may affect our judgment?

How so?

> 
> > cobbled the following patch together, which seems to fix the
> > issue on my TX2 with 64 vcpus.
> > 
> > Thoughts?
> > 
> > 	M.
> > 
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index f26e181d881c..17f46ee7dc83 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> >   }  
> >   >   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,  
> > -			      const struct sys_reg_desc *table, size_t num)
> > +				const struct sys_reg_desc *table, size_t num,
> > +				unsigned long *bmap)
> >   {
> >   	unsigned long i;  
> >   >   	for (i = 0; i < num; i++)  
> > -		if (table[i].reset)
> > +		if (table[i].reset) {
> >   			table[i].reset(vcpu, &table[i]);
> > +			if (bmap)
> > +				set_bit(i, bmap);  
> 
> I think this should be:
> 	set_bit(table[i].reg, bmap);
> 
> Am I wrong?

No, you're absolutely right.

> 
> > +		}
> >   }  
> >   >   /**  
> > @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
> >    */
> >   void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
> >   {
> > +	unsigned long *bmap;
> >   	size_t num;
> >   	const struct sys_reg_desc *table;  
> >   > -	/* Catch someone adding a register without putting in reset entry. */  
> > -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
> > +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);  
> 
> LOCKDEP kernel will be not happy with this bitmap_alloc:
> 
> " BUG: sleeping function called from invalid context at mm/slab.h:501
>    in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "

Well spotted. I guess GFP_ATOMIC is in order.

> 
> >   >   	/* Generic chip reset first (so target could override). */  
> > -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> > +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);  
> >   >   	table = get_target_table(vcpu->arch.target, true, &num);  
> > -	reset_sys_reg_descs(vcpu, table, num);
> > +	reset_sys_reg_descs(vcpu, table, num, bmap);  
> >   >   	for (num = 1; num < NR_SYS_REGS; num++) {  
> > -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
> > +		if (WARN(bmap && !test_bit(num, bmap),
> >   			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
> >   			break;
> >   	}
> > +
> > +	kfree(bmap);
> >   }
> > 
> >   
> 
> Some other minor questions about the sys reg resetting:
> 1. Pointer Authentication Registers haven't have reset entry yet,
>     do they need? The same for ACTLR_EL1.

Pointer auth registers definitely have a reset function, set to
reset_unknown. So does ACTLR_EL1, which resets to the host's value.

> 2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?

This looks like a (very minor) bug. reset_pmcr writes directly to the
PMCR_EL0 shadow register without using r->reg as the register number.
But in the light of the reset tracking we want to add, this needs
fixing.

> I will test this patch with kvm-unit-tests next week!

Well, wait until I repost something a bit less buggy...

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
  2019-08-03 10:10       ` Marc Zyngier
  (?)
@ 2019-08-05  2:38         ` Zenghui Yu
  -1 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-05  2:38 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: drjones, James Morse, julien.thierry.kdev, suzuki.poulose,
	kvmarm, kvm, linux-arm-kernel, Wanghaibin (D)

Hi Marc,

On 2019/8/3 18:10, Marc Zyngier wrote:
> On Sat, 3 Aug 2019 17:27:41 +0800
> Zenghui Yu <yuzenghui@huawei.com> wrote:
> 
>> Hi Marc,
>>
>> On 2019/8/2 23:56, Marc Zyngier wrote:
>>> On 02/08/2019 11:56, Zenghui Yu wrote:
>>>> Hi folks,
>>>>
>>>> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
>>>> the following fail info:
>>>>
>>>> 	[...]
>>>> 	FAIL psci (4 tests, 1 unexpected failures)
>>>> 	[...]
>>>> and
>>>> 	[...]
>>>> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
>>>> 	FAIL: cpu-on
>>>> 	SUMMARY: 4 tests, 1 unexpected failures
>>>>
>>>>
>>>> I think this is an issue had been fixed once by commit 6c7a5dce22b3
>>>> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
>>>> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
>>>> does reset on the MPIDR register whilst another reads it.
>>>>
>>>> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
>>>> itself") later moves the reset work into check_vcpu_requests(), by
>>>> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
>>>> has not been protected by kvm->lock mutex anymore, and the race shows up
>>>> again...
>>>>
>>>> Do we need a fix for this issue? At least achieve a mutex execution
>>>> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
>>>
>>> The thing is that the way we reset registers is marginally insane.
>>> Yes, it catches most reset bugs. It also introduces many more in
>>> the rest of the paths.
>>>
>>> The fun part is that there is hardly a need for resetting MPIDR.
>>> It has already been set when we've created the vcpu. It is the
>>
>> (That means we can let reset_mpidr() do nothing?)
> 
> It should ever be only written once, as MPIDR is a constant from the
> guest perspective. So it is not that it can do nothing. It is just that
> there should never be any other value written to it.

Thanks for this explanation.

>>
>>> poisoning of the sysreg array that creates a situation where
>>> the MPIDR is temporarily invalid.
>>>
>>> So instead of poisoning the array, how about we just keep
>>> track of the registers for which we've called a reset function?
>>> It should be enough to track the most obvious bugs... I've
>>
>> The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
>> It may affect our judgment?
> 
> How so?

bmap[0..15] will be set multiple times. But it also will not affect
anything now (it's safe).

>>
>>> cobbled the following patch together, which seems to fix the
>>> issue on my TX2 with 64 vcpus.
>>>
>>> Thoughts?
>>>
>>> 	M.
>>>
>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>> index f26e181d881c..17f46ee7dc83 100644
>>> --- a/arch/arm64/kvm/sys_regs.c
>>> +++ b/arch/arm64/kvm/sys_regs.c
>>> @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>>>    }
>>>    >   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
>>> -			      const struct sys_reg_desc *table, size_t num)
>>> +				const struct sys_reg_desc *table, size_t num,
>>> +				unsigned long *bmap)
>>>    {
>>>    	unsigned long i;
>>>    >   	for (i = 0; i < num; i++)
>>> -		if (table[i].reset)
>>> +		if (table[i].reset) {
>>>    			table[i].reset(vcpu, &table[i]);
>>> +			if (bmap)
>>> +				set_bit(i, bmap);
>>
>> I think this should be:
>> 	set_bit(table[i].reg, bmap);
>>
>> Am I wrong?
> 
> No, you're absolutely right.
> 
>>
>>> +		}
>>>    }
>>>    >   /**
>>> @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
>>>     */
>>>    void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
>>>    {
>>> +	unsigned long *bmap;
>>>    	size_t num;
>>>    	const struct sys_reg_desc *table;
>>>    > -	/* Catch someone adding a register without putting in reset entry. */
>>> -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
>>> +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);
>>
>> LOCKDEP kernel will be not happy with this bitmap_alloc:
>>
>> " BUG: sleeping function called from invalid context at mm/slab.h:501
>>     in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "
> 
> Well spotted. I guess GFP_ATOMIC is in order.
> 
>>
>>>    >   	/* Generic chip reset first (so target could override). */
>>> -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
>>> +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
>>>    >   	table = get_target_table(vcpu->arch.target, true, &num);
>>> -	reset_sys_reg_descs(vcpu, table, num);
>>> +	reset_sys_reg_descs(vcpu, table, num, bmap);
>>>    >   	for (num = 1; num < NR_SYS_REGS; num++) {
>>> -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
>>> +		if (WARN(bmap && !test_bit(num, bmap),
>>>    			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
>>>    			break;
>>>    	}
>>> +
>>> +	kfree(bmap);
>>>    }
>>>
>>>    
>>
>> Some other minor questions about the sys reg resetting:
>> 1. Pointer Authentication Registers haven't have reset entry yet,
>>      do they need? The same for ACTLR_EL1.
> 
> Pointer auth registers definitely have a reset function, set to
> reset_unknown. So does ACTLR_EL1, which resets to the host's value.

I find them now :-)

> 
>> 2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?
> 
> This looks like a (very minor) bug. reset_pmcr writes directly to the
> PMCR_EL0 shadow register without using r->reg as the register number.
> But in the light of the reset tracking we want to add, this needs
> fixing.
> 
>> I will test this patch with kvm-unit-tests next week!
> 
> Well, wait until I repost something a bit less buggy...

Thanks,
zenghui



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-05  2:38         ` Zenghui Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-05  2:38 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, linux-arm-kernel, kvmarm

Hi Marc,

On 2019/8/3 18:10, Marc Zyngier wrote:
> On Sat, 3 Aug 2019 17:27:41 +0800
> Zenghui Yu <yuzenghui@huawei.com> wrote:
> 
>> Hi Marc,
>>
>> On 2019/8/2 23:56, Marc Zyngier wrote:
>>> On 02/08/2019 11:56, Zenghui Yu wrote:
>>>> Hi folks,
>>>>
>>>> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
>>>> the following fail info:
>>>>
>>>> 	[...]
>>>> 	FAIL psci (4 tests, 1 unexpected failures)
>>>> 	[...]
>>>> and
>>>> 	[...]
>>>> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
>>>> 	FAIL: cpu-on
>>>> 	SUMMARY: 4 tests, 1 unexpected failures
>>>>
>>>>
>>>> I think this is an issue had been fixed once by commit 6c7a5dce22b3
>>>> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
>>>> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
>>>> does reset on the MPIDR register whilst another reads it.
>>>>
>>>> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
>>>> itself") later moves the reset work into check_vcpu_requests(), by
>>>> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
>>>> has not been protected by kvm->lock mutex anymore, and the race shows up
>>>> again...
>>>>
>>>> Do we need a fix for this issue? At least achieve a mutex execution
>>>> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
>>>
>>> The thing is that the way we reset registers is marginally insane.
>>> Yes, it catches most reset bugs. It also introduces many more in
>>> the rest of the paths.
>>>
>>> The fun part is that there is hardly a need for resetting MPIDR.
>>> It has already been set when we've created the vcpu. It is the
>>
>> (That means we can let reset_mpidr() do nothing?)
> 
> It should ever be only written once, as MPIDR is a constant from the
> guest perspective. So it is not that it can do nothing. It is just that
> there should never be any other value written to it.

Thanks for this explanation.

>>
>>> poisoning of the sysreg array that creates a situation where
>>> the MPIDR is temporarily invalid.
>>>
>>> So instead of poisoning the array, how about we just keep
>>> track of the registers for which we've called a reset function?
>>> It should be enough to track the most obvious bugs... I've
>>
>> The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
>> It may affect our judgment?
> 
> How so?

bmap[0..15] will be set multiple times. But it also will not affect
anything now (it's safe).

>>
>>> cobbled the following patch together, which seems to fix the
>>> issue on my TX2 with 64 vcpus.
>>>
>>> Thoughts?
>>>
>>> 	M.
>>>
>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>> index f26e181d881c..17f46ee7dc83 100644
>>> --- a/arch/arm64/kvm/sys_regs.c
>>> +++ b/arch/arm64/kvm/sys_regs.c
>>> @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>>>    }
>>>    >   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
>>> -			      const struct sys_reg_desc *table, size_t num)
>>> +				const struct sys_reg_desc *table, size_t num,
>>> +				unsigned long *bmap)
>>>    {
>>>    	unsigned long i;
>>>    >   	for (i = 0; i < num; i++)
>>> -		if (table[i].reset)
>>> +		if (table[i].reset) {
>>>    			table[i].reset(vcpu, &table[i]);
>>> +			if (bmap)
>>> +				set_bit(i, bmap);
>>
>> I think this should be:
>> 	set_bit(table[i].reg, bmap);
>>
>> Am I wrong?
> 
> No, you're absolutely right.
> 
>>
>>> +		}
>>>    }
>>>    >   /**
>>> @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
>>>     */
>>>    void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
>>>    {
>>> +	unsigned long *bmap;
>>>    	size_t num;
>>>    	const struct sys_reg_desc *table;
>>>    > -	/* Catch someone adding a register without putting in reset entry. */
>>> -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
>>> +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);
>>
>> LOCKDEP kernel will be not happy with this bitmap_alloc:
>>
>> " BUG: sleeping function called from invalid context at mm/slab.h:501
>>     in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "
> 
> Well spotted. I guess GFP_ATOMIC is in order.
> 
>>
>>>    >   	/* Generic chip reset first (so target could override). */
>>> -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
>>> +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
>>>    >   	table = get_target_table(vcpu->arch.target, true, &num);
>>> -	reset_sys_reg_descs(vcpu, table, num);
>>> +	reset_sys_reg_descs(vcpu, table, num, bmap);
>>>    >   	for (num = 1; num < NR_SYS_REGS; num++) {
>>> -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
>>> +		if (WARN(bmap && !test_bit(num, bmap),
>>>    			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
>>>    			break;
>>>    	}
>>> +
>>> +	kfree(bmap);
>>>    }
>>>
>>>    
>>
>> Some other minor questions about the sys reg resetting:
>> 1. Pointer Authentication Registers haven't have reset entry yet,
>>      do they need? The same for ACTLR_EL1.
> 
> Pointer auth registers definitely have a reset function, set to
> reset_unknown. So does ACTLR_EL1, which resets to the host's value.

I find them now :-)

> 
>> 2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?
> 
> This looks like a (very minor) bug. reset_pmcr writes directly to the
> PMCR_EL0 shadow register without using r->reg as the register number.
> But in the light of the reset tracking we want to add, this needs
> fixing.
> 
>> I will test this patch with kvm-unit-tests next week!
> 
> Well, wait until I repost something a bit less buggy...

Thanks,
zenghui


_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kvm-unit-tests: psci_cpu_on_test FAILed
@ 2019-08-05  2:38         ` Zenghui Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Zenghui Yu @ 2019-08-05  2:38 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: drjones, kvm, suzuki.poulose, James Morse, linux-arm-kernel,
	Wanghaibin (D),
	kvmarm, julien.thierry.kdev

Hi Marc,

On 2019/8/3 18:10, Marc Zyngier wrote:
> On Sat, 3 Aug 2019 17:27:41 +0800
> Zenghui Yu <yuzenghui@huawei.com> wrote:
> 
>> Hi Marc,
>>
>> On 2019/8/2 23:56, Marc Zyngier wrote:
>>> On 02/08/2019 11:56, Zenghui Yu wrote:
>>>> Hi folks,
>>>>
>>>> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
>>>> the following fail info:
>>>>
>>>> 	[...]
>>>> 	FAIL psci (4 tests, 1 unexpected failures)
>>>> 	[...]
>>>> and
>>>> 	[...]
>>>> 	INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
>>>> 	FAIL: cpu-on
>>>> 	SUMMARY: 4 tests, 1 unexpected failures
>>>>
>>>>
>>>> I think this is an issue had been fixed once by commit 6c7a5dce22b3
>>>> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
>>>> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
>>>> does reset on the MPIDR register whilst another reads it.
>>>>
>>>> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
>>>> itself") later moves the reset work into check_vcpu_requests(), by
>>>> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
>>>> has not been protected by kvm->lock mutex anymore, and the race shows up
>>>> again...
>>>>
>>>> Do we need a fix for this issue? At least achieve a mutex execution
>>>> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
>>>
>>> The thing is that the way we reset registers is marginally insane.
>>> Yes, it catches most reset bugs. It also introduces many more in
>>> the rest of the paths.
>>>
>>> The fun part is that there is hardly a need for resetting MPIDR.
>>> It has already been set when we've created the vcpu. It is the
>>
>> (That means we can let reset_mpidr() do nothing?)
> 
> It should ever be only written once, as MPIDR is a constant from the
> guest perspective. So it is not that it can do nothing. It is just that
> there should never be any other value written to it.

Thanks for this explanation.

>>
>>> poisoning of the sysreg array that creates a situation where
>>> the MPIDR is temporarily invalid.
>>>
>>> So instead of poisoning the array, how about we just keep
>>> track of the registers for which we've called a reset function?
>>> It should be enough to track the most obvious bugs... I've
>>
>> The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
>> It may affect our judgment?
> 
> How so?

bmap[0..15] will be set multiple times. But it also will not affect
anything now (it's safe).

>>
>>> cobbled the following patch together, which seems to fix the
>>> issue on my TX2 with 64 vcpus.
>>>
>>> Thoughts?
>>>
>>> 	M.
>>>
>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>> index f26e181d881c..17f46ee7dc83 100644
>>> --- a/arch/arm64/kvm/sys_regs.c
>>> +++ b/arch/arm64/kvm/sys_regs.c
>>> @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>>>    }
>>>    >   static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
>>> -			      const struct sys_reg_desc *table, size_t num)
>>> +				const struct sys_reg_desc *table, size_t num,
>>> +				unsigned long *bmap)
>>>    {
>>>    	unsigned long i;
>>>    >   	for (i = 0; i < num; i++)
>>> -		if (table[i].reset)
>>> +		if (table[i].reset) {
>>>    			table[i].reset(vcpu, &table[i]);
>>> +			if (bmap)
>>> +				set_bit(i, bmap);
>>
>> I think this should be:
>> 	set_bit(table[i].reg, bmap);
>>
>> Am I wrong?
> 
> No, you're absolutely right.
> 
>>
>>> +		}
>>>    }
>>>    >   /**
>>> @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
>>>     */
>>>    void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
>>>    {
>>> +	unsigned long *bmap;
>>>    	size_t num;
>>>    	const struct sys_reg_desc *table;
>>>    > -	/* Catch someone adding a register without putting in reset entry. */
>>> -	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
>>> +	bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);
>>
>> LOCKDEP kernel will be not happy with this bitmap_alloc:
>>
>> " BUG: sleeping function called from invalid context at mm/slab.h:501
>>     in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "
> 
> Well spotted. I guess GFP_ATOMIC is in order.
> 
>>
>>>    >   	/* Generic chip reset first (so target could override). */
>>> -	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
>>> +	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
>>>    >   	table = get_target_table(vcpu->arch.target, true, &num);
>>> -	reset_sys_reg_descs(vcpu, table, num);
>>> +	reset_sys_reg_descs(vcpu, table, num, bmap);
>>>    >   	for (num = 1; num < NR_SYS_REGS; num++) {
>>> -		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
>>> +		if (WARN(bmap && !test_bit(num, bmap),
>>>    			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
>>>    			break;
>>>    	}
>>> +
>>> +	kfree(bmap);
>>>    }
>>>
>>>    
>>
>> Some other minor questions about the sys reg resetting:
>> 1. Pointer Authentication Registers haven't have reset entry yet,
>>      do they need? The same for ACTLR_EL1.
> 
> Pointer auth registers definitely have a reset function, set to
> reset_unknown. So does ACTLR_EL1, which resets to the host's value.

I find them now :-)

> 
>> 2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?
> 
> This looks like a (very minor) bug. reset_pmcr writes directly to the
> PMCR_EL0 shadow register without using r->reg as the register number.
> But in the light of the reset tracking we want to add, this needs
> fixing.
> 
>> I will test this patch with kvm-unit-tests next week!
> 
> Well, wait until I repost something a bit less buggy...

Thanks,
zenghui



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-08-05  2:39 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-02 10:56 kvm-unit-tests: psci_cpu_on_test FAILed Zenghui Yu
2019-08-02 10:56 ` Zenghui Yu
2019-08-02 10:56 ` Zenghui Yu
2019-08-02 11:32 ` Andrew Jones
2019-08-02 11:32   ` Andrew Jones
2019-08-02 11:32   ` Andrew Jones
2019-08-02 15:56 ` Marc Zyngier
2019-08-02 15:56   ` Marc Zyngier
2019-08-02 15:56   ` Marc Zyngier
2019-08-03  9:27   ` Zenghui Yu
2019-08-03  9:27     ` Zenghui Yu
2019-08-03  9:27     ` Zenghui Yu
2019-08-03 10:10     ` Marc Zyngier
2019-08-03 10:10       ` Marc Zyngier
2019-08-03 10:10       ` Marc Zyngier
2019-08-05  2:38       ` Zenghui Yu
2019-08-05  2:38         ` Zenghui Yu
2019-08-05  2:38         ` Zenghui Yu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.