All of lore.kernel.org
 help / color / mirror / Atom feed
* Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-17  7:27 ` Jingyi Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-17  7:27 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm, maz; +Cc: chengjian8, Martin.Weidmann

Hi Marc,

The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
ready in kvm_vgic_flush_hwstate() for better performance.

Most time it works, but we have met an error on our hardware recently.
In preemptable kernel, the vcpu can be preempted between vcpu_load and
kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
its_clear_vpend_valid() is called

	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
	val &= ~GICR_VPENDBASER_Valid;
	val &= ~clr;
	val |= set;
	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);


The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
fail and report ""ITS virtual pending table not cleaning".

We have communicated with Martin from ARM and get the conclusion
that we should not change valid bit while the dirty bit not clear——
"The dirty bit reports whether the last schedule /de-schedule
operation has completed.The restriction on not changing Valid when Dirty
is 1, is so that hardware can always complete the last operation for
starting the next".

I think maybe we can check dirty bit clear before clearing the valid bit
in its_clear_vpend_valid() code. Hope to know your opinion about this
issue.

Thanks,
Jingyi







_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-17  7:27 ` Jingyi Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-17  7:27 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm, maz
  Cc: wanghaibin.wang, yuzenghui, Martin.Weidmann, tangnianyao,
	chengjian8, Jingyi Wang

Hi Marc,

The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
ready in kvm_vgic_flush_hwstate() for better performance.

Most time it works, but we have met an error on our hardware recently.
In preemptable kernel, the vcpu can be preempted between vcpu_load and
kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
its_clear_vpend_valid() is called

	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
	val &= ~GICR_VPENDBASER_Valid;
	val &= ~clr;
	val |= set;
	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);


The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
fail and report ""ITS virtual pending table not cleaning".

We have communicated with Martin from ARM and get the conclusion
that we should not change valid bit while the dirty bit not clear——
"The dirty bit reports whether the last schedule /de-schedule
operation has completed.The restriction on not changing Valid when Dirty
is 1, is so that hardware can always complete the last operation for
starting the next".

I think maybe we can check dirty bit clear before clearing the valid bit
in its_clear_vpend_valid() code. Hope to know your opinion about this
issue.

Thanks,
Jingyi








^ permalink raw reply	[flat|nested] 15+ messages in thread

* Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-17  7:27 ` Jingyi Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-17  7:27 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm, maz
  Cc: wanghaibin.wang, yuzenghui, Martin.Weidmann, tangnianyao,
	chengjian8, Jingyi Wang

Hi Marc,

The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
ready in kvm_vgic_flush_hwstate() for better performance.

Most time it works, but we have met an error on our hardware recently.
In preemptable kernel, the vcpu can be preempted between vcpu_load and
kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
its_clear_vpend_valid() is called

	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
	val &= ~GICR_VPENDBASER_Valid;
	val &= ~clr;
	val |= set;
	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);


The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
fail and report ""ITS virtual pending table not cleaning".

We have communicated with Martin from ARM and get the conclusion
that we should not change valid bit while the dirty bit not clear——
"The dirty bit reports whether the last schedule /de-schedule
operation has completed.The restriction on not changing Valid when Dirty
is 1, is so that hardware can always complete the last operation for
starting the next".

I think maybe we can check dirty bit clear before clearing the valid bit
in its_clear_vpend_valid() code. Hope to know your opinion about this
issue.

Thanks,
Jingyi








_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
  2022-03-17  7:27 ` Jingyi Wang
  (?)
@ 2022-03-17 10:17   ` Marc Zyngier
  -1 siblings, 0 replies; 15+ messages in thread
From: Marc Zyngier @ 2022-03-17 10:17 UTC (permalink / raw)
  To: Jingyi Wang
  Cc: linux-arm-kernel, kvmarm, kvm, wanghaibin.wang, yuzenghui,
	Martin.Weidmann, tangnianyao, chengjian8

Hi Jingyi,

On Thu, 17 Mar 2022 07:27:45 +0000,
Jingyi Wang <wangjingyi11@huawei.com> wrote:
> 
> Hi Marc,
> 
> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
> ready in kvm_vgic_flush_hwstate() for better performance.
> 
> Most time it works, but we have met an error on our hardware recently.
> In preemptable kernel, the vcpu can be preempted between vcpu_load and
> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
> its_clear_vpend_valid() is called
> 
> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> 	val &= ~GICR_VPENDBASER_Valid;
> 	val &= ~clr;
> 	val |= set;
> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> 
> 
> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
> fail and report ""ITS virtual pending table not cleaning".
> 
> We have communicated with Martin from ARM and get the conclusion
> that we should not change valid bit while the dirty bit not clear——
> "The dirty bit reports whether the last schedule /de-schedule
> operation has completed.The restriction on not changing Valid when Dirty
> is 1, is so that hardware can always complete the last operation for
> starting the next".

Indeed, the spec is crystal clear about that, and clearing Valid while
Dirty is set is plain wrong.

> 
> I think maybe we can check dirty bit clear before clearing the valid bit
> in its_clear_vpend_valid() code. Hope to know your opinion about this
> issue.

Yes, that's what should happen. I came up with the patch below. Please
give it a shot and let me know if that helps. If it does, I'll queue
it as a fix.

Thanks,

	M.

From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Thu, 17 Mar 2022 09:49:02 +0000
Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
 before descheduling

The way KVM drives GICv4.{0,1} is as follows:
- vcpu_load() makes the VPE resident, instructing the RD to start
  scanning for interrupts
- just before entering the guest, we check that the RD has finished
  scanning and that we can start running the vcpu
- on preemption, we deschedule the VPE by making it invalid on
  the RD

However, we are preemptible between the first two steps. If it so
happens *and* that the RD was still scanning, we nonetheless write
to the GICR_VPENDBASER register while Dirty is set, and bad things
happen (we're in UNPRED land).

This affects both the 4.0 and 4.1 implementations.

Make sure Dirty is cleared before performing the deschedule,
meaning that its_clear_vpend_valid() becomes a sort of full VPE
residency barrier.

Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
bit")
Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9e93ff2b6375..c9b1df980899 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
 	return 0;
 }
 
-static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
 {
 	u32 count = 1000000;	/* 1s! */
 	bool clean;
 	u64 val;
 
-	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
-	val &= ~GICR_VPENDBASER_Valid;
-	val &= ~clr;
-	val |= set;
-	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
-
 	do {
 		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
 		clean = !(val & GICR_VPENDBASER_Dirty);
@@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
 		}
 	} while (!clean && count);
 
-	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
+	if (unlikely(!clean))
 		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
+
+	return val;
+}
+
+static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+{
+	u64 val;
+
+	/* Make sure we wait until the RD is done with the initial scan */
+	val = read_vpend_dirty_clear(vlpi_base);
+	val &= ~GICR_VPENDBASER_Valid;
+	val &= ~clr;
+	val |= set;
+	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
+
+	val = read_vpend_dirty_clear(vlpi_base);
+	if (unlikely(val & GICR_VPENDBASER_Dirty))
 		val |= GICR_VPENDBASER_PendingLast;
-	}
 
 	return val;
 }
-- 
2.34.1


-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-17 10:17   ` Marc Zyngier
  0 siblings, 0 replies; 15+ messages in thread
From: Marc Zyngier @ 2022-03-17 10:17 UTC (permalink / raw)
  To: Jingyi Wang; +Cc: kvm, chengjian8, Martin.Weidmann, kvmarm, linux-arm-kernel

Hi Jingyi,

On Thu, 17 Mar 2022 07:27:45 +0000,
Jingyi Wang <wangjingyi11@huawei.com> wrote:
> 
> Hi Marc,
> 
> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
> ready in kvm_vgic_flush_hwstate() for better performance.
> 
> Most time it works, but we have met an error on our hardware recently.
> In preemptable kernel, the vcpu can be preempted between vcpu_load and
> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
> its_clear_vpend_valid() is called
> 
> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> 	val &= ~GICR_VPENDBASER_Valid;
> 	val &= ~clr;
> 	val |= set;
> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> 
> 
> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
> fail and report ""ITS virtual pending table not cleaning".
> 
> We have communicated with Martin from ARM and get the conclusion
> that we should not change valid bit while the dirty bit not clear——
> "The dirty bit reports whether the last schedule /de-schedule
> operation has completed.The restriction on not changing Valid when Dirty
> is 1, is so that hardware can always complete the last operation for
> starting the next".

Indeed, the spec is crystal clear about that, and clearing Valid while
Dirty is set is plain wrong.

> 
> I think maybe we can check dirty bit clear before clearing the valid bit
> in its_clear_vpend_valid() code. Hope to know your opinion about this
> issue.

Yes, that's what should happen. I came up with the patch below. Please
give it a shot and let me know if that helps. If it does, I'll queue
it as a fix.

Thanks,

	M.

From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Thu, 17 Mar 2022 09:49:02 +0000
Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
 before descheduling

The way KVM drives GICv4.{0,1} is as follows:
- vcpu_load() makes the VPE resident, instructing the RD to start
  scanning for interrupts
- just before entering the guest, we check that the RD has finished
  scanning and that we can start running the vcpu
- on preemption, we deschedule the VPE by making it invalid on
  the RD

However, we are preemptible between the first two steps. If it so
happens *and* that the RD was still scanning, we nonetheless write
to the GICR_VPENDBASER register while Dirty is set, and bad things
happen (we're in UNPRED land).

This affects both the 4.0 and 4.1 implementations.

Make sure Dirty is cleared before performing the deschedule,
meaning that its_clear_vpend_valid() becomes a sort of full VPE
residency barrier.

Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
bit")
Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9e93ff2b6375..c9b1df980899 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
 	return 0;
 }
 
-static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
 {
 	u32 count = 1000000;	/* 1s! */
 	bool clean;
 	u64 val;
 
-	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
-	val &= ~GICR_VPENDBASER_Valid;
-	val &= ~clr;
-	val |= set;
-	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
-
 	do {
 		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
 		clean = !(val & GICR_VPENDBASER_Dirty);
@@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
 		}
 	} while (!clean && count);
 
-	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
+	if (unlikely(!clean))
 		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
+
+	return val;
+}
+
+static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+{
+	u64 val;
+
+	/* Make sure we wait until the RD is done with the initial scan */
+	val = read_vpend_dirty_clear(vlpi_base);
+	val &= ~GICR_VPENDBASER_Valid;
+	val &= ~clr;
+	val |= set;
+	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
+
+	val = read_vpend_dirty_clear(vlpi_base);
+	if (unlikely(val & GICR_VPENDBASER_Dirty))
 		val |= GICR_VPENDBASER_PendingLast;
-	}
 
 	return val;
 }
-- 
2.34.1


-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-17 10:17   ` Marc Zyngier
  0 siblings, 0 replies; 15+ messages in thread
From: Marc Zyngier @ 2022-03-17 10:17 UTC (permalink / raw)
  To: Jingyi Wang
  Cc: linux-arm-kernel, kvmarm, kvm, wanghaibin.wang, yuzenghui,
	Martin.Weidmann, tangnianyao, chengjian8

Hi Jingyi,

On Thu, 17 Mar 2022 07:27:45 +0000,
Jingyi Wang <wangjingyi11@huawei.com> wrote:
> 
> Hi Marc,
> 
> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
> ready in kvm_vgic_flush_hwstate() for better performance.
> 
> Most time it works, but we have met an error on our hardware recently.
> In preemptable kernel, the vcpu can be preempted between vcpu_load and
> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
> its_clear_vpend_valid() is called
> 
> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> 	val &= ~GICR_VPENDBASER_Valid;
> 	val &= ~clr;
> 	val |= set;
> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> 
> 
> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
> fail and report ""ITS virtual pending table not cleaning".
> 
> We have communicated with Martin from ARM and get the conclusion
> that we should not change valid bit while the dirty bit not clear——
> "The dirty bit reports whether the last schedule /de-schedule
> operation has completed.The restriction on not changing Valid when Dirty
> is 1, is so that hardware can always complete the last operation for
> starting the next".

Indeed, the spec is crystal clear about that, and clearing Valid while
Dirty is set is plain wrong.

> 
> I think maybe we can check dirty bit clear before clearing the valid bit
> in its_clear_vpend_valid() code. Hope to know your opinion about this
> issue.

Yes, that's what should happen. I came up with the patch below. Please
give it a shot and let me know if that helps. If it does, I'll queue
it as a fix.

Thanks,

	M.

From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Thu, 17 Mar 2022 09:49:02 +0000
Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
 before descheduling

The way KVM drives GICv4.{0,1} is as follows:
- vcpu_load() makes the VPE resident, instructing the RD to start
  scanning for interrupts
- just before entering the guest, we check that the RD has finished
  scanning and that we can start running the vcpu
- on preemption, we deschedule the VPE by making it invalid on
  the RD

However, we are preemptible between the first two steps. If it so
happens *and* that the RD was still scanning, we nonetheless write
to the GICR_VPENDBASER register while Dirty is set, and bad things
happen (we're in UNPRED land).

This affects both the 4.0 and 4.1 implementations.

Make sure Dirty is cleared before performing the deschedule,
meaning that its_clear_vpend_valid() becomes a sort of full VPE
residency barrier.

Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
bit")
Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9e93ff2b6375..c9b1df980899 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
 	return 0;
 }
 
-static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
 {
 	u32 count = 1000000;	/* 1s! */
 	bool clean;
 	u64 val;
 
-	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
-	val &= ~GICR_VPENDBASER_Valid;
-	val &= ~clr;
-	val |= set;
-	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
-
 	do {
 		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
 		clean = !(val & GICR_VPENDBASER_Dirty);
@@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
 		}
 	} while (!clean && count);
 
-	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
+	if (unlikely(!clean))
 		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
+
+	return val;
+}
+
+static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+{
+	u64 val;
+
+	/* Make sure we wait until the RD is done with the initial scan */
+	val = read_vpend_dirty_clear(vlpi_base);
+	val &= ~GICR_VPENDBASER_Valid;
+	val &= ~clr;
+	val |= set;
+	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
+
+	val = read_vpend_dirty_clear(vlpi_base);
+	if (unlikely(val & GICR_VPENDBASER_Dirty))
 		val |= GICR_VPENDBASER_PendingLast;
-	}
 
 	return val;
 }
-- 
2.34.1


-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
  2022-03-17 10:17   ` Marc Zyngier
  (?)
@ 2022-03-18  6:14     ` Jingyi Wang
  -1 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-18  6:14 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvmarm, kvm, wanghaibin.wang, yuzenghui,
	Martin.Weidmann, tangnianyao, chengjian8



On 3/17/2022 6:17 PM, Marc Zyngier wrote:
> Hi Jingyi,
> 
> On Thu, 17 Mar 2022 07:27:45 +0000,
> Jingyi Wang <wangjingyi11@huawei.com> wrote:
>>
>> Hi Marc,
>>
>> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
>> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
>> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
>> ready in kvm_vgic_flush_hwstate() for better performance.
>>
>> Most time it works, but we have met an error on our hardware recently.
>> In preemptable kernel, the vcpu can be preempted between vcpu_load and
>> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
>> its_clear_vpend_valid() is called
>>
>> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>> 	val &= ~GICR_VPENDBASER_Valid;
>> 	val &= ~clr;
>> 	val |= set;
>> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
>>
>>
>> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
>> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
>> fail and report ""ITS virtual pending table not cleaning".
>>
>> We have communicated with Martin from ARM and get the conclusion
>> that we should not change valid bit while the dirty bit not clear——
>> "The dirty bit reports whether the last schedule /de-schedule
>> operation has completed.The restriction on not changing Valid when Dirty
>> is 1, is so that hardware can always complete the last operation for
>> starting the next".
> 
> Indeed, the spec is crystal clear about that, and clearing Valid while
> Dirty is set is plain wrong.
> 
>>
>> I think maybe we can check dirty bit clear before clearing the valid bit
>> in its_clear_vpend_valid() code. Hope to know your opinion about this
>> issue.
> 
> Yes, that's what should happen. I came up with the patch below. Please
> give it a shot and let me know if that helps. If it does, I'll queue
> it as a fix.
> 
> Thanks,
> 
> 	M.
> 

Thanks, we will test that soon.

>>From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Thu, 17 Mar 2022 09:49:02 +0000
> Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
>   before descheduling
> 
> The way KVM drives GICv4.{0,1} is as follows:
> - vcpu_load() makes the VPE resident, instructing the RD to start
>    scanning for interrupts
> - just before entering the guest, we check that the RD has finished
>    scanning and that we can start running the vcpu
> - on preemption, we deschedule the VPE by making it invalid on
>    the RD
> 
> However, we are preemptible between the first two steps. If it so
> happens *and* that the RD was still scanning, we nonetheless write
> to the GICR_VPENDBASER register while Dirty is set, and bad things
> happen (we're in UNPRED land).
> 
> This affects both the 4.0 and 4.1 implementations.
> 
> Make sure Dirty is cleared before performing the deschedule,
> meaning that its_clear_vpend_valid() becomes a sort of full VPE
> residency barrier.
> 
> Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit")
> Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
> ---
>   drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
>   1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9e93ff2b6375..c9b1df980899 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
>   	return 0;
>   }
>   
> -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
>   {
>   	u32 count = 1000000;	/* 1s! */
>   	bool clean;
>   	u64 val;
>   
> -	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> -	val &= ~GICR_VPENDBASER_Valid;
> -	val &= ~clr;
> -	val |= set;
> -	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> -
>   	do {
>   		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>   		clean = !(val & GICR_VPENDBASER_Dirty);
> @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
>   		}
>   	} while (!clean && count);
>   
> -	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
> +	if (unlikely(!clean))
>   		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
> +
> +	return val;
> +}
> +
> +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +{
> +	u64 val;
> +
> +	/* Make sure we wait until the RD is done with the initial scan */
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	val &= ~GICR_VPENDBASER_Valid;
> +	val &= ~clr;
> +	val |= set;
> +	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> +
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	if (unlikely(val & GICR_VPENDBASER_Dirty))
>   		val |= GICR_VPENDBASER_PendingLast;
> -	}
>   
>   	return val;
>   }
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-18  6:14     ` Jingyi Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-18  6:14 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, chengjian8, Martin.Weidmann, kvmarm, linux-arm-kernel



On 3/17/2022 6:17 PM, Marc Zyngier wrote:
> Hi Jingyi,
> 
> On Thu, 17 Mar 2022 07:27:45 +0000,
> Jingyi Wang <wangjingyi11@huawei.com> wrote:
>>
>> Hi Marc,
>>
>> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
>> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
>> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
>> ready in kvm_vgic_flush_hwstate() for better performance.
>>
>> Most time it works, but we have met an error on our hardware recently.
>> In preemptable kernel, the vcpu can be preempted between vcpu_load and
>> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
>> its_clear_vpend_valid() is called
>>
>> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>> 	val &= ~GICR_VPENDBASER_Valid;
>> 	val &= ~clr;
>> 	val |= set;
>> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
>>
>>
>> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
>> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
>> fail and report ""ITS virtual pending table not cleaning".
>>
>> We have communicated with Martin from ARM and get the conclusion
>> that we should not change valid bit while the dirty bit not clear——
>> "The dirty bit reports whether the last schedule /de-schedule
>> operation has completed.The restriction on not changing Valid when Dirty
>> is 1, is so that hardware can always complete the last operation for
>> starting the next".
> 
> Indeed, the spec is crystal clear about that, and clearing Valid while
> Dirty is set is plain wrong.
> 
>>
>> I think maybe we can check dirty bit clear before clearing the valid bit
>> in its_clear_vpend_valid() code. Hope to know your opinion about this
>> issue.
> 
> Yes, that's what should happen. I came up with the patch below. Please
> give it a shot and let me know if that helps. If it does, I'll queue
> it as a fix.
> 
> Thanks,
> 
> 	M.
> 

Thanks, we will test that soon.

>>From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Thu, 17 Mar 2022 09:49:02 +0000
> Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
>   before descheduling
> 
> The way KVM drives GICv4.{0,1} is as follows:
> - vcpu_load() makes the VPE resident, instructing the RD to start
>    scanning for interrupts
> - just before entering the guest, we check that the RD has finished
>    scanning and that we can start running the vcpu
> - on preemption, we deschedule the VPE by making it invalid on
>    the RD
> 
> However, we are preemptible between the first two steps. If it so
> happens *and* that the RD was still scanning, we nonetheless write
> to the GICR_VPENDBASER register while Dirty is set, and bad things
> happen (we're in UNPRED land).
> 
> This affects both the 4.0 and 4.1 implementations.
> 
> Make sure Dirty is cleared before performing the deschedule,
> meaning that its_clear_vpend_valid() becomes a sort of full VPE
> residency barrier.
> 
> Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit")
> Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
> ---
>   drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
>   1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9e93ff2b6375..c9b1df980899 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
>   	return 0;
>   }
>   
> -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
>   {
>   	u32 count = 1000000;	/* 1s! */
>   	bool clean;
>   	u64 val;
>   
> -	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> -	val &= ~GICR_VPENDBASER_Valid;
> -	val &= ~clr;
> -	val |= set;
> -	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> -
>   	do {
>   		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>   		clean = !(val & GICR_VPENDBASER_Dirty);
> @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
>   		}
>   	} while (!clean && count);
>   
> -	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
> +	if (unlikely(!clean))
>   		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
> +
> +	return val;
> +}
> +
> +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +{
> +	u64 val;
> +
> +	/* Make sure we wait until the RD is done with the initial scan */
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	val &= ~GICR_VPENDBASER_Valid;
> +	val &= ~clr;
> +	val |= set;
> +	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> +
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	if (unlikely(val & GICR_VPENDBASER_Dirty))
>   		val |= GICR_VPENDBASER_PendingLast;
> -	}
>   
>   	return val;
>   }
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-18  6:14     ` Jingyi Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-18  6:14 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvmarm, kvm, wanghaibin.wang, yuzenghui,
	Martin.Weidmann, tangnianyao, chengjian8



On 3/17/2022 6:17 PM, Marc Zyngier wrote:
> Hi Jingyi,
> 
> On Thu, 17 Mar 2022 07:27:45 +0000,
> Jingyi Wang <wangjingyi11@huawei.com> wrote:
>>
>> Hi Marc,
>>
>> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
>> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
>> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
>> ready in kvm_vgic_flush_hwstate() for better performance.
>>
>> Most time it works, but we have met an error on our hardware recently.
>> In preemptable kernel, the vcpu can be preempted between vcpu_load and
>> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
>> its_clear_vpend_valid() is called
>>
>> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>> 	val &= ~GICR_VPENDBASER_Valid;
>> 	val &= ~clr;
>> 	val |= set;
>> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
>>
>>
>> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
>> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
>> fail and report ""ITS virtual pending table not cleaning".
>>
>> We have communicated with Martin from ARM and get the conclusion
>> that we should not change valid bit while the dirty bit not clear——
>> "The dirty bit reports whether the last schedule /de-schedule
>> operation has completed.The restriction on not changing Valid when Dirty
>> is 1, is so that hardware can always complete the last operation for
>> starting the next".
> 
> Indeed, the spec is crystal clear about that, and clearing Valid while
> Dirty is set is plain wrong.
> 
>>
>> I think maybe we can check dirty bit clear before clearing the valid bit
>> in its_clear_vpend_valid() code. Hope to know your opinion about this
>> issue.
> 
> Yes, that's what should happen. I came up with the patch below. Please
> give it a shot and let me know if that helps. If it does, I'll queue
> it as a fix.
> 
> Thanks,
> 
> 	M.
> 

Thanks, we will test that soon.

>>From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Thu, 17 Mar 2022 09:49:02 +0000
> Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
>   before descheduling
> 
> The way KVM drives GICv4.{0,1} is as follows:
> - vcpu_load() makes the VPE resident, instructing the RD to start
>    scanning for interrupts
> - just before entering the guest, we check that the RD has finished
>    scanning and that we can start running the vcpu
> - on preemption, we deschedule the VPE by making it invalid on
>    the RD
> 
> However, we are preemptible between the first two steps. If it so
> happens *and* that the RD was still scanning, we nonetheless write
> to the GICR_VPENDBASER register while Dirty is set, and bad things
> happen (we're in UNPRED land).
> 
> This affects both the 4.0 and 4.1 implementations.
> 
> Make sure Dirty is cleared before performing the deschedule,
> meaning that its_clear_vpend_valid() becomes a sort of full VPE
> residency barrier.
> 
> Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit")
> Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
> ---
>   drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
>   1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9e93ff2b6375..c9b1df980899 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
>   	return 0;
>   }
>   
> -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
>   {
>   	u32 count = 1000000;	/* 1s! */
>   	bool clean;
>   	u64 val;
>   
> -	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> -	val &= ~GICR_VPENDBASER_Valid;
> -	val &= ~clr;
> -	val |= set;
> -	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> -
>   	do {
>   		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>   		clean = !(val & GICR_VPENDBASER_Dirty);
> @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
>   		}
>   	} while (!clean && count);
>   
> -	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
> +	if (unlikely(!clean))
>   		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
> +
> +	return val;
> +}
> +
> +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +{
> +	u64 val;
> +
> +	/* Make sure we wait until the RD is done with the initial scan */
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	val &= ~GICR_VPENDBASER_Valid;
> +	val &= ~clr;
> +	val |= set;
> +	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> +
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	if (unlikely(val & GICR_VPENDBASER_Dirty))
>   		val |= GICR_VPENDBASER_PendingLast;
> -	}
>   
>   	return val;
>   }
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
  2022-03-17 10:17   ` Marc Zyngier
  (?)
@ 2022-03-21  1:33     ` Jingyi Wang
  -1 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-21  1:33 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvmarm, kvm, wanghaibin.wang, yuzenghui,
	Martin.Weidmann, tangnianyao, chengjian8

Hi Marc,

On 3/17/2022 6:17 PM, Marc Zyngier wrote:
> Hi Jingyi,
> 
> On Thu, 17 Mar 2022 07:27:45 +0000,
> Jingyi Wang <wangjingyi11@huawei.com> wrote:
>>
>> Hi Marc,
>>
>> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
>> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
>> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
>> ready in kvm_vgic_flush_hwstate() for better performance.
>>
>> Most time it works, but we have met an error on our hardware recently.
>> In preemptable kernel, the vcpu can be preempted between vcpu_load and
>> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
>> its_clear_vpend_valid() is called
>>
>> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>> 	val &= ~GICR_VPENDBASER_Valid;
>> 	val &= ~clr;
>> 	val |= set;
>> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
>>
>>
>> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
>> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
>> fail and report ""ITS virtual pending table not cleaning".
>>
>> We have communicated with Martin from ARM and get the conclusion
>> that we should not change valid bit while the dirty bit not clear——
>> "The dirty bit reports whether the last schedule /de-schedule
>> operation has completed.The restriction on not changing Valid when Dirty
>> is 1, is so that hardware can always complete the last operation for
>> starting the next".
> 
> Indeed, the spec is crystal clear about that, and clearing Valid while
> Dirty is set is plain wrong.
> 
>>
>> I think maybe we can check dirty bit clear before clearing the valid bit
>> in its_clear_vpend_valid() code. Hope to know your opinion about this
>> issue.
> 
> Yes, that's what should happen. I came up with the patch below. Please
> give it a shot and let me know if that helps. If it does, I'll queue
> it as a fix.
> 
> Thanks,
> 
> 	M.
> 
>>From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Thu, 17 Mar 2022 09:49:02 +0000
> Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
>   before descheduling
> 
> The way KVM drives GICv4.{0,1} is as follows:
> - vcpu_load() makes the VPE resident, instructing the RD to start
>    scanning for interrupts
> - just before entering the guest, we check that the RD has finished
>    scanning and that we can start running the vcpu
> - on preemption, we deschedule the VPE by making it invalid on
>    the RD
> 
> However, we are preemptible between the first two steps. If it so
> happens *and* that the RD was still scanning, we nonetheless write
> to the GICR_VPENDBASER register while Dirty is set, and bad things
> happen (we're in UNPRED land).
> 
> This affects both the 4.0 and 4.1 implementations.
> 
> Make sure Dirty is cleared before performing the deschedule,
> meaning that its_clear_vpend_valid() becomes a sort of full VPE
> residency barrier.
> 
> Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit")
> Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
> ---
>   drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
>   1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9e93ff2b6375..c9b1df980899 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
>   	return 0;
>   }
>   
> -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
>   {
>   	u32 count = 1000000;	/* 1s! */
>   	bool clean;
>   	u64 val;
>   
> -	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> -	val &= ~GICR_VPENDBASER_Valid;
> -	val &= ~clr;
> -	val |= set;
> -	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> -
>   	do {
>   		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>   		clean = !(val & GICR_VPENDBASER_Dirty);
> @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
>   		}
>   	} while (!clean && count);
>   
> -	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
> +	if (unlikely(!clean))
>   		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
> +
> +	return val;
> +}
> +
> +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +{
> +	u64 val;
> +
> +	/* Make sure we wait until the RD is done with the initial scan */
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	val &= ~GICR_VPENDBASER_Valid;
> +	val &= ~clr;
> +	val |= set;
> +	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> +
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	if (unlikely(val & GICR_VPENDBASER_Dirty))
>   		val |= GICR_VPENDBASER_PendingLast;
> -	}
>   
>   	return val;
>   }
> 


This patch works fine on the hardware.
Tested-by:Nianyao Tang <tangnianyao@huawei.com>

Thanks,
Jingyi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-21  1:33     ` Jingyi Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-21  1:33 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, chengjian8, Martin.Weidmann, kvmarm, linux-arm-kernel

Hi Marc,

On 3/17/2022 6:17 PM, Marc Zyngier wrote:
> Hi Jingyi,
> 
> On Thu, 17 Mar 2022 07:27:45 +0000,
> Jingyi Wang <wangjingyi11@huawei.com> wrote:
>>
>> Hi Marc,
>>
>> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
>> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
>> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
>> ready in kvm_vgic_flush_hwstate() for better performance.
>>
>> Most time it works, but we have met an error on our hardware recently.
>> In preemptable kernel, the vcpu can be preempted between vcpu_load and
>> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
>> its_clear_vpend_valid() is called
>>
>> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>> 	val &= ~GICR_VPENDBASER_Valid;
>> 	val &= ~clr;
>> 	val |= set;
>> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
>>
>>
>> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
>> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
>> fail and report ""ITS virtual pending table not cleaning".
>>
>> We have communicated with Martin from ARM and get the conclusion
>> that we should not change valid bit while the dirty bit not clear——
>> "The dirty bit reports whether the last schedule /de-schedule
>> operation has completed.The restriction on not changing Valid when Dirty
>> is 1, is so that hardware can always complete the last operation for
>> starting the next".
> 
> Indeed, the spec is crystal clear about that, and clearing Valid while
> Dirty is set is plain wrong.
> 
>>
>> I think maybe we can check dirty bit clear before clearing the valid bit
>> in its_clear_vpend_valid() code. Hope to know your opinion about this
>> issue.
> 
> Yes, that's what should happen. I came up with the patch below. Please
> give it a shot and let me know if that helps. If it does, I'll queue
> it as a fix.
> 
> Thanks,
> 
> 	M.
> 
>>From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Thu, 17 Mar 2022 09:49:02 +0000
> Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
>   before descheduling
> 
> The way KVM drives GICv4.{0,1} is as follows:
> - vcpu_load() makes the VPE resident, instructing the RD to start
>    scanning for interrupts
> - just before entering the guest, we check that the RD has finished
>    scanning and that we can start running the vcpu
> - on preemption, we deschedule the VPE by making it invalid on
>    the RD
> 
> However, we are preemptible between the first two steps. If it so
> happens *and* that the RD was still scanning, we nonetheless write
> to the GICR_VPENDBASER register while Dirty is set, and bad things
> happen (we're in UNPRED land).
> 
> This affects both the 4.0 and 4.1 implementations.
> 
> Make sure Dirty is cleared before performing the deschedule,
> meaning that its_clear_vpend_valid() becomes a sort of full VPE
> residency barrier.
> 
> Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit")
> Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
> ---
>   drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
>   1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9e93ff2b6375..c9b1df980899 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
>   	return 0;
>   }
>   
> -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
>   {
>   	u32 count = 1000000;	/* 1s! */
>   	bool clean;
>   	u64 val;
>   
> -	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> -	val &= ~GICR_VPENDBASER_Valid;
> -	val &= ~clr;
> -	val |= set;
> -	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> -
>   	do {
>   		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>   		clean = !(val & GICR_VPENDBASER_Dirty);
> @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
>   		}
>   	} while (!clean && count);
>   
> -	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
> +	if (unlikely(!clean))
>   		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
> +
> +	return val;
> +}
> +
> +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +{
> +	u64 val;
> +
> +	/* Make sure we wait until the RD is done with the initial scan */
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	val &= ~GICR_VPENDBASER_Valid;
> +	val &= ~clr;
> +	val |= set;
> +	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> +
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	if (unlikely(val & GICR_VPENDBASER_Dirty))
>   		val |= GICR_VPENDBASER_PendingLast;
> -	}
>   
>   	return val;
>   }
> 


This patch works fine on the hardware.
Tested-by:Nianyao Tang <tangnianyao@huawei.com>

Thanks,
Jingyi
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Report an error on GICv4.1 vcpu de-schedule
@ 2022-03-21  1:33     ` Jingyi Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jingyi Wang @ 2022-03-21  1:33 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvmarm, kvm, wanghaibin.wang, yuzenghui,
	Martin.Weidmann, tangnianyao, chengjian8

Hi Marc,

On 3/17/2022 6:17 PM, Marc Zyngier wrote:
> Hi Jingyi,
> 
> On Thu, 17 Mar 2022 07:27:45 +0000,
> Jingyi Wang <wangjingyi11@huawei.com> wrote:
>>
>> Hi Marc,
>>
>> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
>> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
>> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
>> ready in kvm_vgic_flush_hwstate() for better performance.
>>
>> Most time it works, but we have met an error on our hardware recently.
>> In preemptable kernel, the vcpu can be preempted between vcpu_load and
>> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
>> its_clear_vpend_valid() is called
>>
>> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>> 	val &= ~GICR_VPENDBASER_Valid;
>> 	val &= ~clr;
>> 	val |= set;
>> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
>>
>>
>> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
>> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
>> fail and report ""ITS virtual pending table not cleaning".
>>
>> We have communicated with Martin from ARM and get the conclusion
>> that we should not change valid bit while the dirty bit not clear——
>> "The dirty bit reports whether the last schedule /de-schedule
>> operation has completed.The restriction on not changing Valid when Dirty
>> is 1, is so that hardware can always complete the last operation for
>> starting the next".
> 
> Indeed, the spec is crystal clear about that, and clearing Valid while
> Dirty is set is plain wrong.
> 
>>
>> I think maybe we can check dirty bit clear before clearing the valid bit
>> in its_clear_vpend_valid() code. Hope to know your opinion about this
>> issue.
> 
> Yes, that's what should happen. I came up with the patch below. Please
> give it a shot and let me know if that helps. If it does, I'll queue
> it as a fix.
> 
> Thanks,
> 
> 	M.
> 
>>From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Thu, 17 Mar 2022 09:49:02 +0000
> Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
>   before descheduling
> 
> The way KVM drives GICv4.{0,1} is as follows:
> - vcpu_load() makes the VPE resident, instructing the RD to start
>    scanning for interrupts
> - just before entering the guest, we check that the RD has finished
>    scanning and that we can start running the vcpu
> - on preemption, we deschedule the VPE by making it invalid on
>    the RD
> 
> However, we are preemptible between the first two steps. If it so
> happens *and* that the RD was still scanning, we nonetheless write
> to the GICR_VPENDBASER register while Dirty is set, and bad things
> happen (we're in UNPRED land).
> 
> This affects both the 4.0 and 4.1 implementations.
> 
> Make sure Dirty is cleared before performing the deschedule,
> meaning that its_clear_vpend_valid() becomes a sort of full VPE
> residency barrier.
> 
> Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit")
> Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
> ---
>   drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
>   1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9e93ff2b6375..c9b1df980899 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
>   	return 0;
>   }
>   
> -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
>   {
>   	u32 count = 1000000;	/* 1s! */
>   	bool clean;
>   	u64 val;
>   
> -	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> -	val &= ~GICR_VPENDBASER_Valid;
> -	val &= ~clr;
> -	val |= set;
> -	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> -
>   	do {
>   		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>   		clean = !(val & GICR_VPENDBASER_Dirty);
> @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
>   		}
>   	} while (!clean && count);
>   
> -	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
> +	if (unlikely(!clean))
>   		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
> +
> +	return val;
> +}
> +
> +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +{
> +	u64 val;
> +
> +	/* Make sure we wait until the RD is done with the initial scan */
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	val &= ~GICR_VPENDBASER_Valid;
> +	val &= ~clr;
> +	val |= set;
> +	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> +
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	if (unlikely(val & GICR_VPENDBASER_Dirty))
>   		val |= GICR_VPENDBASER_PendingLast;
> -	}
>   
>   	return val;
>   }
> 


This patch works fine on the hardware.
Tested-by:Nianyao Tang <tangnianyao@huawei.com>

Thanks,
Jingyi

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [irqchip: irq/irqchip-next] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling
  2022-03-17  7:27 ` Jingyi Wang
                   ` (2 preceding siblings ...)
  (?)
@ 2022-03-21  8:59 ` irqchip-bot for Marc Zyngier
  -1 siblings, 0 replies; 15+ messages in thread
From: irqchip-bot for Marc Zyngier @ 2022-03-21  8:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jingyi Wang, Nianyao Tang, Marc Zyngier, tglx

The following commit has been merged into the irq/irqchip-next branch of irqchip:

Commit-ID:     686121ebe6fce266f16978e98ecda35a72c39dbf
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/686121ebe6fce266f16978e98ecda35a72c39dbf
Author:        Marc Zyngier <maz@kernel.org>
AuthorDate:    Thu, 17 Mar 2022 09:49:02 
Committer:     Marc Zyngier <maz@kernel.org>
CommitterDate: Mon, 21 Mar 2022 08:53:05 

irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling

The way KVM drives GICv4.{0,1} is as follows:
- vcpu_load() makes the VPE resident, instructing the RD to start
  scanning for interrupts
- just before entering the guest, we check that the RD has finished
  scanning and that we can start running the vcpu
- on preemption, we deschedule the VPE by making it invalid on
  the RD

However, we are preemptible between the first two steps. If it so
happens *and* that the RD was still scanning, we nonetheless write
to the GICR_VPENDBASER register while Dirty is set, and bad things
happen (we're in UNPRED land).

This affects both the 4.0 and 4.1 implementations.

Make sure Dirty is cleared before performing the deschedule,
meaning that its_clear_vpend_valid() becomes a sort of full VPE
residency barrier.

Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
Tested-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
bit")
Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9e93ff2..c9b1df9 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
 	return 0;
 }
 
-static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
 {
 	u32 count = 1000000;	/* 1s! */
 	bool clean;
 	u64 val;
 
-	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
-	val &= ~GICR_VPENDBASER_Valid;
-	val &= ~clr;
-	val |= set;
-	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
-
 	do {
 		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
 		clean = !(val & GICR_VPENDBASER_Dirty);
@@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
 		}
 	} while (!clean && count);
 
-	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
+	if (unlikely(!clean))
 		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
+
+	return val;
+}
+
+static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+{
+	u64 val;
+
+	/* Make sure we wait until the RD is done with the initial scan */
+	val = read_vpend_dirty_clear(vlpi_base);
+	val &= ~GICR_VPENDBASER_Valid;
+	val &= ~clr;
+	val |= set;
+	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
+
+	val = read_vpend_dirty_clear(vlpi_base);
+	if (unlikely(val & GICR_VPENDBASER_Dirty))
 		val |= GICR_VPENDBASER_PendingLast;
-	}
 
 	return val;
 }

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [irqchip: irq/irqchip-next] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling
  2022-03-17  7:27 ` Jingyi Wang
                   ` (3 preceding siblings ...)
  (?)
@ 2022-03-21 14:07 ` irqchip-bot for Marc Zyngier
  -1 siblings, 0 replies; 15+ messages in thread
From: irqchip-bot for Marc Zyngier @ 2022-03-21 14:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jingyi Wang, Nianyao Tang, Marc Zyngier, tglx

The following commit has been merged into the irq/irqchip-next branch of irqchip:

Commit-ID:     e307414a346d99ead7b1e962daee331e71467d18
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/e307414a346d99ead7b1e962daee331e71467d18
Author:        Marc Zyngier <maz@kernel.org>
AuthorDate:    Thu, 17 Mar 2022 09:49:02 
Committer:     Marc Zyngier <maz@kernel.org>
CommitterDate: Mon, 21 Mar 2022 14:02:32 

irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling

The way KVM drives GICv4.{0,1} is as follows:
- vcpu_load() makes the VPE resident, instructing the RD to start
  scanning for interrupts
- just before entering the guest, we check that the RD has finished
  scanning and that we can start running the vcpu
- on preemption, we deschedule the VPE by making it invalid on
  the RD

However, we are preemptible between the first two steps. If it so
happens *and* that the RD was still scanning, we nonetheless write
to the GICR_VPENDBASER register while Dirty is set, and bad things
happen (we're in UNPRED land).

This affects both the 4.0 and 4.1 implementations.

Make sure Dirty is cleared before performing the deschedule,
meaning that its_clear_vpend_valid() becomes a sort of full VPE
residency barrier.

Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
Tested-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit")
Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9e93ff2..c9b1df9 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
 	return 0;
 }
 
-static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
 {
 	u32 count = 1000000;	/* 1s! */
 	bool clean;
 	u64 val;
 
-	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
-	val &= ~GICR_VPENDBASER_Valid;
-	val &= ~clr;
-	val |= set;
-	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
-
 	do {
 		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
 		clean = !(val & GICR_VPENDBASER_Dirty);
@@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
 		}
 	} while (!clean && count);
 
-	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
+	if (unlikely(!clean))
 		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
+
+	return val;
+}
+
+static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+{
+	u64 val;
+
+	/* Make sure we wait until the RD is done with the initial scan */
+	val = read_vpend_dirty_clear(vlpi_base);
+	val &= ~GICR_VPENDBASER_Valid;
+	val &= ~clr;
+	val |= set;
+	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
+
+	val = read_vpend_dirty_clear(vlpi_base);
+	if (unlikely(val & GICR_VPENDBASER_Dirty))
 		val |= GICR_VPENDBASER_PendingLast;
-	}
 
 	return val;
 }

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [irqchip: irq/irqchip-fixes] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling
  2022-03-17  7:27 ` Jingyi Wang
                   ` (4 preceding siblings ...)
  (?)
@ 2022-04-05 15:40 ` irqchip-bot for Marc Zyngier
  -1 siblings, 0 replies; 15+ messages in thread
From: irqchip-bot for Marc Zyngier @ 2022-04-05 15:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jingyi Wang, Nianyao Tang, Marc Zyngier, tglx

The following commit has been merged into the irq/irqchip-fixes branch of irqchip:

Commit-ID:     af27e41612ec7e5b4783f589b753a7c31a37aac8
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/af27e41612ec7e5b4783f589b753a7c31a37aac8
Author:        Marc Zyngier <maz@kernel.org>
AuthorDate:    Thu, 17 Mar 2022 09:49:02 
Committer:     Marc Zyngier <maz@kernel.org>
CommitterDate: Tue, 05 Apr 2022 16:33:13 +01:00

irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling

The way KVM drives GICv4.{0,1} is as follows:
- vcpu_load() makes the VPE resident, instructing the RD to start
  scanning for interrupts
- just before entering the guest, we check that the RD has finished
  scanning and that we can start running the vcpu
- on preemption, we deschedule the VPE by making it invalid on
  the RD

However, we are preemptible between the first two steps. If it so
happens *and* that the RD was still scanning, we nonetheless write
to the GICR_VPENDBASER register while Dirty is set, and bad things
happen (we're in UNPRED land).

This affects both the 4.0 and 4.1 implementations.

Make sure Dirty is cleared before performing the deschedule,
meaning that its_clear_vpend_valid() becomes a sort of full VPE
residency barrier.

Reported-by: Jingyi Wang <wangjingyi11@huawei.com>
Tested-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit")
Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index cd77297..a0fc764 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
 	return 0;
 }
 
-static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
 {
 	u32 count = 1000000;	/* 1s! */
 	bool clean;
 	u64 val;
 
-	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
-	val &= ~GICR_VPENDBASER_Valid;
-	val &= ~clr;
-	val |= set;
-	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
-
 	do {
 		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
 		clean = !(val & GICR_VPENDBASER_Dirty);
@@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
 		}
 	} while (!clean && count);
 
-	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
+	if (unlikely(!clean))
 		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
+
+	return val;
+}
+
+static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+{
+	u64 val;
+
+	/* Make sure we wait until the RD is done with the initial scan */
+	val = read_vpend_dirty_clear(vlpi_base);
+	val &= ~GICR_VPENDBASER_Valid;
+	val &= ~clr;
+	val |= set;
+	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
+
+	val = read_vpend_dirty_clear(vlpi_base);
+	if (unlikely(val & GICR_VPENDBASER_Dirty))
 		val |= GICR_VPENDBASER_PendingLast;
-	}
 
 	return val;
 }

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-04-05 23:17 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-17  7:27 Report an error on GICv4.1 vcpu de-schedule Jingyi Wang
2022-03-17  7:27 ` Jingyi Wang
2022-03-17  7:27 ` Jingyi Wang
2022-03-17 10:17 ` Marc Zyngier
2022-03-17 10:17   ` Marc Zyngier
2022-03-17 10:17   ` Marc Zyngier
2022-03-18  6:14   ` Jingyi Wang
2022-03-18  6:14     ` Jingyi Wang
2022-03-18  6:14     ` Jingyi Wang
2022-03-21  1:33   ` Jingyi Wang
2022-03-21  1:33     ` Jingyi Wang
2022-03-21  1:33     ` Jingyi Wang
2022-03-21  8:59 ` [irqchip: irq/irqchip-next] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling irqchip-bot for Marc Zyngier
2022-03-21 14:07 ` irqchip-bot for Marc Zyngier
2022-04-05 15:40 ` [irqchip: irq/irqchip-fixes] " irqchip-bot for Marc Zyngier

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.