All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] KVM: arm/arm64: GICv2-on-v3 fixes
@ 2018-03-07 12:40 ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-07 12:40 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: Andre Przywara, Christoffer Dall, Eric Auger

I've been trying to run VMs on a GICv3-based system that offers the
GICv2 compatibility feature, and noticed that they would tend to
slowly die under load.

It turned out that this is due to KVM not being exactly true to the
architecture, and ends up injecting multiple SGI with the same vintid,
which the architecture clearly outlines as a "don't do that". This bug
has been there since the first days of the "new vgic".

Another issue is that we don't use the right barriers when exiting
from the guest, as we only synchronize stores, while the architecture
requires to synchronize both loads and stores. And we miss an isb to
force execution of the previous dsb.

Unless someone shouts, I'm planning to take these patches into 4.16 as
they directly affect my laptop (yes, I'm biased).

Marc Zyngier (2):
  KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2
    on v3

 virt/kvm/arm/hyp/vgic-v3-sr.c |  3 ++-
 virt/kvm/arm/vgic/vgic.c      | 11 +----------
 2 files changed, 3 insertions(+), 11 deletions(-)

-- 
2.14.2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 0/2] KVM: arm/arm64: GICv2-on-v3 fixes
@ 2018-03-07 12:40 ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-07 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

I've been trying to run VMs on a GICv3-based system that offers the
GICv2 compatibility feature, and noticed that they would tend to
slowly die under load.

It turned out that this is due to KVM not being exactly true to the
architecture, and ends up injecting multiple SGI with the same vintid,
which the architecture clearly outlines as a "don't do that". This bug
has been there since the first days of the "new vgic".

Another issue is that we don't use the right barriers when exiting
from the guest, as we only synchronize stores, while the architecture
requires to synchronize both loads and stores. And we miss an isb to
force execution of the previous dsb.

Unless someone shouts, I'm planning to take these patches into 4.16 as
they directly affect my laptop (yes, I'm biased).

Marc Zyngier (2):
  KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2
    on v3

 virt/kvm/arm/hyp/vgic-v3-sr.c |  3 ++-
 virt/kvm/arm/vgic/vgic.c      | 11 +----------
 2 files changed, 3 insertions(+), 11 deletions(-)

-- 
2.14.2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-07 12:40 ` Marc Zyngier
@ 2018-03-07 12:40   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-07 12:40 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

The vgic code is trying to be clever when injecting GICv2 SGIs,
and will happily populate LRs with the same interrupt number if
they come from multiple vcpus (after all, they are distinct
interrupt sources).

Unfortunately, this is against the letter of the architecture,
and the GICv2 architecture spec says "Each valid interrupt stored
in the List registers must have a unique VirtualID for that
virtual CPU interface.". GICv3 has similar (although slightly
ambiguous) restrictions.

This results in guests locking up when using GICv2-on-GICv3, for
example. The obvious fix is to stop trying so hard, and inject
a single vcpu per SGI per guest entry. After all, pending SGIs
with multiple source vcpus are pretty rare, and are mostly seen
in scenario where the physical CPUs are severely overcomitted.

Cc: stable@vger.kernel.org
Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index c7c5ef190afa..1f7ff175f47b 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
 		spin_lock(&irq->irq_lock);
 
-		if (unlikely(vgic_target_oracle(irq) != vcpu))
-			goto next;
-
-		/*
-		 * If we get an SGI with multiple sources, try to get
-		 * them in all at once.
-		 */
-		do {
+		if (likely(vgic_target_oracle(irq) == vcpu))
 			vgic_populate_lr(vcpu, irq, count++);
-		} while (irq->source && count < kvm_vgic_global_state.nr_lr);
 
-next:
 		spin_unlock(&irq->irq_lock);
 
 		if (count == kvm_vgic_global_state.nr_lr) {
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-07 12:40   ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-07 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

The vgic code is trying to be clever when injecting GICv2 SGIs,
and will happily populate LRs with the same interrupt number if
they come from multiple vcpus (after all, they are distinct
interrupt sources).

Unfortunately, this is against the letter of the architecture,
and the GICv2 architecture spec says "Each valid interrupt stored
in the List registers must have a unique VirtualID for that
virtual CPU interface.". GICv3 has similar (although slightly
ambiguous) restrictions.

This results in guests locking up when using GICv2-on-GICv3, for
example. The obvious fix is to stop trying so hard, and inject
a single vcpu per SGI per guest entry. After all, pending SGIs
with multiple source vcpus are pretty rare, and are mostly seen
in scenario where the physical CPUs are severely overcomitted.

Cc: stable at vger.kernel.org
Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index c7c5ef190afa..1f7ff175f47b 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
 		spin_lock(&irq->irq_lock);
 
-		if (unlikely(vgic_target_oracle(irq) != vcpu))
-			goto next;
-
-		/*
-		 * If we get an SGI with multiple sources, try to get
-		 * them in all at once.
-		 */
-		do {
+		if (likely(vgic_target_oracle(irq) == vcpu))
 			vgic_populate_lr(vcpu, irq, count++);
-		} while (irq->source && count < kvm_vgic_global_state.nr_lr);
 
-next:
 		spin_unlock(&irq->irq_lock);
 
 		if (count == kvm_vgic_global_state.nr_lr) {
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/2] kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
  2018-03-07 12:40 ` Marc Zyngier
@ 2018-03-07 12:40   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-07 12:40 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
force synchronization between the memory-mapped guest view and
the system-register view that the hypervisor uses.

This is incorrect, as the spec calls out the need for "a DSB whose
required access type is both loads and stores with any Shareability
attribute", while we're only synchronizing stores.

We also lack an isb after the dsb to ensure that the latter has
actually been executed before we start reading stuff from the sysregs.

The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
just after.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/hyp/vgic-v3-sr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
index f5c3d6d7019e..b89ce5432214 100644
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -215,7 +215,8 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 	 * are now visible to the system register interface.
 	 */
 	if (!cpu_if->vgic_sre) {
-		dsb(st);
+		dsb(sy);
+		isb();
 		cpu_if->vgic_vmcr = read_gicreg(ICH_VMCR_EL2);
 	}
 
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/2] kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
@ 2018-03-07 12:40   ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-07 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
force synchronization between the memory-mapped guest view and
the system-register view that the hypervisor uses.

This is incorrect, as the spec calls out the need for "a DSB whose
required access type is both loads and stores with any Shareability
attribute", while we're only synchronizing stores.

We also lack an isb after the dsb to ensure that the latter has
actually been executed before we start reading stuff from the sysregs.

The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
just after.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/hyp/vgic-v3-sr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
index f5c3d6d7019e..b89ce5432214 100644
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -215,7 +215,8 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 	 * are now visible to the system register interface.
 	 */
 	if (!cpu_if->vgic_sre) {
-		dsb(st);
+		dsb(sy);
+		isb();
 		cpu_if->vgic_vmcr = read_gicreg(ICH_VMCR_EL2);
 	}
 
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-07 12:40   ` Marc Zyngier
@ 2018-03-07 14:44     ` Andre Przywara
  -1 siblings, 0 replies; 28+ messages in thread
From: Andre Przywara @ 2018-03-07 14:44 UTC (permalink / raw)
  To: Marc Zyngier, kvm, kvmarm, linux-arm-kernel; +Cc: Eric Auger, Christoffer Dall

Hi,

On 07/03/18 12:40, Marc Zyngier wrote:
> The vgic code is trying to be clever when injecting GICv2 SGIs,
> and will happily populate LRs with the same interrupt number if
> they come from multiple vcpus (after all, they are distinct
> interrupt sources).
> 
> Unfortunately, this is against the letter of the architecture,
> and the GICv2 architecture spec says "Each valid interrupt stored
> in the List registers must have a unique VirtualID for that
> virtual CPU interface.". GICv3 has similar (although slightly
> ambiguous) restrictions.

Ah, good catch. I was silently assuming that this "unique interrupt"
restriction was including the source ID, but fair enough.

> This results in guests locking up when using GICv2-on-GICv3, for
> example. The obvious fix is to stop trying so hard, and inject
> a single vcpu per SGI per guest entry. After all, pending SGIs
> with multiple source vcpus are pretty rare, and are mostly seen
> in scenario where the physical CPUs are severely overcomitted.
> 
> Cc: stable@vger.kernel.org
> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Thanks!
Andre.

> ---
>  virt/kvm/arm/vgic/vgic.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index c7c5ef190afa..1f7ff175f47b 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
>  	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
>  		spin_lock(&irq->irq_lock);
>  
> -		if (unlikely(vgic_target_oracle(irq) != vcpu))
> -			goto next;
> -
> -		/*
> -		 * If we get an SGI with multiple sources, try to get
> -		 * them in all at once.
> -		 */
> -		do {
> +		if (likely(vgic_target_oracle(irq) == vcpu))
>  			vgic_populate_lr(vcpu, irq, count++);
> -		} while (irq->source && count < kvm_vgic_global_state.nr_lr);
>  
> -next:
>  		spin_unlock(&irq->irq_lock);
>  
>  		if (count == kvm_vgic_global_state.nr_lr) {
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-07 14:44     ` Andre Przywara
  0 siblings, 0 replies; 28+ messages in thread
From: Andre Przywara @ 2018-03-07 14:44 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 07/03/18 12:40, Marc Zyngier wrote:
> The vgic code is trying to be clever when injecting GICv2 SGIs,
> and will happily populate LRs with the same interrupt number if
> they come from multiple vcpus (after all, they are distinct
> interrupt sources).
> 
> Unfortunately, this is against the letter of the architecture,
> and the GICv2 architecture spec says "Each valid interrupt stored
> in the List registers must have a unique VirtualID for that
> virtual CPU interface.". GICv3 has similar (although slightly
> ambiguous) restrictions.

Ah, good catch. I was silently assuming that this "unique interrupt"
restriction was including the source ID, but fair enough.

> This results in guests locking up when using GICv2-on-GICv3, for
> example. The obvious fix is to stop trying so hard, and inject
> a single vcpu per SGI per guest entry. After all, pending SGIs
> with multiple source vcpus are pretty rare, and are mostly seen
> in scenario where the physical CPUs are severely overcomitted.
> 
> Cc: stable at vger.kernel.org
> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Thanks!
Andre.

> ---
>  virt/kvm/arm/vgic/vgic.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index c7c5ef190afa..1f7ff175f47b 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
>  	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
>  		spin_lock(&irq->irq_lock);
>  
> -		if (unlikely(vgic_target_oracle(irq) != vcpu))
> -			goto next;
> -
> -		/*
> -		 * If we get an SGI with multiple sources, try to get
> -		 * them in all at once.
> -		 */
> -		do {
> +		if (likely(vgic_target_oracle(irq) == vcpu))
>  			vgic_populate_lr(vcpu, irq, count++);
> -		} while (irq->source && count < kvm_vgic_global_state.nr_lr);
>  
> -next:
>  		spin_unlock(&irq->irq_lock);
>  
>  		if (count == kvm_vgic_global_state.nr_lr) {
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/2] kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
  2018-03-07 12:40   ` Marc Zyngier
@ 2018-03-07 16:06     ` Andre Przywara
  -1 siblings, 0 replies; 28+ messages in thread
From: Andre Przywara @ 2018-03-07 16:06 UTC (permalink / raw)
  To: Marc Zyngier, kvm, kvmarm, linux-arm-kernel; +Cc: Eric Auger, Christoffer Dall

Hi,

On 07/03/18 12:40, Marc Zyngier wrote:
> On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
> force synchronization between the memory-mapped guest view and
> the system-register view that the hypervisor uses.
> 
> This is incorrect, as the spec calls out the need for "a DSB whose
> required access type is both loads and stores with any Shareability
> attribute", while we're only synchronizing stores.
> 
> We also lack an isb after the dsb to ensure that the latter has
> actually been executed before we start reading stuff from the sysregs.
> 
> The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
> just after.

It's a shame that the spec indeed asks for the biggest available hammer,
but: so be it!

> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre.

> ---
>  virt/kvm/arm/hyp/vgic-v3-sr.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
> index f5c3d6d7019e..b89ce5432214 100644
> --- a/virt/kvm/arm/hyp/vgic-v3-sr.c
> +++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
> @@ -215,7 +215,8 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>  	 * are now visible to the system register interface.
>  	 */
>  	if (!cpu_if->vgic_sre) {
> -		dsb(st);
> +		dsb(sy);
> +		isb();
>  		cpu_if->vgic_vmcr = read_gicreg(ICH_VMCR_EL2);
>  	}
>  
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 2/2] kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
@ 2018-03-07 16:06     ` Andre Przywara
  0 siblings, 0 replies; 28+ messages in thread
From: Andre Przywara @ 2018-03-07 16:06 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 07/03/18 12:40, Marc Zyngier wrote:
> On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
> force synchronization between the memory-mapped guest view and
> the system-register view that the hypervisor uses.
> 
> This is incorrect, as the spec calls out the need for "a DSB whose
> required access type is both loads and stores with any Shareability
> attribute", while we're only synchronizing stores.
> 
> We also lack an isb after the dsb to ensure that the latter has
> actually been executed before we start reading stuff from the sysregs.
> 
> The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
> just after.

It's a shame that the spec indeed asks for the biggest available hammer,
but: so be it!

> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre.

> ---
>  virt/kvm/arm/hyp/vgic-v3-sr.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
> index f5c3d6d7019e..b89ce5432214 100644
> --- a/virt/kvm/arm/hyp/vgic-v3-sr.c
> +++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
> @@ -215,7 +215,8 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>  	 * are now visible to the system register interface.
>  	 */
>  	if (!cpu_if->vgic_sre) {
> -		dsb(st);
> +		dsb(sy);
> +		isb();
>  		cpu_if->vgic_vmcr = read_gicreg(ICH_VMCR_EL2);
>  	}
>  
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-07 12:40   ` Marc Zyngier
@ 2018-03-07 23:34     ` Christoffer Dall
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoffer Dall @ 2018-03-07 23:34 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General

On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> The vgic code is trying to be clever when injecting GICv2 SGIs,
> and will happily populate LRs with the same interrupt number if
> they come from multiple vcpus (after all, they are distinct
> interrupt sources).
>
> Unfortunately, this is against the letter of the architecture,
> and the GICv2 architecture spec says "Each valid interrupt stored
> in the List registers must have a unique VirtualID for that
> virtual CPU interface.". GICv3 has similar (although slightly
> ambiguous) restrictions.
>
> This results in guests locking up when using GICv2-on-GICv3, for
> example. The obvious fix is to stop trying so hard, and inject
> a single vcpu per SGI per guest entry. After all, pending SGIs
> with multiple source vcpus are pretty rare, and are mostly seen
> in scenario where the physical CPUs are severely overcomitted.
>
> Cc: stable@vger.kernel.org
> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/vgic/vgic.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
>
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index c7c5ef190afa..1f7ff175f47b 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
>                 spin_lock(&irq->irq_lock);
>
> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> -                       goto next;
> -
> -               /*
> -                * If we get an SGI with multiple sources, try to get
> -                * them in all at once.
> -                */
> -               do {
> +               if (likely(vgic_target_oracle(irq) == vcpu))
>                         vgic_populate_lr(vcpu, irq, count++);

I think we need to change vgic_populate_lr to set the EOI maintenance
interrupt flag so that when the interrupt is deactivated, if there are
additional pending sources, we exit the guest and pick up the
interrupt.

An alternative would be to set the underflow interrupt, but I don't
think that would be correct for multiple priorities, because the SGI
could have a higher priority than other pending interrupts we put in
the LR.

Thanks,
Christoffer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-07 23:34     ` Christoffer Dall
  0 siblings, 0 replies; 28+ messages in thread
From: Christoffer Dall @ 2018-03-07 23:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> The vgic code is trying to be clever when injecting GICv2 SGIs,
> and will happily populate LRs with the same interrupt number if
> they come from multiple vcpus (after all, they are distinct
> interrupt sources).
>
> Unfortunately, this is against the letter of the architecture,
> and the GICv2 architecture spec says "Each valid interrupt stored
> in the List registers must have a unique VirtualID for that
> virtual CPU interface.". GICv3 has similar (although slightly
> ambiguous) restrictions.
>
> This results in guests locking up when using GICv2-on-GICv3, for
> example. The obvious fix is to stop trying so hard, and inject
> a single vcpu per SGI per guest entry. After all, pending SGIs
> with multiple source vcpus are pretty rare, and are mostly seen
> in scenario where the physical CPUs are severely overcomitted.
>
> Cc: stable at vger.kernel.org
> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/vgic/vgic.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
>
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index c7c5ef190afa..1f7ff175f47b 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
>                 spin_lock(&irq->irq_lock);
>
> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> -                       goto next;
> -
> -               /*
> -                * If we get an SGI with multiple sources, try to get
> -                * them in all at once.
> -                */
> -               do {
> +               if (likely(vgic_target_oracle(irq) == vcpu))
>                         vgic_populate_lr(vcpu, irq, count++);

I think we need to change vgic_populate_lr to set the EOI maintenance
interrupt flag so that when the interrupt is deactivated, if there are
additional pending sources, we exit the guest and pick up the
interrupt.

An alternative would be to set the underflow interrupt, but I don't
think that would be correct for multiple priorities, because the SGI
could have a higher priority than other pending interrupts we put in
the LR.

Thanks,
Christoffer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-07 23:34     ` Christoffer Dall
@ 2018-03-08 10:19       ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-08 10:19 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General

On 07/03/18 23:34, Christoffer Dall wrote:
> On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> The vgic code is trying to be clever when injecting GICv2 SGIs,
>> and will happily populate LRs with the same interrupt number if
>> they come from multiple vcpus (after all, they are distinct
>> interrupt sources).
>>
>> Unfortunately, this is against the letter of the architecture,
>> and the GICv2 architecture spec says "Each valid interrupt stored
>> in the List registers must have a unique VirtualID for that
>> virtual CPU interface.". GICv3 has similar (although slightly
>> ambiguous) restrictions.
>>
>> This results in guests locking up when using GICv2-on-GICv3, for
>> example. The obvious fix is to stop trying so hard, and inject
>> a single vcpu per SGI per guest entry. After all, pending SGIs
>> with multiple source vcpus are pretty rare, and are mostly seen
>> in scenario where the physical CPUs are severely overcomitted.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  virt/kvm/arm/vgic/vgic.c | 11 +----------
>>  1 file changed, 1 insertion(+), 10 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>> index c7c5ef190afa..1f7ff175f47b 100644
>> --- a/virt/kvm/arm/vgic/vgic.c
>> +++ b/virt/kvm/arm/vgic/vgic.c
>> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
>>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
>>                 spin_lock(&irq->irq_lock);
>>
>> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
>> -                       goto next;
>> -
>> -               /*
>> -                * If we get an SGI with multiple sources, try to get
>> -                * them in all at once.
>> -                */
>> -               do {
>> +               if (likely(vgic_target_oracle(irq) == vcpu))
>>                         vgic_populate_lr(vcpu, irq, count++);
> 
> I think we need to change vgic_populate_lr to set the EOI maintenance
> interrupt flag so that when the interrupt is deactivated, if there are
> additional pending sources, we exit the guest and pick up the
> interrupt.

Potentially. We need to be careful about about the semantics of EOI MI
with non-level interrupts (see the other thread about EOI signalling).

> An alternative would be to set the underflow interrupt, but I don't
> think that would be correct for multiple priorities, because the SGI
> could have a higher priority than other pending interrupts we put in
> the LR.

I don't think priorities are the issue (after all, we already sort the
irqs in order of priority). My worry is that underflow is allowed to
fire if there is one interrupt pending, which implies that you could
end-up in a livelock scenario if you only have one SGI pending with
multiple sources.

Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
GICv2), which delivers a a MI if no pending interrupts are present. Once
the SGI has been activated, we're guaranteed to be able to inject a new
pending one.

I like the latter, because it doesn't overload the rest of the code with
new semantics. Thoughts?

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-08 10:19       ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-08 10:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/03/18 23:34, Christoffer Dall wrote:
> On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> The vgic code is trying to be clever when injecting GICv2 SGIs,
>> and will happily populate LRs with the same interrupt number if
>> they come from multiple vcpus (after all, they are distinct
>> interrupt sources).
>>
>> Unfortunately, this is against the letter of the architecture,
>> and the GICv2 architecture spec says "Each valid interrupt stored
>> in the List registers must have a unique VirtualID for that
>> virtual CPU interface.". GICv3 has similar (although slightly
>> ambiguous) restrictions.
>>
>> This results in guests locking up when using GICv2-on-GICv3, for
>> example. The obvious fix is to stop trying so hard, and inject
>> a single vcpu per SGI per guest entry. After all, pending SGIs
>> with multiple source vcpus are pretty rare, and are mostly seen
>> in scenario where the physical CPUs are severely overcomitted.
>>
>> Cc: stable at vger.kernel.org
>> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  virt/kvm/arm/vgic/vgic.c | 11 +----------
>>  1 file changed, 1 insertion(+), 10 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>> index c7c5ef190afa..1f7ff175f47b 100644
>> --- a/virt/kvm/arm/vgic/vgic.c
>> +++ b/virt/kvm/arm/vgic/vgic.c
>> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
>>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
>>                 spin_lock(&irq->irq_lock);
>>
>> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
>> -                       goto next;
>> -
>> -               /*
>> -                * If we get an SGI with multiple sources, try to get
>> -                * them in all at once.
>> -                */
>> -               do {
>> +               if (likely(vgic_target_oracle(irq) == vcpu))
>>                         vgic_populate_lr(vcpu, irq, count++);
> 
> I think we need to change vgic_populate_lr to set the EOI maintenance
> interrupt flag so that when the interrupt is deactivated, if there are
> additional pending sources, we exit the guest and pick up the
> interrupt.

Potentially. We need to be careful about about the semantics of EOI MI
with non-level interrupts (see the other thread about EOI signalling).

> An alternative would be to set the underflow interrupt, but I don't
> think that would be correct for multiple priorities, because the SGI
> could have a higher priority than other pending interrupts we put in
> the LR.

I don't think priorities are the issue (after all, we already sort the
irqs in order of priority). My worry is that underflow is allowed to
fire if there is one interrupt pending, which implies that you could
end-up in a livelock scenario if you only have one SGI pending with
multiple sources.

Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
GICv2), which delivers a a MI if no pending interrupts are present. Once
the SGI has been activated, we're guaranteed to be able to inject a new
pending one.

I like the latter, because it doesn't overload the rest of the code with
new semantics. Thoughts?

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-08 10:19       ` Marc Zyngier
@ 2018-03-08 13:40         ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-08 13:40 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General, Eric Auger

On 08/03/18 10:19, Marc Zyngier wrote:
> On 07/03/18 23:34, Christoffer Dall wrote:
>> On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> The vgic code is trying to be clever when injecting GICv2 SGIs,
>>> and will happily populate LRs with the same interrupt number if
>>> they come from multiple vcpus (after all, they are distinct
>>> interrupt sources).
>>>
>>> Unfortunately, this is against the letter of the architecture,
>>> and the GICv2 architecture spec says "Each valid interrupt stored
>>> in the List registers must have a unique VirtualID for that
>>> virtual CPU interface.". GICv3 has similar (although slightly
>>> ambiguous) restrictions.
>>>
>>> This results in guests locking up when using GICv2-on-GICv3, for
>>> example. The obvious fix is to stop trying so hard, and inject
>>> a single vcpu per SGI per guest entry. After all, pending SGIs
>>> with multiple source vcpus are pretty rare, and are mostly seen
>>> in scenario where the physical CPUs are severely overcomitted.
>>>
>>> Cc: stable@vger.kernel.org
>>> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>>  virt/kvm/arm/vgic/vgic.c | 11 +----------
>>>  1 file changed, 1 insertion(+), 10 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>>> index c7c5ef190afa..1f7ff175f47b 100644
>>> --- a/virt/kvm/arm/vgic/vgic.c
>>> +++ b/virt/kvm/arm/vgic/vgic.c
>>> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
>>>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
>>>                 spin_lock(&irq->irq_lock);
>>>
>>> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
>>> -                       goto next;
>>> -
>>> -               /*
>>> -                * If we get an SGI with multiple sources, try to get
>>> -                * them in all at once.
>>> -                */
>>> -               do {
>>> +               if (likely(vgic_target_oracle(irq) == vcpu))
>>>                         vgic_populate_lr(vcpu, irq, count++);
>>
>> I think we need to change vgic_populate_lr to set the EOI maintenance
>> interrupt flag so that when the interrupt is deactivated, if there are
>> additional pending sources, we exit the guest and pick up the
>> interrupt.
> 
> Potentially. We need to be careful about about the semantics of EOI MI
> with non-level interrupts (see the other thread about EOI signalling).
> 
>> An alternative would be to set the underflow interrupt, but I don't
>> think that would be correct for multiple priorities, because the SGI
>> could have a higher priority than other pending interrupts we put in
>> the LR.
> 
> I don't think priorities are the issue (after all, we already sort the
> irqs in order of priority). My worry is that underflow is allowed to
> fire if there is one interrupt pending, which implies that you could
> end-up in a livelock scenario if you only have one SGI pending with
> multiple sources.
> 
> Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
> GICv2), which delivers a a MI if no pending interrupts are present. Once
> the SGI has been activated, we're guaranteed to be able to inject a new
> pending one.
> 
> I like the latter, because it doesn't overload the rest of the code with
> new semantics. Thoughts?
To illustrate what I mean, here's a quickly hacked patch that implements
this idea:

diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index c00c4c33e432..b26eccc78fb1 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -503,6 +503,7 @@
 
 #define ICH_HCR_EN			(1 << 0)
 #define ICH_HCR_UIE			(1 << 1)
+#define ICH_HCR_NPIE			(1 << 3)
 #define ICH_HCR_TC			(1 << 10)
 #define ICH_HCR_TALL0			(1 << 11)
 #define ICH_HCR_TALL1			(1 << 12)
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index d3453ee072fc..68d8b1f73682 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -84,6 +84,7 @@
 
 #define GICH_HCR_EN			(1 << 0)
 #define GICH_HCR_UIE			(1 << 1)
+#define GICH_HCR_NPIE			(1 << 3)
 
 #define GICH_LR_VIRTUALID		(0x3ff << 0)
 #define GICH_LR_PHYSID_CPUID_SHIFT	(10)
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 0be616e4ee29..9a22b74f1550 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -37,6 +37,13 @@ void vgic_v2_init_lrs(void)
 		vgic_v2_write_lr(i, 0);
 }
 
+void vgic_v2_set_npie(struct kvm_vcpu *vcpu)
+{
+	struct vgic_v2_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v2;
+
+	cpuif->vgic_hcr |= GICH_HCR_NPIE;
+}
+
 void vgic_v2_set_underflow(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v2_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v2;
@@ -63,7 +70,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
 	int lr;
 	unsigned long flags;
 
-	cpuif->vgic_hcr &= ~GICH_HCR_UIE;
+	cpuif->vgic_hcr &= ~(GICH_HCR_UIE | GICH_HCR_NPIE);
 
 	for (lr = 0; lr < vgic_cpu->used_lrs; lr++) {
 		u32 val = cpuif->vgic_lr[lr];
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index c68352b8ed28..749da7624fba 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -26,6 +26,13 @@ static bool group1_trap;
 static bool common_trap;
 static bool gicv4_enable;
 
+void vgic_v3_set_npie(struct kvm_vcpu *vcpu)
+{
+	struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
+
+	cpuif->vgic_hcr |= ICH_HCR_NPIE;
+}
+
 void vgic_v3_set_underflow(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
@@ -46,7 +53,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 	int lr;
 	unsigned long flags;
 
-	cpuif->vgic_hcr &= ~ICH_HCR_UIE;
+	cpuif->vgic_hcr &= ~(ICH_HCR_UIE | ICH_HCR_NPIE);
 
 	for (lr = 0; lr < vgic_cpu->used_lrs; lr++) {
 		u64 val = cpuif->vgic_lr[lr];
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 07126a3b1908..2cd458fa8c12 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -710,6 +710,14 @@ static inline void vgic_set_underflow(struct kvm_vcpu *vcpu)
 		vgic_v3_set_underflow(vcpu);
 }
 
+static inline void vgic_set_npie(struct kvm_vcpu *vcpu)
+{
+	if (kvm_vgic_global_state.type == VGIC_V2)
+		vgic_v2_set_npie(vcpu);
+	else
+		vgic_v3_set_npie(vcpu);
+}
+
 /* Requires the ap_list_lock to be held. */
 static int compute_ap_list_depth(struct kvm_vcpu *vcpu)
 {
@@ -737,6 +745,7 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_irq *irq;
 	int count = 0;
+	bool npie = false;
 
 	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&vgic_cpu->ap_list_lock));
 
@@ -749,6 +758,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 		if (likely(vgic_target_oracle(irq) == vcpu))
 			vgic_populate_lr(vcpu, irq, count++);
 
+		if (irq->source)
+			npie = true;
+
 		spin_unlock(&irq->irq_lock);
 
 		if (count == kvm_vgic_global_state.nr_lr) {
@@ -759,6 +771,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 		}
 	}
 
+	if (npie)
+		vgic_set_npie(vcpu);
+
 	vcpu->arch.vgic_cpu.used_lrs = count;
 
 	/* Nuke remaining LRs */
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 5b11859a1a1e..f5b8519e5546 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -160,6 +160,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu);
 void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
 void vgic_v2_clear_lr(struct kvm_vcpu *vcpu, int lr);
 void vgic_v2_set_underflow(struct kvm_vcpu *vcpu);
+void vgic_v2_set_npie(struct kvm_vcpu *vcpu);
 int vgic_v2_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr);
 int vgic_v2_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write,
 			 int offset, u32 *val);
@@ -189,6 +190,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu);
 void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
 void vgic_v3_clear_lr(struct kvm_vcpu *vcpu, int lr);
 void vgic_v3_set_underflow(struct kvm_vcpu *vcpu);
+void vgic_v3_set_npie(struct kvm_vcpu *vcpu);
 void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
 void vgic_v3_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
 void vgic_v3_enable(struct kvm_vcpu *vcpu);

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-08 13:40         ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-08 13:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/03/18 10:19, Marc Zyngier wrote:
> On 07/03/18 23:34, Christoffer Dall wrote:
>> On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> The vgic code is trying to be clever when injecting GICv2 SGIs,
>>> and will happily populate LRs with the same interrupt number if
>>> they come from multiple vcpus (after all, they are distinct
>>> interrupt sources).
>>>
>>> Unfortunately, this is against the letter of the architecture,
>>> and the GICv2 architecture spec says "Each valid interrupt stored
>>> in the List registers must have a unique VirtualID for that
>>> virtual CPU interface.". GICv3 has similar (although slightly
>>> ambiguous) restrictions.
>>>
>>> This results in guests locking up when using GICv2-on-GICv3, for
>>> example. The obvious fix is to stop trying so hard, and inject
>>> a single vcpu per SGI per guest entry. After all, pending SGIs
>>> with multiple source vcpus are pretty rare, and are mostly seen
>>> in scenario where the physical CPUs are severely overcomitted.
>>>
>>> Cc: stable at vger.kernel.org
>>> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>>  virt/kvm/arm/vgic/vgic.c | 11 +----------
>>>  1 file changed, 1 insertion(+), 10 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>>> index c7c5ef190afa..1f7ff175f47b 100644
>>> --- a/virt/kvm/arm/vgic/vgic.c
>>> +++ b/virt/kvm/arm/vgic/vgic.c
>>> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
>>>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
>>>                 spin_lock(&irq->irq_lock);
>>>
>>> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
>>> -                       goto next;
>>> -
>>> -               /*
>>> -                * If we get an SGI with multiple sources, try to get
>>> -                * them in all at once.
>>> -                */
>>> -               do {
>>> +               if (likely(vgic_target_oracle(irq) == vcpu))
>>>                         vgic_populate_lr(vcpu, irq, count++);
>>
>> I think we need to change vgic_populate_lr to set the EOI maintenance
>> interrupt flag so that when the interrupt is deactivated, if there are
>> additional pending sources, we exit the guest and pick up the
>> interrupt.
> 
> Potentially. We need to be careful about about the semantics of EOI MI
> with non-level interrupts (see the other thread about EOI signalling).
> 
>> An alternative would be to set the underflow interrupt, but I don't
>> think that would be correct for multiple priorities, because the SGI
>> could have a higher priority than other pending interrupts we put in
>> the LR.
> 
> I don't think priorities are the issue (after all, we already sort the
> irqs in order of priority). My worry is that underflow is allowed to
> fire if there is one interrupt pending, which implies that you could
> end-up in a livelock scenario if you only have one SGI pending with
> multiple sources.
> 
> Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
> GICv2), which delivers a a MI if no pending interrupts are present. Once
> the SGI has been activated, we're guaranteed to be able to inject a new
> pending one.
> 
> I like the latter, because it doesn't overload the rest of the code with
> new semantics. Thoughts?
To illustrate what I mean, here's a quickly hacked patch that implements
this idea:

diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index c00c4c33e432..b26eccc78fb1 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -503,6 +503,7 @@
 
 #define ICH_HCR_EN			(1 << 0)
 #define ICH_HCR_UIE			(1 << 1)
+#define ICH_HCR_NPIE			(1 << 3)
 #define ICH_HCR_TC			(1 << 10)
 #define ICH_HCR_TALL0			(1 << 11)
 #define ICH_HCR_TALL1			(1 << 12)
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index d3453ee072fc..68d8b1f73682 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -84,6 +84,7 @@
 
 #define GICH_HCR_EN			(1 << 0)
 #define GICH_HCR_UIE			(1 << 1)
+#define GICH_HCR_NPIE			(1 << 3)
 
 #define GICH_LR_VIRTUALID		(0x3ff << 0)
 #define GICH_LR_PHYSID_CPUID_SHIFT	(10)
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 0be616e4ee29..9a22b74f1550 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -37,6 +37,13 @@ void vgic_v2_init_lrs(void)
 		vgic_v2_write_lr(i, 0);
 }
 
+void vgic_v2_set_npie(struct kvm_vcpu *vcpu)
+{
+	struct vgic_v2_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v2;
+
+	cpuif->vgic_hcr |= GICH_HCR_NPIE;
+}
+
 void vgic_v2_set_underflow(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v2_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v2;
@@ -63,7 +70,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
 	int lr;
 	unsigned long flags;
 
-	cpuif->vgic_hcr &= ~GICH_HCR_UIE;
+	cpuif->vgic_hcr &= ~(GICH_HCR_UIE | GICH_HCR_NPIE);
 
 	for (lr = 0; lr < vgic_cpu->used_lrs; lr++) {
 		u32 val = cpuif->vgic_lr[lr];
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index c68352b8ed28..749da7624fba 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -26,6 +26,13 @@ static bool group1_trap;
 static bool common_trap;
 static bool gicv4_enable;
 
+void vgic_v3_set_npie(struct kvm_vcpu *vcpu)
+{
+	struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
+
+	cpuif->vgic_hcr |= ICH_HCR_NPIE;
+}
+
 void vgic_v3_set_underflow(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
@@ -46,7 +53,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 	int lr;
 	unsigned long flags;
 
-	cpuif->vgic_hcr &= ~ICH_HCR_UIE;
+	cpuif->vgic_hcr &= ~(ICH_HCR_UIE | ICH_HCR_NPIE);
 
 	for (lr = 0; lr < vgic_cpu->used_lrs; lr++) {
 		u64 val = cpuif->vgic_lr[lr];
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 07126a3b1908..2cd458fa8c12 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -710,6 +710,14 @@ static inline void vgic_set_underflow(struct kvm_vcpu *vcpu)
 		vgic_v3_set_underflow(vcpu);
 }
 
+static inline void vgic_set_npie(struct kvm_vcpu *vcpu)
+{
+	if (kvm_vgic_global_state.type == VGIC_V2)
+		vgic_v2_set_npie(vcpu);
+	else
+		vgic_v3_set_npie(vcpu);
+}
+
 /* Requires the ap_list_lock to be held. */
 static int compute_ap_list_depth(struct kvm_vcpu *vcpu)
 {
@@ -737,6 +745,7 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_irq *irq;
 	int count = 0;
+	bool npie = false;
 
 	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&vgic_cpu->ap_list_lock));
 
@@ -749,6 +758,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 		if (likely(vgic_target_oracle(irq) == vcpu))
 			vgic_populate_lr(vcpu, irq, count++);
 
+		if (irq->source)
+			npie = true;
+
 		spin_unlock(&irq->irq_lock);
 
 		if (count == kvm_vgic_global_state.nr_lr) {
@@ -759,6 +771,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 		}
 	}
 
+	if (npie)
+		vgic_set_npie(vcpu);
+
 	vcpu->arch.vgic_cpu.used_lrs = count;
 
 	/* Nuke remaining LRs */
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 5b11859a1a1e..f5b8519e5546 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -160,6 +160,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu);
 void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
 void vgic_v2_clear_lr(struct kvm_vcpu *vcpu, int lr);
 void vgic_v2_set_underflow(struct kvm_vcpu *vcpu);
+void vgic_v2_set_npie(struct kvm_vcpu *vcpu);
 int vgic_v2_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr);
 int vgic_v2_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write,
 			 int offset, u32 *val);
@@ -189,6 +190,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu);
 void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
 void vgic_v3_clear_lr(struct kvm_vcpu *vcpu, int lr);
 void vgic_v3_set_underflow(struct kvm_vcpu *vcpu);
+void vgic_v3_set_npie(struct kvm_vcpu *vcpu);
 void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
 void vgic_v3_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
 void vgic_v3_enable(struct kvm_vcpu *vcpu);

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-08 10:19       ` Marc Zyngier
@ 2018-03-08 16:02         ` Christoffer Dall
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoffer Dall @ 2018-03-08 16:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General, Eric Auger

On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> On 07/03/18 23:34, Christoffer Dall wrote:
> > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> >> and will happily populate LRs with the same interrupt number if
> >> they come from multiple vcpus (after all, they are distinct
> >> interrupt sources).
> >>
> >> Unfortunately, this is against the letter of the architecture,
> >> and the GICv2 architecture spec says "Each valid interrupt stored
> >> in the List registers must have a unique VirtualID for that
> >> virtual CPU interface.". GICv3 has similar (although slightly
> >> ambiguous) restrictions.
> >>
> >> This results in guests locking up when using GICv2-on-GICv3, for
> >> example. The obvious fix is to stop trying so hard, and inject
> >> a single vcpu per SGI per guest entry. After all, pending SGIs
> >> with multiple source vcpus are pretty rare, and are mostly seen
> >> in scenario where the physical CPUs are severely overcomitted.
> >>
> >> Cc: stable@vger.kernel.org
> >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  virt/kvm/arm/vgic/vgic.c | 11 +----------
> >>  1 file changed, 1 insertion(+), 10 deletions(-)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> >> index c7c5ef190afa..1f7ff175f47b 100644
> >> --- a/virt/kvm/arm/vgic/vgic.c
> >> +++ b/virt/kvm/arm/vgic/vgic.c
> >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> >>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> >>                 spin_lock(&irq->irq_lock);
> >>
> >> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> >> -                       goto next;
> >> -
> >> -               /*
> >> -                * If we get an SGI with multiple sources, try to get
> >> -                * them in all at once.
> >> -                */
> >> -               do {
> >> +               if (likely(vgic_target_oracle(irq) == vcpu))
> >>                         vgic_populate_lr(vcpu, irq, count++);
> > 
> > I think we need to change vgic_populate_lr to set the EOI maintenance
> > interrupt flag so that when the interrupt is deactivated, if there are
> > additional pending sources, we exit the guest and pick up the
> > interrupt.
> 
> Potentially. We need to be careful about about the semantics of EOI MI
> with non-level interrupts (see the other thread about EOI signalling).

I'll have a look.

> 
> > An alternative would be to set the underflow interrupt, but I don't
> > think that would be correct for multiple priorities, because the SGI
> > could have a higher priority than other pending interrupts we put in
> > the LR.
> 
> I don't think priorities are the issue (after all, we already sort the
> irqs in order of priority). 

Yes, but assume you have three pending interrupts, one SGI from two
sources, and one SPI, and assume that the SGI has priority 1 and SPI
priority 2 (lower means higher priority), then I think with underflow or
the no-pending interrupt flag, we'll deliver the SGI from the first
source, and then the SPI, and then exiting to pick up the last SGI from
the other source.  That's not how I understand the GIC architecture is
supposed to work.  Am I missing something?

> My worry is that underflow is allowed to
> fire if there is one interrupt pending, which implies that you could
> end-up in a livelock scenario if you only have one SGI pending with
> multiple sources.

Yes, doesn't work, so I think it should be a maintenance interrupt on
EOI.

> 
> Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
> GICv2), which delivers a a MI if no pending interrupts are present. Once
> the SGI has been activated, we're guaranteed to be able to inject a new
> pending one.
> 
> I like the latter, because it doesn't overload the rest of the code with
> new semantics. Thoughts?
> 

I'm fine with that if I can be proven wrong about the multiple sources
and priorities.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-08 16:02         ` Christoffer Dall
  0 siblings, 0 replies; 28+ messages in thread
From: Christoffer Dall @ 2018-03-08 16:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> On 07/03/18 23:34, Christoffer Dall wrote:
> > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> >> and will happily populate LRs with the same interrupt number if
> >> they come from multiple vcpus (after all, they are distinct
> >> interrupt sources).
> >>
> >> Unfortunately, this is against the letter of the architecture,
> >> and the GICv2 architecture spec says "Each valid interrupt stored
> >> in the List registers must have a unique VirtualID for that
> >> virtual CPU interface.". GICv3 has similar (although slightly
> >> ambiguous) restrictions.
> >>
> >> This results in guests locking up when using GICv2-on-GICv3, for
> >> example. The obvious fix is to stop trying so hard, and inject
> >> a single vcpu per SGI per guest entry. After all, pending SGIs
> >> with multiple source vcpus are pretty rare, and are mostly seen
> >> in scenario where the physical CPUs are severely overcomitted.
> >>
> >> Cc: stable at vger.kernel.org
> >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  virt/kvm/arm/vgic/vgic.c | 11 +----------
> >>  1 file changed, 1 insertion(+), 10 deletions(-)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> >> index c7c5ef190afa..1f7ff175f47b 100644
> >> --- a/virt/kvm/arm/vgic/vgic.c
> >> +++ b/virt/kvm/arm/vgic/vgic.c
> >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> >>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> >>                 spin_lock(&irq->irq_lock);
> >>
> >> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> >> -                       goto next;
> >> -
> >> -               /*
> >> -                * If we get an SGI with multiple sources, try to get
> >> -                * them in all at once.
> >> -                */
> >> -               do {
> >> +               if (likely(vgic_target_oracle(irq) == vcpu))
> >>                         vgic_populate_lr(vcpu, irq, count++);
> > 
> > I think we need to change vgic_populate_lr to set the EOI maintenance
> > interrupt flag so that when the interrupt is deactivated, if there are
> > additional pending sources, we exit the guest and pick up the
> > interrupt.
> 
> Potentially. We need to be careful about about the semantics of EOI MI
> with non-level interrupts (see the other thread about EOI signalling).

I'll have a look.

> 
> > An alternative would be to set the underflow interrupt, but I don't
> > think that would be correct for multiple priorities, because the SGI
> > could have a higher priority than other pending interrupts we put in
> > the LR.
> 
> I don't think priorities are the issue (after all, we already sort the
> irqs in order of priority). 

Yes, but assume you have three pending interrupts, one SGI from two
sources, and one SPI, and assume that the SGI has priority 1 and SPI
priority 2 (lower means higher priority), then I think with underflow or
the no-pending interrupt flag, we'll deliver the SGI from the first
source, and then the SPI, and then exiting to pick up the last SGI from
the other source.  That's not how I understand the GIC architecture is
supposed to work.  Am I missing something?

> My worry is that underflow is allowed to
> fire if there is one interrupt pending, which implies that you could
> end-up in a livelock scenario if you only have one SGI pending with
> multiple sources.

Yes, doesn't work, so I think it should be a maintenance interrupt on
EOI.

> 
> Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
> GICv2), which delivers a a MI if no pending interrupts are present. Once
> the SGI has been activated, we're guaranteed to be able to inject a new
> pending one.
> 
> I like the latter, because it doesn't overload the rest of the code with
> new semantics. Thoughts?
> 

I'm fine with that if I can be proven wrong about the multiple sources
and priorities.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-08 16:02         ` Christoffer Dall
@ 2018-03-08 17:04           ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-08 17:04 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General

On Thu, 08 Mar 2018 16:02:42 +0000,
Christoffer Dall wrote:
> 
> On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> > On 07/03/18 23:34, Christoffer Dall wrote:
> > > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> > >> and will happily populate LRs with the same interrupt number if
> > >> they come from multiple vcpus (after all, they are distinct
> > >> interrupt sources).
> > >>
> > >> Unfortunately, this is against the letter of the architecture,
> > >> and the GICv2 architecture spec says "Each valid interrupt stored
> > >> in the List registers must have a unique VirtualID for that
> > >> virtual CPU interface.". GICv3 has similar (although slightly
> > >> ambiguous) restrictions.
> > >>
> > >> This results in guests locking up when using GICv2-on-GICv3, for
> > >> example. The obvious fix is to stop trying so hard, and inject
> > >> a single vcpu per SGI per guest entry. After all, pending SGIs
> > >> with multiple source vcpus are pretty rare, and are mostly seen
> > >> in scenario where the physical CPUs are severely overcomitted.
> > >>
> > >> Cc: stable@vger.kernel.org
> > >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > >> ---
> > >>  virt/kvm/arm/vgic/vgic.c | 11 +----------
> > >>  1 file changed, 1 insertion(+), 10 deletions(-)
> > >>
> > >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > >> index c7c5ef190afa..1f7ff175f47b 100644
> > >> --- a/virt/kvm/arm/vgic/vgic.c
> > >> +++ b/virt/kvm/arm/vgic/vgic.c
> > >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> > >>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> > >>                 spin_lock(&irq->irq_lock);
> > >>
> > >> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> > >> -                       goto next;
> > >> -
> > >> -               /*
> > >> -                * If we get an SGI with multiple sources, try to get
> > >> -                * them in all at once.
> > >> -                */
> > >> -               do {
> > >> +               if (likely(vgic_target_oracle(irq) == vcpu))
> > >>                         vgic_populate_lr(vcpu, irq, count++);
> > > 
> > > I think we need to change vgic_populate_lr to set the EOI maintenance
> > > interrupt flag so that when the interrupt is deactivated, if there are
> > > additional pending sources, we exit the guest and pick up the
> > > interrupt.
> > 
> > Potentially. We need to be careful about about the semantics of EOI MI
> > with non-level interrupts (see the other thread about EOI signalling).
> 
> I'll have a look.
> 
> > 
> > > An alternative would be to set the underflow interrupt, but I don't
> > > think that would be correct for multiple priorities, because the SGI
> > > could have a higher priority than other pending interrupts we put in
> > > the LR.
> > 
> > I don't think priorities are the issue (after all, we already sort the
> > irqs in order of priority). 
> 
> Yes, but assume you have three pending interrupts, one SGI from two
> sources, and one SPI, and assume that the SGI has priority 1 and SPI
> priority 2 (lower means higher priority), then I think with underflow or
> the no-pending interrupt flag, we'll deliver the SGI from the first
> source, and then the SPI, and then exiting to pick up the last SGI from
> the other source.  That's not how I understand the GIC architecture is
> supposed to work.  Am I missing something?

No, you do have a point here. I'm so glad this is a v2 only thing.

> 
> > My worry is that underflow is allowed to
> > fire if there is one interrupt pending, which implies that you could
> > end-up in a livelock scenario if you only have one SGI pending with
> > multiple sources.
> 
> Yes, doesn't work, so I think it should be a maintenance interrupt on
> EOI.
> 
> > 
> > Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
> > GICv2), which delivers a a MI if no pending interrupts are present. Once
> > the SGI has been activated, we're guaranteed to be able to inject a new
> > pending one.
> > 
> > I like the latter, because it doesn't overload the rest of the code with
> > new semantics. Thoughts?
> > 
> 
> I'm fine with that if I can be proven wrong about the multiple sources
> and priorities.

I'll have a look at respining this with MI on EOI.

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-08 17:04           ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-08 17:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 08 Mar 2018 16:02:42 +0000,
Christoffer Dall wrote:
> 
> On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> > On 07/03/18 23:34, Christoffer Dall wrote:
> > > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> > >> and will happily populate LRs with the same interrupt number if
> > >> they come from multiple vcpus (after all, they are distinct
> > >> interrupt sources).
> > >>
> > >> Unfortunately, this is against the letter of the architecture,
> > >> and the GICv2 architecture spec says "Each valid interrupt stored
> > >> in the List registers must have a unique VirtualID for that
> > >> virtual CPU interface.". GICv3 has similar (although slightly
> > >> ambiguous) restrictions.
> > >>
> > >> This results in guests locking up when using GICv2-on-GICv3, for
> > >> example. The obvious fix is to stop trying so hard, and inject
> > >> a single vcpu per SGI per guest entry. After all, pending SGIs
> > >> with multiple source vcpus are pretty rare, and are mostly seen
> > >> in scenario where the physical CPUs are severely overcomitted.
> > >>
> > >> Cc: stable at vger.kernel.org
> > >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > >> ---
> > >>  virt/kvm/arm/vgic/vgic.c | 11 +----------
> > >>  1 file changed, 1 insertion(+), 10 deletions(-)
> > >>
> > >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > >> index c7c5ef190afa..1f7ff175f47b 100644
> > >> --- a/virt/kvm/arm/vgic/vgic.c
> > >> +++ b/virt/kvm/arm/vgic/vgic.c
> > >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> > >>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> > >>                 spin_lock(&irq->irq_lock);
> > >>
> > >> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> > >> -                       goto next;
> > >> -
> > >> -               /*
> > >> -                * If we get an SGI with multiple sources, try to get
> > >> -                * them in all at once.
> > >> -                */
> > >> -               do {
> > >> +               if (likely(vgic_target_oracle(irq) == vcpu))
> > >>                         vgic_populate_lr(vcpu, irq, count++);
> > > 
> > > I think we need to change vgic_populate_lr to set the EOI maintenance
> > > interrupt flag so that when the interrupt is deactivated, if there are
> > > additional pending sources, we exit the guest and pick up the
> > > interrupt.
> > 
> > Potentially. We need to be careful about about the semantics of EOI MI
> > with non-level interrupts (see the other thread about EOI signalling).
> 
> I'll have a look.
> 
> > 
> > > An alternative would be to set the underflow interrupt, but I don't
> > > think that would be correct for multiple priorities, because the SGI
> > > could have a higher priority than other pending interrupts we put in
> > > the LR.
> > 
> > I don't think priorities are the issue (after all, we already sort the
> > irqs in order of priority). 
> 
> Yes, but assume you have three pending interrupts, one SGI from two
> sources, and one SPI, and assume that the SGI has priority 1 and SPI
> priority 2 (lower means higher priority), then I think with underflow or
> the no-pending interrupt flag, we'll deliver the SGI from the first
> source, and then the SPI, and then exiting to pick up the last SGI from
> the other source.  That's not how I understand the GIC architecture is
> supposed to work.  Am I missing something?

No, you do have a point here. I'm so glad this is a v2 only thing.

> 
> > My worry is that underflow is allowed to
> > fire if there is one interrupt pending, which implies that you could
> > end-up in a livelock scenario if you only have one SGI pending with
> > multiple sources.
> 
> Yes, doesn't work, so I think it should be a maintenance interrupt on
> EOI.
> 
> > 
> > Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
> > GICv2), which delivers a a MI if no pending interrupts are present. Once
> > the SGI has been activated, we're guaranteed to be able to inject a new
> > pending one.
> > 
> > I like the latter, because it doesn't overload the rest of the code with
> > new semantics. Thoughts?
> > 
> 
> I'm fine with that if I can be proven wrong about the multiple sources
> and priorities.

I'll have a look at respining this with MI on EOI.

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-08 17:04           ` Marc Zyngier
@ 2018-03-08 18:39             ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-08 18:39 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General, Eric Auger

On Thu, 08 Mar 2018 17:04:38 +0000,
Marc Zyngier wrote:
> 
> On Thu, 08 Mar 2018 16:02:42 +0000,
> Christoffer Dall wrote:
> > 
> > On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> > > On 07/03/18 23:34, Christoffer Dall wrote:
> > > > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > > >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> > > >> and will happily populate LRs with the same interrupt number if
> > > >> they come from multiple vcpus (after all, they are distinct
> > > >> interrupt sources).
> > > >>
> > > >> Unfortunately, this is against the letter of the architecture,
> > > >> and the GICv2 architecture spec says "Each valid interrupt stored
> > > >> in the List registers must have a unique VirtualID for that
> > > >> virtual CPU interface.". GICv3 has similar (although slightly
> > > >> ambiguous) restrictions.
> > > >>
> > > >> This results in guests locking up when using GICv2-on-GICv3, for
> > > >> example. The obvious fix is to stop trying so hard, and inject
> > > >> a single vcpu per SGI per guest entry. After all, pending SGIs
> > > >> with multiple source vcpus are pretty rare, and are mostly seen
> > > >> in scenario where the physical CPUs are severely overcomitted.
> > > >>
> > > >> Cc: stable@vger.kernel.org
> > > >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > > >> ---
> > > >>  virt/kvm/arm/vgic/vgic.c | 11 +----------
> > > >>  1 file changed, 1 insertion(+), 10 deletions(-)
> > > >>
> > > >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > > >> index c7c5ef190afa..1f7ff175f47b 100644
> > > >> --- a/virt/kvm/arm/vgic/vgic.c
> > > >> +++ b/virt/kvm/arm/vgic/vgic.c
> > > >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> > > >>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> > > >>                 spin_lock(&irq->irq_lock);
> > > >>
> > > >> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> > > >> -                       goto next;
> > > >> -
> > > >> -               /*
> > > >> -                * If we get an SGI with multiple sources, try to get
> > > >> -                * them in all at once.
> > > >> -                */
> > > >> -               do {
> > > >> +               if (likely(vgic_target_oracle(irq) == vcpu))
> > > >>                         vgic_populate_lr(vcpu, irq, count++);
> > > > 
> > > > I think we need to change vgic_populate_lr to set the EOI maintenance
> > > > interrupt flag so that when the interrupt is deactivated, if there are
> > > > additional pending sources, we exit the guest and pick up the
> > > > interrupt.
> > > 
> > > Potentially. We need to be careful about about the semantics of EOI MI
> > > with non-level interrupts (see the other thread about EOI signalling).
> > 
> > I'll have a look.
> > 
> > > 
> > > > An alternative would be to set the underflow interrupt, but I don't
> > > > think that would be correct for multiple priorities, because the SGI
> > > > could have a higher priority than other pending interrupts we put in
> > > > the LR.
> > > 
> > > I don't think priorities are the issue (after all, we already sort the
> > > irqs in order of priority). 
> > 
> > Yes, but assume you have three pending interrupts, one SGI from two
> > sources, and one SPI, and assume that the SGI has priority 1 and SPI
> > priority 2 (lower means higher priority), then I think with underflow or
> > the no-pending interrupt flag, we'll deliver the SGI from the first
> > source, and then the SPI, and then exiting to pick up the last SGI from
> > the other source.  That's not how I understand the GIC architecture is
> > supposed to work.  Am I missing something?
> 
> No, you do have a point here. I'm so glad this is a v2 only thing.
> 
> > 
> > > My worry is that underflow is allowed to
> > > fire if there is one interrupt pending, which implies that you could
> > > end-up in a livelock scenario if you only have one SGI pending with
> > > multiple sources.
> > 
> > Yes, doesn't work, so I think it should be a maintenance interrupt on
> > EOI.

Thinking of it a bit more: MI on EOI doesn't offer much more guarantee
in the way of priority ordering. Taking your example above: Even if
you generate a MI when EOIing the SGI, there is no guarantee that
you'll take the MI before you've acked the SPI.

If you really want to offer that absolute guarantee that all the
multi-source SGIs of higher priority are delivered before anything
else, then you must make sure that only the SGIs are present in the
LRs, excluding any other interrupt on lower priority until you've
queued them all.

At that stage, I wonder if there is a point in doing anything at
all. The GICv2 architecture is too rubbish for words.

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-08 18:39             ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-08 18:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 08 Mar 2018 17:04:38 +0000,
Marc Zyngier wrote:
> 
> On Thu, 08 Mar 2018 16:02:42 +0000,
> Christoffer Dall wrote:
> > 
> > On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> > > On 07/03/18 23:34, Christoffer Dall wrote:
> > > > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > > >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> > > >> and will happily populate LRs with the same interrupt number if
> > > >> they come from multiple vcpus (after all, they are distinct
> > > >> interrupt sources).
> > > >>
> > > >> Unfortunately, this is against the letter of the architecture,
> > > >> and the GICv2 architecture spec says "Each valid interrupt stored
> > > >> in the List registers must have a unique VirtualID for that
> > > >> virtual CPU interface.". GICv3 has similar (although slightly
> > > >> ambiguous) restrictions.
> > > >>
> > > >> This results in guests locking up when using GICv2-on-GICv3, for
> > > >> example. The obvious fix is to stop trying so hard, and inject
> > > >> a single vcpu per SGI per guest entry. After all, pending SGIs
> > > >> with multiple source vcpus are pretty rare, and are mostly seen
> > > >> in scenario where the physical CPUs are severely overcomitted.
> > > >>
> > > >> Cc: stable at vger.kernel.org
> > > >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > > >> ---
> > > >>  virt/kvm/arm/vgic/vgic.c | 11 +----------
> > > >>  1 file changed, 1 insertion(+), 10 deletions(-)
> > > >>
> > > >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > > >> index c7c5ef190afa..1f7ff175f47b 100644
> > > >> --- a/virt/kvm/arm/vgic/vgic.c
> > > >> +++ b/virt/kvm/arm/vgic/vgic.c
> > > >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> > > >>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> > > >>                 spin_lock(&irq->irq_lock);
> > > >>
> > > >> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> > > >> -                       goto next;
> > > >> -
> > > >> -               /*
> > > >> -                * If we get an SGI with multiple sources, try to get
> > > >> -                * them in all at once.
> > > >> -                */
> > > >> -               do {
> > > >> +               if (likely(vgic_target_oracle(irq) == vcpu))
> > > >>                         vgic_populate_lr(vcpu, irq, count++);
> > > > 
> > > > I think we need to change vgic_populate_lr to set the EOI maintenance
> > > > interrupt flag so that when the interrupt is deactivated, if there are
> > > > additional pending sources, we exit the guest and pick up the
> > > > interrupt.
> > > 
> > > Potentially. We need to be careful about about the semantics of EOI MI
> > > with non-level interrupts (see the other thread about EOI signalling).
> > 
> > I'll have a look.
> > 
> > > 
> > > > An alternative would be to set the underflow interrupt, but I don't
> > > > think that would be correct for multiple priorities, because the SGI
> > > > could have a higher priority than other pending interrupts we put in
> > > > the LR.
> > > 
> > > I don't think priorities are the issue (after all, we already sort the
> > > irqs in order of priority). 
> > 
> > Yes, but assume you have three pending interrupts, one SGI from two
> > sources, and one SPI, and assume that the SGI has priority 1 and SPI
> > priority 2 (lower means higher priority), then I think with underflow or
> > the no-pending interrupt flag, we'll deliver the SGI from the first
> > source, and then the SPI, and then exiting to pick up the last SGI from
> > the other source.  That's not how I understand the GIC architecture is
> > supposed to work.  Am I missing something?
> 
> No, you do have a point here. I'm so glad this is a v2 only thing.
> 
> > 
> > > My worry is that underflow is allowed to
> > > fire if there is one interrupt pending, which implies that you could
> > > end-up in a livelock scenario if you only have one SGI pending with
> > > multiple sources.
> > 
> > Yes, doesn't work, so I think it should be a maintenance interrupt on
> > EOI.

Thinking of it a bit more: MI on EOI doesn't offer much more guarantee
in the way of priority ordering. Taking your example above: Even if
you generate a MI when EOIing the SGI, there is no guarantee that
you'll take the MI before you've acked the SPI.

If you really want to offer that absolute guarantee that all the
multi-source SGIs of higher priority are delivered before anything
else, then you must make sure that only the SGIs are present in the
LRs, excluding any other interrupt on lower priority until you've
queued them all.

At that stage, I wonder if there is a point in doing anything at
all. The GICv2 architecture is too rubbish for words.

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-08 18:39             ` Marc Zyngier
@ 2018-03-09 21:39               ` Christoffer Dall
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoffer Dall @ 2018-03-09 21:39 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General, Eric Auger

On Thu, Mar 08, 2018 at 06:39:20PM +0000, Marc Zyngier wrote:
> On Thu, 08 Mar 2018 17:04:38 +0000,
> Marc Zyngier wrote:
> > 
> > On Thu, 08 Mar 2018 16:02:42 +0000,
> > Christoffer Dall wrote:
> > > 
> > > On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> > > > On 07/03/18 23:34, Christoffer Dall wrote:
> > > > > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > > > >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> > > > >> and will happily populate LRs with the same interrupt number if
> > > > >> they come from multiple vcpus (after all, they are distinct
> > > > >> interrupt sources).
> > > > >>
> > > > >> Unfortunately, this is against the letter of the architecture,
> > > > >> and the GICv2 architecture spec says "Each valid interrupt stored
> > > > >> in the List registers must have a unique VirtualID for that
> > > > >> virtual CPU interface.". GICv3 has similar (although slightly
> > > > >> ambiguous) restrictions.
> > > > >>
> > > > >> This results in guests locking up when using GICv2-on-GICv3, for
> > > > >> example. The obvious fix is to stop trying so hard, and inject
> > > > >> a single vcpu per SGI per guest entry. After all, pending SGIs
> > > > >> with multiple source vcpus are pretty rare, and are mostly seen
> > > > >> in scenario where the physical CPUs are severely overcomitted.
> > > > >>
> > > > >> Cc: stable@vger.kernel.org
> > > > >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > > >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > > > >> ---
> > > > >>  virt/kvm/arm/vgic/vgic.c | 11 +----------
> > > > >>  1 file changed, 1 insertion(+), 10 deletions(-)
> > > > >>
> > > > >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > > > >> index c7c5ef190afa..1f7ff175f47b 100644
> > > > >> --- a/virt/kvm/arm/vgic/vgic.c
> > > > >> +++ b/virt/kvm/arm/vgic/vgic.c
> > > > >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> > > > >>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> > > > >>                 spin_lock(&irq->irq_lock);
> > > > >>
> > > > >> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> > > > >> -                       goto next;
> > > > >> -
> > > > >> -               /*
> > > > >> -                * If we get an SGI with multiple sources, try to get
> > > > >> -                * them in all at once.
> > > > >> -                */
> > > > >> -               do {
> > > > >> +               if (likely(vgic_target_oracle(irq) == vcpu))
> > > > >>                         vgic_populate_lr(vcpu, irq, count++);
> > > > > 
> > > > > I think we need to change vgic_populate_lr to set the EOI maintenance
> > > > > interrupt flag so that when the interrupt is deactivated, if there are
> > > > > additional pending sources, we exit the guest and pick up the
> > > > > interrupt.
> > > > 
> > > > Potentially. We need to be careful about about the semantics of EOI MI
> > > > with non-level interrupts (see the other thread about EOI signalling).
> > > 
> > > I'll have a look.
> > > 
> > > > 
> > > > > An alternative would be to set the underflow interrupt, but I don't
> > > > > think that would be correct for multiple priorities, because the SGI
> > > > > could have a higher priority than other pending interrupts we put in
> > > > > the LR.
> > > > 
> > > > I don't think priorities are the issue (after all, we already sort the
> > > > irqs in order of priority). 
> > > 
> > > Yes, but assume you have three pending interrupts, one SGI from two
> > > sources, and one SPI, and assume that the SGI has priority 1 and SPI
> > > priority 2 (lower means higher priority), then I think with underflow or
> > > the no-pending interrupt flag, we'll deliver the SGI from the first
> > > source, and then the SPI, and then exiting to pick up the last SGI from
> > > the other source.  That's not how I understand the GIC architecture is
> > > supposed to work.  Am I missing something?
> > 
> > No, you do have a point here. I'm so glad this is a v2 only thing.
> > 
> > > 
> > > > My worry is that underflow is allowed to
> > > > fire if there is one interrupt pending, which implies that you could
> > > > end-up in a livelock scenario if you only have one SGI pending with
> > > > multiple sources.
> > > 
> > > Yes, doesn't work, so I think it should be a maintenance interrupt on
> > > EOI.
> 
> Thinking of it a bit more: MI on EOI doesn't offer much more guarantee
> in the way of priority ordering. Taking your example above: Even if
> you generate a MI when EOIing the SGI, there is no guarantee that
> you'll take the MI before you've acked the SPI.

There's no guarantee, but at least you're attempting at processing the
SGIs in first.  It's the best we can do, but not completely correct,
kinda thing...

> 
> If you really want to offer that absolute guarantee that all the
> multi-source SGIs of higher priority are delivered before anything
> else, then you must make sure that only the SGIs are present in the
> LRs, excluding any other interrupt on lower priority until you've
> queued them all.

Yes, that sucks!  Might not be too hard to implement, it's basically an
early out of the loop traversing the AP list, but just an annoying
complication.

> 
> At that stage, I wonder if there is a point in doing anything at
> all. The GICv2 architecture is too rubbish for words.
> 

The case we do need to worry about is the guest processing all its
interrupts and not exiting while there is actually still an SGI pending.
At that point, we can either do it with the "no interrupts pending
maintenance interrupt" or with the "EOI maintenance interrupt", and I
think the latter at least gets us slightly closer to the architecture
for a non-virtualized system.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-09 21:39               ` Christoffer Dall
  0 siblings, 0 replies; 28+ messages in thread
From: Christoffer Dall @ 2018-03-09 21:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Mar 08, 2018 at 06:39:20PM +0000, Marc Zyngier wrote:
> On Thu, 08 Mar 2018 17:04:38 +0000,
> Marc Zyngier wrote:
> > 
> > On Thu, 08 Mar 2018 16:02:42 +0000,
> > Christoffer Dall wrote:
> > > 
> > > On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> > > > On 07/03/18 23:34, Christoffer Dall wrote:
> > > > > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > > > >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> > > > >> and will happily populate LRs with the same interrupt number if
> > > > >> they come from multiple vcpus (after all, they are distinct
> > > > >> interrupt sources).
> > > > >>
> > > > >> Unfortunately, this is against the letter of the architecture,
> > > > >> and the GICv2 architecture spec says "Each valid interrupt stored
> > > > >> in the List registers must have a unique VirtualID for that
> > > > >> virtual CPU interface.". GICv3 has similar (although slightly
> > > > >> ambiguous) restrictions.
> > > > >>
> > > > >> This results in guests locking up when using GICv2-on-GICv3, for
> > > > >> example. The obvious fix is to stop trying so hard, and inject
> > > > >> a single vcpu per SGI per guest entry. After all, pending SGIs
> > > > >> with multiple source vcpus are pretty rare, and are mostly seen
> > > > >> in scenario where the physical CPUs are severely overcomitted.
> > > > >>
> > > > >> Cc: stable at vger.kernel.org
> > > > >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > > >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > > > >> ---
> > > > >>  virt/kvm/arm/vgic/vgic.c | 11 +----------
> > > > >>  1 file changed, 1 insertion(+), 10 deletions(-)
> > > > >>
> > > > >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > > > >> index c7c5ef190afa..1f7ff175f47b 100644
> > > > >> --- a/virt/kvm/arm/vgic/vgic.c
> > > > >> +++ b/virt/kvm/arm/vgic/vgic.c
> > > > >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> > > > >>         list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> > > > >>                 spin_lock(&irq->irq_lock);
> > > > >>
> > > > >> -               if (unlikely(vgic_target_oracle(irq) != vcpu))
> > > > >> -                       goto next;
> > > > >> -
> > > > >> -               /*
> > > > >> -                * If we get an SGI with multiple sources, try to get
> > > > >> -                * them in all at once.
> > > > >> -                */
> > > > >> -               do {
> > > > >> +               if (likely(vgic_target_oracle(irq) == vcpu))
> > > > >>                         vgic_populate_lr(vcpu, irq, count++);
> > > > > 
> > > > > I think we need to change vgic_populate_lr to set the EOI maintenance
> > > > > interrupt flag so that when the interrupt is deactivated, if there are
> > > > > additional pending sources, we exit the guest and pick up the
> > > > > interrupt.
> > > > 
> > > > Potentially. We need to be careful about about the semantics of EOI MI
> > > > with non-level interrupts (see the other thread about EOI signalling).
> > > 
> > > I'll have a look.
> > > 
> > > > 
> > > > > An alternative would be to set the underflow interrupt, but I don't
> > > > > think that would be correct for multiple priorities, because the SGI
> > > > > could have a higher priority than other pending interrupts we put in
> > > > > the LR.
> > > > 
> > > > I don't think priorities are the issue (after all, we already sort the
> > > > irqs in order of priority). 
> > > 
> > > Yes, but assume you have three pending interrupts, one SGI from two
> > > sources, and one SPI, and assume that the SGI has priority 1 and SPI
> > > priority 2 (lower means higher priority), then I think with underflow or
> > > the no-pending interrupt flag, we'll deliver the SGI from the first
> > > source, and then the SPI, and then exiting to pick up the last SGI from
> > > the other source.  That's not how I understand the GIC architecture is
> > > supposed to work.  Am I missing something?
> > 
> > No, you do have a point here. I'm so glad this is a v2 only thing.
> > 
> > > 
> > > > My worry is that underflow is allowed to
> > > > fire if there is one interrupt pending, which implies that you could
> > > > end-up in a livelock scenario if you only have one SGI pending with
> > > > multiple sources.
> > > 
> > > Yes, doesn't work, so I think it should be a maintenance interrupt on
> > > EOI.
> 
> Thinking of it a bit more: MI on EOI doesn't offer much more guarantee
> in the way of priority ordering. Taking your example above: Even if
> you generate a MI when EOIing the SGI, there is no guarantee that
> you'll take the MI before you've acked the SPI.

There's no guarantee, but at least you're attempting at processing the
SGIs in first.  It's the best we can do, but not completely correct,
kinda thing...

> 
> If you really want to offer that absolute guarantee that all the
> multi-source SGIs of higher priority are delivered before anything
> else, then you must make sure that only the SGIs are present in the
> LRs, excluding any other interrupt on lower priority until you've
> queued them all.

Yes, that sucks!  Might not be too hard to implement, it's basically an
early out of the loop traversing the AP list, but just an annoying
complication.

> 
> At that stage, I wonder if there is a point in doing anything at
> all. The GICv2 architecture is too rubbish for words.
> 

The case we do need to worry about is the guest processing all its
interrupts and not exiting while there is actually still an SGI pending.
At that point, we can either do it with the "no interrupts pending
maintenance interrupt" or with the "EOI maintenance interrupt", and I
think the latter at least gets us slightly closer to the architecture
for a non-virtualized system.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-09 21:39               ` Christoffer Dall
@ 2018-03-10 13:57                 ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-10 13:57 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General, Eric Auger

Hi Christoffer,

On Fri, 09 Mar 2018 21:39:31 +0000,
Christoffer Dall wrote:
> 
> On Thu, Mar 08, 2018 at 06:39:20PM +0000, Marc Zyngier wrote:
> > Thinking of it a bit more: MI on EOI doesn't offer much more guarantee
> > in the way of priority ordering. Taking your example above: Even if
> > you generate a MI when EOIing the SGI, there is no guarantee that
> > you'll take the MI before you've acked the SPI.
> 
> There's no guarantee, but at least you're attempting at processing the
> SGIs in first.  It's the best we can do, but not completely correct,
> kinda thing...
> 
> > 
> > If you really want to offer that absolute guarantee that all the
> > multi-source SGIs of higher priority are delivered before anything
> > else, then you must make sure that only the SGIs are present in the
> > LRs, excluding any other interrupt on lower priority until you've
> > queued them all.
> 
> Yes, that sucks!  Might not be too hard to implement, it's basically an
> early out of the loop traversing the AP list, but just an annoying
> complication.

Yeah, it is a bit gross. The way I implemented it is by forcing the AP
list to be sorted if there is any multi-SGI in the pipeline, and early
out as soon as we see an interrupt of a lower priority than the first
multi-SGI. That way, we only have an overhead in the case that
combines multi-SGI and lower priority interrupts.

> > At that stage, I wonder if there is a point in doing anything at
> > all. The GICv2 architecture is too rubbish for words.
> > 
> 
> The case we do need to worry about is the guest processing all its
> interrupts and not exiting while there is actually still an SGI pending.
> At that point, we can either do it with the "no interrupts pending
> maintenance interrupt" or with the "EOI maintenance interrupt", and I
> think the latter at least gets us slightly closer to the architecture
> for a non-virtualized system.

I think that this is where we disagree. I don't see anything in the
architecture that mandates that we should present the SGIs before
anything else. All we need to do is to ensure that interrupts of
higher priority are presented before anything else. It is perfectly
acceptable for an implementation to deliver SGI0, then SPI3, and SGI0
(from another CPU) after that, as long as SPI3 isn't of lesser
priority than SGI0.

Another thing I dislike about using EOI for that is forces us to
propagate the knowledge of the multi-SGI horror further down the
stack, down to both implementations of vgic_populate_lr. NPIE allows
us to keep that knowledge local. But that's an orthogonal issue, and
we can further argue/bikeshed about the respective merits of both
solutions once we have something that fits the sorry state of the
GICv2 architecture ;-).

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-10 13:57                 ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2018-03-10 13:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

On Fri, 09 Mar 2018 21:39:31 +0000,
Christoffer Dall wrote:
> 
> On Thu, Mar 08, 2018 at 06:39:20PM +0000, Marc Zyngier wrote:
> > Thinking of it a bit more: MI on EOI doesn't offer much more guarantee
> > in the way of priority ordering. Taking your example above: Even if
> > you generate a MI when EOIing the SGI, there is no guarantee that
> > you'll take the MI before you've acked the SPI.
> 
> There's no guarantee, but at least you're attempting at processing the
> SGIs in first.  It's the best we can do, but not completely correct,
> kinda thing...
> 
> > 
> > If you really want to offer that absolute guarantee that all the
> > multi-source SGIs of higher priority are delivered before anything
> > else, then you must make sure that only the SGIs are present in the
> > LRs, excluding any other interrupt on lower priority until you've
> > queued them all.
> 
> Yes, that sucks!  Might not be too hard to implement, it's basically an
> early out of the loop traversing the AP list, but just an annoying
> complication.

Yeah, it is a bit gross. The way I implemented it is by forcing the AP
list to be sorted if there is any multi-SGI in the pipeline, and early
out as soon as we see an interrupt of a lower priority than the first
multi-SGI. That way, we only have an overhead in the case that
combines multi-SGI and lower priority interrupts.

> > At that stage, I wonder if there is a point in doing anything at
> > all. The GICv2 architecture is too rubbish for words.
> > 
> 
> The case we do need to worry about is the guest processing all its
> interrupts and not exiting while there is actually still an SGI pending.
> At that point, we can either do it with the "no interrupts pending
> maintenance interrupt" or with the "EOI maintenance interrupt", and I
> think the latter at least gets us slightly closer to the architecture
> for a non-virtualized system.

I think that this is where we disagree. I don't see anything in the
architecture that mandates that we should present the SGIs before
anything else. All we need to do is to ensure that interrupts of
higher priority are presented before anything else. It is perfectly
acceptable for an implementation to deliver SGI0, then SPI3, and SGI0
(from another CPU) after that, as long as SPI3 isn't of lesser
priority than SGI0.

Another thing I dislike about using EOI for that is forces us to
propagate the knowledge of the multi-SGI horror further down the
stack, down to both implementations of vgic_populate_lr. NPIE allows
us to keep that knowledge local. But that's an orthogonal issue, and
we can further argue/bikeshed about the respective merits of both
solutions once we have something that fits the sorry state of the
GICv2 architecture ;-).

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  2018-03-10 13:57                 ` Marc Zyngier
@ 2018-03-11  1:51                   ` Christoffer Dall
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoffer Dall @ 2018-03-11  1:51 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Andre Przywara, kvmarm, KVM General, Eric Auger

On Sat, Mar 10, 2018 at 1:57 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Hi Christoffer,
>
> On Fri, 09 Mar 2018 21:39:31 +0000,
> Christoffer Dall wrote:
>>
>> On Thu, Mar 08, 2018 at 06:39:20PM +0000, Marc Zyngier wrote:
>> > Thinking of it a bit more: MI on EOI doesn't offer much more guarantee
>> > in the way of priority ordering. Taking your example above: Even if
>> > you generate a MI when EOIing the SGI, there is no guarantee that
>> > you'll take the MI before you've acked the SPI.
>>
>> There's no guarantee, but at least you're attempting at processing the
>> SGIs in first.  It's the best we can do, but not completely correct,
>> kinda thing...
>>
>> >
>> > If you really want to offer that absolute guarantee that all the
>> > multi-source SGIs of higher priority are delivered before anything
>> > else, then you must make sure that only the SGIs are present in the
>> > LRs, excluding any other interrupt on lower priority until you've
>> > queued them all.
>>
>> Yes, that sucks!  Might not be too hard to implement, it's basically an
>> early out of the loop traversing the AP list, but just an annoying
>> complication.
>
> Yeah, it is a bit gross. The way I implemented it is by forcing the AP
> list to be sorted if there is any multi-SGI in the pipeline, and early
> out as soon as we see an interrupt of a lower priority than the first
> multi-SGI. That way, we only have an overhead in the case that
> combines multi-SGI and lower priority interrupts.
>

yes, that's what I had in mind as well.

>> > At that stage, I wonder if there is a point in doing anything at
>> > all. The GICv2 architecture is too rubbish for words.
>> >
>>
>> The case we do need to worry about is the guest processing all its
>> interrupts and not exiting while there is actually still an SGI pending.
>> At that point, we can either do it with the "no interrupts pending
>> maintenance interrupt" or with the "EOI maintenance interrupt", and I
>> think the latter at least gets us slightly closer to the architecture
>> for a non-virtualized system.
>
> I think that this is where we disagree.

I don't think we disagree, I must have expressed myself poorly...

> I don't see anything in the
> architecture that mandates that we should present the SGIs before
> anything else.

Neither do I.

> All we need to do is to ensure that interrupts of
> higher priority are presented before anything else.

Agreed.

> It is perfectly
> acceptable for an implementation to deliver SGI0, then SPI3, and SGI0
> (from another CPU) after that, as long as SPI3 isn't of lesser
> priority than SGI0.

Yes, but what we cannot do is let the guest deliver SGI0, then SPI3,
and then loop forever without delivering SGI0 from another CPU.
That's why I said "the guest processing all its interrupts and not
exiting while there is actually still an SGI pending" and said that we
could use either the EOI or the NPIE trick.

>
> Another thing I dislike about using EOI for that is forces us to
> propagate the knowledge of the multi-SGI horror further down the
> stack, down to both implementations of vgic_populate_lr. NPIE allows
> us to keep that knowledge local. But that's an orthogonal issue, and
> we can further argue/bikeshed about the respective merits of both
> solutions once we have something that fits the sorry state of the
> GICv2 architecture ;-).
>

Yeah, I don't care deeply.  If NPIE is prettier in the
implementations, let's do that.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
@ 2018-03-11  1:51                   ` Christoffer Dall
  0 siblings, 0 replies; 28+ messages in thread
From: Christoffer Dall @ 2018-03-11  1:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Mar 10, 2018 at 1:57 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Hi Christoffer,
>
> On Fri, 09 Mar 2018 21:39:31 +0000,
> Christoffer Dall wrote:
>>
>> On Thu, Mar 08, 2018 at 06:39:20PM +0000, Marc Zyngier wrote:
>> > Thinking of it a bit more: MI on EOI doesn't offer much more guarantee
>> > in the way of priority ordering. Taking your example above: Even if
>> > you generate a MI when EOIing the SGI, there is no guarantee that
>> > you'll take the MI before you've acked the SPI.
>>
>> There's no guarantee, but at least you're attempting at processing the
>> SGIs in first.  It's the best we can do, but not completely correct,
>> kinda thing...
>>
>> >
>> > If you really want to offer that absolute guarantee that all the
>> > multi-source SGIs of higher priority are delivered before anything
>> > else, then you must make sure that only the SGIs are present in the
>> > LRs, excluding any other interrupt on lower priority until you've
>> > queued them all.
>>
>> Yes, that sucks!  Might not be too hard to implement, it's basically an
>> early out of the loop traversing the AP list, but just an annoying
>> complication.
>
> Yeah, it is a bit gross. The way I implemented it is by forcing the AP
> list to be sorted if there is any multi-SGI in the pipeline, and early
> out as soon as we see an interrupt of a lower priority than the first
> multi-SGI. That way, we only have an overhead in the case that
> combines multi-SGI and lower priority interrupts.
>

yes, that's what I had in mind as well.

>> > At that stage, I wonder if there is a point in doing anything at
>> > all. The GICv2 architecture is too rubbish for words.
>> >
>>
>> The case we do need to worry about is the guest processing all its
>> interrupts and not exiting while there is actually still an SGI pending.
>> At that point, we can either do it with the "no interrupts pending
>> maintenance interrupt" or with the "EOI maintenance interrupt", and I
>> think the latter at least gets us slightly closer to the architecture
>> for a non-virtualized system.
>
> I think that this is where we disagree.

I don't think we disagree, I must have expressed myself poorly...

> I don't see anything in the
> architecture that mandates that we should present the SGIs before
> anything else.

Neither do I.

> All we need to do is to ensure that interrupts of
> higher priority are presented before anything else.

Agreed.

> It is perfectly
> acceptable for an implementation to deliver SGI0, then SPI3, and SGI0
> (from another CPU) after that, as long as SPI3 isn't of lesser
> priority than SGI0.

Yes, but what we cannot do is let the guest deliver SGI0, then SPI3,
and then loop forever without delivering SGI0 from another CPU.
That's why I said "the guest processing all its interrupts and not
exiting while there is actually still an SGI pending" and said that we
could use either the EOI or the NPIE trick.

>
> Another thing I dislike about using EOI for that is forces us to
> propagate the knowledge of the multi-SGI horror further down the
> stack, down to both implementations of vgic_populate_lr. NPIE allows
> us to keep that knowledge local. But that's an orthogonal issue, and
> we can further argue/bikeshed about the respective merits of both
> solutions once we have something that fits the sorry state of the
> GICv2 architecture ;-).
>

Yeah, I don't care deeply.  If NPIE is prettier in the
implementations, let's do that.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-03-11  1:51 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-07 12:40 [PATCH 0/2] KVM: arm/arm64: GICv2-on-v3 fixes Marc Zyngier
2018-03-07 12:40 ` Marc Zyngier
2018-03-07 12:40 ` [PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid Marc Zyngier
2018-03-07 12:40   ` Marc Zyngier
2018-03-07 14:44   ` Andre Przywara
2018-03-07 14:44     ` Andre Przywara
2018-03-07 23:34   ` Christoffer Dall
2018-03-07 23:34     ` Christoffer Dall
2018-03-08 10:19     ` Marc Zyngier
2018-03-08 10:19       ` Marc Zyngier
2018-03-08 13:40       ` Marc Zyngier
2018-03-08 13:40         ` Marc Zyngier
2018-03-08 16:02       ` Christoffer Dall
2018-03-08 16:02         ` Christoffer Dall
2018-03-08 17:04         ` Marc Zyngier
2018-03-08 17:04           ` Marc Zyngier
2018-03-08 18:39           ` Marc Zyngier
2018-03-08 18:39             ` Marc Zyngier
2018-03-09 21:39             ` Christoffer Dall
2018-03-09 21:39               ` Christoffer Dall
2018-03-10 13:57               ` Marc Zyngier
2018-03-10 13:57                 ` Marc Zyngier
2018-03-11  1:51                 ` Christoffer Dall
2018-03-11  1:51                   ` Christoffer Dall
2018-03-07 12:40 ` [PATCH 2/2] kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3 Marc Zyngier
2018-03-07 12:40   ` Marc Zyngier
2018-03-07 16:06   ` Andre Przywara
2018-03-07 16:06     ` Andre Przywara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.