* Question: In a certain scenario, enabling GICv4/v4.1 may cause Guest hang when restarting the Guest
@ 2023-09-25 10:26 Kunkun Jiang
  2023-09-26  2:21 ` Kunkun Jiang
  0 siblings, 1 reply; 4+ messages in thread
From: Kunkun Jiang @ 2023-09-25 10:26 UTC (permalink / raw)
  To: Marc Zyngier, Jason Gunthorpe, Eric Auger, Alex Williamson,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Xuan Zhuo, Yishai Hadas, Shameer Kolothum, Oliver Upton,
	James Morse, Suzuki K Poulose
  Cc: kvm, Zenghui Yu, wanghaibin.wang, chenxiang66, jiangkunkun

Hi everyone,

Here is a very valuable question about the direct injection of vLPI.
Environment configuration:
1)A virtio_SCSI device is passed through to a small-scale VM, 1U
2)Guest Kernel 4.19, Host Kernel 5.10
3)Enable GICv4/v4.1
The Guest will hang in the BIOS phase when it is restarted, and
report "Synchronous Exception at 0x280004654FF40".

Here's the analysis:
The virtio_SCSI device has six queues. The virtio driver may apply for
a vector for each queue of the device. It may also apply for one vector
for all queues. These queues then share the vector.
In the problem scenario:
1.The host driver (vDPA or VFIO) applies for six vectors (LPIs) for the
device.
2.The virtio driver applies for only one vector (vLPI) in the guest.
3.Only one vgic_irq is allocated by the vgic driver. In the current vgic
  driver implementation, when MAPTI/MAPI is executed in the Guest, it is
  trapped to KVM. The vgic driver therefore allocates the same number of
  vgic_irq structures to record the vectors applied for by device drivers
  in the VM.
4.kvm_vgic_v4_set_forwarding and its_map_vlpi are executed six times.
  vgic_irq->host_irq ends up equal to the last Linux interrupt ID (virq).
  The result is that six LPIs are mapped to one vLPI. All six LPIs of the
  device can send interrupts, and these interrupts are injected into the
  guest through the same vLPI.
5.When the Guest is restarted, kvm_vgic_v4_unset_forwarding is also
  executed six times. However, multiple call traces are generated, and
  since there is only one vgic_irq, its_unmap_vlpi is executed only once.

> WARN_ON(!(irq->hw && irq->host_irq == virq));
> if (irq->hw) {
>         atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count);
>         irq->hw = false;
>         ret = its_unmap_vlpi(virq);
> }
6.In the BIOS phase after the Guest restarts, the other five vectors continue
  to send interrupts. The BIOS cannot handle these interrupts, so the Guest
  hangs.

This problem does not occur when the guest kernel is version 5.10, because
that kernel incorporates this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c66d4bd110a1f

I think there are other scenarios in which the virtual machine applies for
one vector while the host applies for multiple vectors, so there is still
value in fixing this problem at the hypervisor layer. I see two possible
modifications, but I am not sure whether they are feasible:
1)Make the vDPA or VFIO driver aware of the behavior within the Guest, so
that it applies for the same number of vectors.
2)Modify the vgic driver so that one vgic_irq can be bound to multiple LPIs.
But my understanding is that the semantics of vgic_irq->host_irq are that a
vgic_irq is bound 1:1 to the host-side LPI hwintid.

If you have other ideas, we can discuss them together.

Looking forward to your reply.
Thanks,
Kunkun Jiang


* Re: Question: In a certain scenario, enabling GICv4/v4.1 may cause Guest hang when restarting the Guest
  2023-09-25 10:26 Question: In a certain scenario, enabling GICv4/v4.1 may cause Guest hang when restarting the Guest Kunkun Jiang
@ 2023-09-26  2:21 ` Kunkun Jiang
  2023-10-07  8:02   ` Kunkun Jiang
  0 siblings, 1 reply; 4+ messages in thread
From: Kunkun Jiang @ 2023-09-26  2:21 UTC (permalink / raw)
  To: Marc Zyngier, Jason Gunthorpe, Eric Auger, Alex Williamson,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Xuan Zhuo, Yishai Hadas, Shameer Kolothum, Oliver Upton,
	James Morse, Suzuki K Poulose
  Cc: kvm, Zenghui Yu, wanghaibin.wang, chenxiang66

Hi everyone,

Sorry, yesterday's email was garbled. Please see this version.

Here is a very valuable question about the direct injection of vLPI.
Environment configuration:
1)A virtio_SCSI device is passed through to a small-scale VM, 1U
2)Guest Kernel 4.19, Host Kernel 5.10
3)Enable GICv4/v4.1
The Guest will hang in the BIOS phase when it is restarted, and
report "Synchronous Exception at 0x280004654FF40".

Here's the analysis:
The virtio_SCSI device has six queues. The virtio driver may apply for
a vector for each queue of the device. It may also apply for one vector
for all queues. These queues share the vector.
In the problem scenario:
1.The host driver (vDPA or VFIO) applies for six vectors (LPIs) for the
device.
2.The virtio driver applies for only one vector (vLPI) in the guest.
3.Only one vgic_irq is allocated by the vgic driver. In the current vgic
  driver implementation, when MAPTI/MAPI is executed in the Guest, it is
  trapped to KVM. The vgic driver therefore allocates the same number of
  vgic_irq structures to record the vectors applied for by device drivers
  in the VM.
4.kvm_vgic_v4_set_forwarding and its_map_vlpi are executed six times.
  vgic_irq->host_irq ends up equal to the last Linux interrupt ID (virq).
  The result is that six LPIs are mapped to one vLPI. All six LPIs of the
  device can send interrupts, and these interrupts are injected into the
  guest through the same vLPI.
5.When the Guest is restarted, kvm_vgic_v4_unset_forwarding is also
  executed six times. However, multiple call traces are generated, and
  since there is only one vgic_irq, its_unmap_vlpi is executed only once.

> WARN_ON(!(irq->hw && irq->host_irq == virq));
> if (irq->hw) {
>         atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count);
>         irq->hw = false;
>         ret = its_unmap_vlpi(virq);
> }

6.In the BIOS phase after the Guest restarts, the other five vectors continue
  to send interrupts. The BIOS cannot handle these interrupts, so the Guest
  hangs.

This problem does not occur when the guest kernel is version 5.10, because
that kernel incorporates this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c66d4bd110a1f

I think there are other scenarios in which the virtual machine applies for
one vector while the host applies for multiple vectors, so there is still
value in fixing this problem at the hypervisor layer. I see two possible
modifications, but I am not sure whether they are feasible:
1)Make the vDPA or VFIO driver aware of the behavior within the Guest, so
that it applies for the same number of vectors.
2)Modify the vgic driver so that one vgic_irq can be bound to multiple LPIs.
But my understanding is that the semantics of vgic_irq->host_irq are that a
vgic_irq is bound 1:1 to the host-side LPI hwintid.

If you have other ideas, we can discuss them together.

Looking forward to your reply.
Thanks,
Kunkun Jiang


* Re: Question: In a certain scenario, enabling GICv4/v4.1 may cause Guest hang when restarting the Guest
  2023-09-26  2:21 ` Kunkun Jiang
@ 2023-10-07  8:02   ` Kunkun Jiang
  2023-11-01 12:14     ` Kunkun Jiang
  0 siblings, 1 reply; 4+ messages in thread
From: Kunkun Jiang @ 2023-10-07  8:02 UTC (permalink / raw)
  To: Marc Zyngier, Jason Gunthorpe, Eric Auger, Alex Williamson,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Xuan Zhuo, Yishai Hadas, Shameer Kolothum, Oliver Upton,
	James Morse, Suzuki K Poulose
  Cc: kvm, Zenghui Yu, wanghaibin.wang, chenxiang66

Hi Marc,

Kindly ping...

I tried to fix it by decoupling "host_irq" from vgic_irq when dealing
with vLPIs, because I think the semantics of vgic_irq->host_irq are that a
vgic_irq binds 1:1 to the host-side LPI hwintid. But when I modified
the code, I found that "host_irq" is used in many places in the
vLPI-related code and is needed there...

Looking forward to your views on this question.

Thanks,
Kunkun Jiang

On 2023/9/26 10:21, Kunkun Jiang wrote:
> [...]


* Re: Question: In a certain scenario, enabling GICv4/v4.1 may cause Guest hang when restarting the Guest
  2023-10-07  8:02   ` Kunkun Jiang
@ 2023-11-01 12:14     ` Kunkun Jiang
  0 siblings, 0 replies; 4+ messages in thread
From: Kunkun Jiang @ 2023-11-01 12:14 UTC (permalink / raw)
  To: Marc Zyngier, Jason Gunthorpe, Eric Auger, Alex Williamson,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Xuan Zhuo, Yishai Hadas, Shameer Kolothum, Oliver Upton,
	James Morse, Suzuki K Poulose
  Cc: kvm, Zenghui Yu, wanghaibin.wang, chenxiang66

Hi Marc,

Kindly ping.

The current implementation of GICv4/v4.1 direct injection of vLPIs
does not support the "one shared vector for all queues" mode of
virtio-pci. Do you have any good ideas?


Looking forward to your views on this question.

Thanks,
Kunkun Jiang

On 2023/10/7 16:02, Kunkun Jiang wrote:
> [...]



Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-25 10:26 Question: In a certain scenario, enabling GICv4/v4.1 may cause Guest hang when restarting the Guest Kunkun Jiang
2023-09-26  2:21 ` Kunkun Jiang
2023-10-07  8:02   ` Kunkun Jiang
2023-11-01 12:14     ` Kunkun Jiang
