linux-kernel.vger.kernel.org archive mirror
* Re: Unexpected interrupt received in Guest OS when booting after "system_reset"
       [not found] <e8ffbc4e-f7b7-14a1-7614-a3db85c9152f@huawei.com>
@ 2019-03-28 17:18 ` Marc Zyngier
  2019-03-29  1:19   ` Heyi Guo
  0 siblings, 1 reply; 5+ messages in thread
From: Marc Zyngier @ 2019-03-28 17:18 UTC (permalink / raw)
  To: Heyi Guo, Christoffer Dall
  Cc: linux-arm-kernel, kvmarm, linux-kernel, wanghaibin 00208455

[Please do not send HTML emails]

On 28/03/2019 15:44, Heyi Guo wrote:
> Hi Marc and Christoffer,
> 
> When we issue "system_reset" from qemu monitor to a running VM, guest
> Linux will occasionally get "Unexpected interrupt" after rebooting, with
> kernel message at the bottom.
> 
> After some investigation, we found it might be caused by virtual LPIs
> being preserved across system reset: it seems a virtual LPI remains in
> the ap_list during VM reset, along with its "enabled" and
> "pending_latch" state, and this causes the virtual LPI to be injected
> wrongly after the VCPU reboots and enables interrupts.
> 
> We propose to clear the "enabled" flag of a virtual LPI when PROPBASER
> (or GICR_CTLR) of the virtual GICR is written to 0, and to update the
> virtual LPI properties when GICR_CTLR.EnableLPIs is set to 1 again.
> 
> Any advice? Or did we miss something?

We're clearly missing a trick here, but I'm not convinced of your 
approach. What should happen is that the redistributors should be reset 
as well, and that this should recall any LPI that has been made pending. 
Unfortunately, we don't seem to have such code in place, which is 
embarrassing.

Can you give the following, untested patch a go? It isn't right either, 
but it should have the right effect. If you confirm that it solves your 
problem, we can look at adding the right hooks...

Thanks,

	M.

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index ab3f47745d9c..bd9a9250f323 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -2403,8 +2403,32 @@ static int vgic_its_commit_v0(struct vgic_its *its)
 	return 0;
 }
 
+static void vgic_nuke_pending_lpis(struct kvm_vcpu *vcpu)
+{
+	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	struct vgic_irq *irq, *tmp;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&vcpu->arch.vgic_cpu.ap_list_lock, flags);
+
+	list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head, ap_list) {
+		if (irq->intid >= VGIC_MIN_LPI) {
+			list_del(&irq->ap_list);
+			vgic_put_irq(vcpu->kvm, irq);
+		}
+	}
+
+	raw_spin_unlock_irqrestore(&vcpu->arch.vgic_cpu.ap_list_lock, flags);
+}
+
 static void vgic_its_reset(struct kvm *kvm, struct vgic_its *its)
 {
+	struct kvm_vcpu *vcpu;
+	int c;
+
+	kvm_for_each_vcpu(c, vcpu, kvm)
+		vgic_nuke_pending_lpis(vcpu);
+
 	/* We need to keep the ABI specific field values */
 	its->baser_coll_table &= ~GITS_BASER_VALID;
 	its->baser_device_table &= ~GITS_BASER_VALID;

-- 
Jazz is not dead. It just smells funny...
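
For readers following along outside the kernel tree, the traversal in
vgic_nuke_pending_lpis() can be sketched in plain user-space C. This is
only an illustration under simplifying assumptions: struct fake_irq, the
fake_* helpers and the refcount field are hypothetical stand-ins (they
are not kernel APIs), and no locking is modelled. It shows why the
successor is saved before a node is unlinked, which is the job
list_for_each_entry_safe() does in the patch above.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_LPI 8192	/* LPI INTIDs start at 8192 on GICv3 */

/* Hypothetical stand-in for struct vgic_irq: one node of a per-vCPU list. */
struct fake_irq {
	unsigned int intid;
	int refcount;
	struct fake_irq *prev, *next;
};

/* Circular list with a sentinel head, similar in spirit to list_head. */
static void fake_list_init(struct fake_irq *head)
{
	head->prev = head->next = head;
}

static void fake_list_add_tail(struct fake_irq *head, struct fake_irq *irq)
{
	irq->prev = head->prev;
	irq->next = head;
	head->prev->next = irq;
	head->prev = irq;
}

static void fake_list_del(struct fake_irq *irq)
{
	irq->prev->next = irq->next;
	irq->next->prev = irq->prev;
	irq->prev = irq->next = NULL;
}

/*
 * Unlink every LPI from the list. The successor (tmp) is read before the
 * current node is unlinked, so deleting the node cannot break the walk.
 * Returns the number of entries removed.
 */
static int fake_nuke_pending_lpis(struct fake_irq *head)
{
	struct fake_irq *irq, *tmp;
	int removed = 0;

	for (irq = head->next, tmp = irq->next; irq != head;
	     irq = tmp, tmp = irq->next) {
		if (irq->intid >= MIN_LPI) {
			fake_list_del(irq);
			irq->refcount--;	/* models dropping the list's reference */
			removed++;
		}
	}
	return removed;
}

/* Queue two non-LPIs and two LPIs, "reset", and count what is left. */
int fake_reset_demo(void)
{
	struct fake_irq head;
	struct fake_irq irqs[4] = {
		{ .intid = 1,    .refcount = 2 },
		{ .intid = 8192, .refcount = 2 },
		{ .intid = 40,   .refcount = 2 },
		{ .intid = 8200, .refcount = 2 },
	};
	int left = 0;

	fake_list_init(&head);
	for (size_t i = 0; i < 4; i++)
		fake_list_add_tail(&head, &irqs[i]);

	fake_nuke_pending_lpis(&head);

	for (struct fake_irq *irq = head.next; irq != &head; irq = irq->next)
		left++;
	return left;
}
```

Calling fake_reset_demo() leaves only the two non-LPI entries on the
list. The real function additionally has to hold ap_list_lock, which the
sketch leaves out.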

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Unexpected interrupt received in Guest OS when booting after "system_reset"
  2019-03-28 17:18 ` Unexpected interrupt received in Guest OS when booting after "system_reset" Marc Zyngier
@ 2019-03-29  1:19   ` Heyi Guo
  2019-03-29  9:19     ` Heyi Guo
  0 siblings, 1 reply; 5+ messages in thread
From: Heyi Guo @ 2019-03-29  1:19 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, kvmarm, linux-kernel, wanghaibin 00208455



On 2019/3/29 1:18, Marc Zyngier wrote:
> [Please do not send HTML emails]
Sorry; will keep in mind next time :)
>
> On 28/03/2019 15:44, Heyi Guo wrote:
>> Hi Marc and Christoffer,
>>
>> When we issue "system_reset" from qemu monitor to a running VM, guest
>> Linux will occasionally get "Unexpected interrupt" after rebooting, with
>> kernel message at the bottom.
>>
>> After some investigation, we found it might be caused by virtual LPIs
>> being preserved across system reset: it seems a virtual LPI remains in
>> the ap_list during VM reset, along with its "enabled" and
>> "pending_latch" state, and this causes the virtual LPI to be injected
>> wrongly after the VCPU reboots and enables interrupts.
>>
>> We propose to clear the "enabled" flag of a virtual LPI when PROPBASER
>> (or GICR_CTLR) of the virtual GICR is written to 0, and to update the
>> virtual LPI properties when GICR_CTLR.EnableLPIs is set to 1 again.
>>
>> Any advice? Or did we miss something?
> We're clearly missing a trick here, but I'm not convinced of your
> approach.
To be honest, we were not fully convinced ourselves either. I was worried that a guest switching GICR_CTLR or GICR_PROPBASER at runtime would probably cause issues for our rough approach.

> What should happen is that the redistributors should be reset
> as well, and that this should recall any LPI that has been made pending.
> Unfortunately, we don't seem to have such code in place, which is
> embarrassing.
>
> Can you give the following, untested patch a go? It isn't right either,
> but it should have the right effect. If you confirm that it solves your
> problem, we can look at adding the right hooks...
Thanks, I'll test this and get back to you.
Heyi

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Unexpected interrupt received in Guest OS when booting after "system_reset"
  2019-03-29  1:19   ` Heyi Guo
@ 2019-03-29  9:19     ` Heyi Guo
  2019-03-29 10:54       ` Marc Zyngier
  0 siblings, 1 reply; 5+ messages in thread
From: Heyi Guo @ 2019-03-29  9:19 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, kvmarm, linux-kernel, wanghaibin 00208455

Hi Marc,

The patch works. I tested for 1.5 hours and 52 VM resets. In 16 of them a virtual LPI was left in the ap_list during reset (seen via an additional printk), and we never saw "Unexpected interrupt received" again.

Just a minor comment: how about replacing /vcpu->arch.vgic_cpu./ with /vgic_cpu->/ in the lock/unlock lines, to make them shorter?

Thanks,

Heyi

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Unexpected interrupt received in Guest OS when booting after "system_reset"
  2019-03-29  9:19     ` Heyi Guo
@ 2019-03-29 10:54       ` Marc Zyngier
  2019-03-30  0:55         ` Heyi Guo
  0 siblings, 1 reply; 5+ messages in thread
From: Marc Zyngier @ 2019-03-29 10:54 UTC (permalink / raw)
  To: Heyi Guo, Christoffer Dall
  Cc: linux-arm-kernel, kvmarm, linux-kernel, wanghaibin 00208455

On 29/03/2019 09:19, Heyi Guo wrote:
> Hi Marc,
> 
> The patch works. I tested for 1.5 hours and 52 VM resets. In 16 of
> them a virtual LPI was left in the ap_list during reset (seen via an
> additional printk), and we never saw "Unexpected interrupt received"
> again.


Thanks for testing, much appreciated.

> Just a minor comment: how about replacing /vcpu->arch.vgic_cpu./ with
> /vgic_cpu->/ in the lock/unlock lines, to make them shorter?

Well, as I said, the patch is wrong in other ways, so I wouldn't bother
with that. It only serves as a test for my theory.

I think I'm slowly warming up to your initial proposal to hook things
into the PROPBASER/PENDBASER registers, as the LPIs do have a life
outside of the ITS itself.

I'll try to respin something next week.

Thanks,

	M.

-- 
Jazz is not dead. It just smells funny...
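
As a rough illustration of what hooking the cleanup into the
redistributor registers, rather than into the ITS reset, could look
like, here is a small user-space model. It is only a sketch and not the
patch that was promised above: struct fake_redist and the fake_*
functions are hypothetical names, the pending list is a plain array, and
the trigger used here is the EnableLPIs bit of GICR_CTLR (bit 0) going
from 1 to 0, one of the options raised in the original report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTLR_ENABLE_LPIS	(1u << 0)	/* EnableLPIs is bit 0 of GICR_CTLR */
#define MIN_LPI			8192		/* LPI INTIDs start at 8192 */
#define MAX_PENDING		8

/* Hypothetical per-vCPU redistributor state, greatly simplified. */
struct fake_redist {
	bool lpis_enabled;
	uint32_t pending[MAX_PENDING];	/* INTIDs currently queued */
	int nr_pending;
};

/* Drop every queued LPI, keeping SGIs/PPIs/SPIs in their original order. */
static void fake_flush_pending_lpis(struct fake_redist *rd)
{
	int kept = 0;

	for (int i = 0; i < rd->nr_pending; i++) {
		if (rd->pending[i] < MIN_LPI)
			rd->pending[kept++] = rd->pending[i];
	}
	rd->nr_pending = kept;
}

/*
 * Model of a guest write to the control register: when LPIs go from
 * enabled to disabled, anything LPI-shaped that is still queued is
 * retired so it cannot be delivered after the guest reboots.
 */
void fake_write_ctlr(struct fake_redist *rd, uint32_t val)
{
	bool was_enabled = rd->lpis_enabled;

	rd->lpis_enabled = (val & CTLR_ENABLE_LPIS) != 0;

	if (was_enabled && !rd->lpis_enabled)
		fake_flush_pending_lpis(rd);
}

/* Two LPIs and two non-LPIs are queued, then the guest disables LPIs. */
int fake_ctlr_demo(void)
{
	struct fake_redist rd = {
		.lpis_enabled = true,
		.pending = { 27, 8192, 8193, 35 },
		.nr_pending = 4,
	};

	fake_write_ctlr(&rd, 0);
	return rd.nr_pending;
}
```

In this model fake_ctlr_demo() ends with only the two non-LPI entries
queued. Tying the cleanup to the redistributor matches the remark that
LPIs have a life outside of the ITS: the stale entry goes away even if
the ITS itself is never reset.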

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Unexpected interrupt received in Guest OS when booting after "system_reset"
  2019-03-29 10:54       ` Marc Zyngier
@ 2019-03-30  0:55         ` Heyi Guo
  0 siblings, 0 replies; 5+ messages in thread
From: Heyi Guo @ 2019-03-30  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, kvmarm, linux-kernel, wanghaibin 00208455



On 2019/3/29 18:54, Marc Zyngier wrote:
> On 29/03/2019 09:19, Heyi Guo wrote:
>> Hi Marc,
>>
>> The patch works. I tested for 1.5 hours and 52 VM resets. In 16 of
>> them a virtual LPI was left in the ap_list during reset (seen via an
>> additional printk), and we never saw "Unexpected interrupt received"
>> again.
>
> Thanks for testing, much appreciated.
>
>> Just a minor comment: how about replacing /vcpu->arch.vgic_cpu./ with
>> /vgic_cpu->/ in the lock/unlock lines, to make them shorter?
> Well, as I said, the patch is wrong in other ways, so I wouldn't bother
> with that. It only serves as a test for my theory.
Sure, I hadn't caught the last sentence of your previous mail...
>
> I think I'm slowly warming up to your initial proposal to hook things
> into the PROPBASER/PENDBASER registers, as the LPIs do have a life
> outside of the ITS itself.
>
> I'll try to respin something next week.
Thanks,

Heyi

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-03-30  0:56 UTC | newest]
