linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Auger Eric <eric.auger@redhat.com>
To: Marc Zyngier <maz@kernel.org>,
	Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	linux-pci@vger.kernel.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Nianyao Tang <tangnianyao@huawei.com>,
	Bjorn Helgaas <bhelgaas@google.com>
Subject: Re: Question on guest enable msi fail when using GICv4/4.1
Date: Sun, 9 May 2021 19:00:04 +0200	[thread overview]
Message-ID: <d6481eee-4318-1a56-a5a4-daf467070d22@redhat.com> (raw)
In-Reply-To: <878s4qq00u.wl-maz@kernel.org>

Hi,
On 5/7/21 1:02 PM, Marc Zyngier wrote:
> On Fri, 07 May 2021 10:58:23 +0100,
> Shaokun Zhang <zhangshaokun@hisilicon.com> wrote:
>>
>> Hi Marc,
>>
>> Thanks for your quick reply.
>>
>> On 2021/5/7 17:03, Marc Zyngier wrote:
>>> On Fri, 07 May 2021 06:57:04 +0100,
>>> Shaokun Zhang <zhangshaokun@hisilicon.com> wrote:
>>>>
>>>> [This letter comes from Nianyao Tang]
>>>>
>>>> Hi,
>>>>
>>>> Using GICv4/4.1 and msi capability, guest vf driver requires 3
>>>> vectors and enable msi, will lead to guest stuck.
>>>
>>> Stuck how?
>>
>> Guest serial does not response anymore and guest network shutdown.
>>
>>>
>>>> Qemu gets number of interrupts from Multiple Message Capable field
>>>> set by guest. This field is aligned to a power of 2(if a function
>>>> requires 3 vectors, it initializes it to 2).
>>>
>>> So I guess this is a MultiMSI device with 4 vectors, right?
>>>
>>
>> Yes, it can support maximum of 32 msi interrupts, and vf driver only use 3 msi.
>>
>>>> However, guest driver just sends 3 mapi-cmd to vits and 3 ite
>>>> entries is recorded in host.  Vfio initializes msi interrupts using
>>>> the number of interrupts 4 provide by qemu.  When it comes to the
>>>> 4th msi without ite in vits, in irq_bypass_register_producer,
>>>> producer and consumer will __connect fail, due to find_ite fail, and
>>>> do not resume guest.
>>>
>>> Let me rephrase this to check that I understand it:
>>> - The device has 4 vectors
>>> - The guest only create mappings for 3 of them
>>> - VFIO calls kvm_vgic_v4_set_forwarding() for each vector
>>> - KVM doesn't have a mapping for the 4th vector and returns an error
>>> - VFIO disable this 4th vector
>>>
>>> Is that correct? If yes, I don't understand why that impacts the guest
>>> at all. From what I can see, vfio_msi_set_vector_signal() just prints
>>> a message on the console and carries on.
>>>
>>
>> function calls:
>> --> vfio_msi_set_vector_signal
>>    --> irq_bypass_register_producer
>>       -->__connect
>>
>> in __connect, add_producer finally calls kvm_vgic_v4_set_forwarding
>> and fails to get the 4th mapping. When add_producer fail, it does
>> not call cons->start, calls kvm_arch_irq_bypass_start and then
>> kvm_arm_resume_guest.
> 
> [+Eric, who wrote the irq_bypass infrastructure.]
> 
> Ah, so the guest is actually paused, not in a livelock situation
> (which is how I interpreted "stuck").
> 
> I think we should handle this case gracefully, as there should be no
> expectation that the guest will be using this interrupt. Given that
> VFIO seems to be pretty unfazed when a producer fails, I'm temped to
> do the same thing and restart the guest.
> 
> Also, __disconnect doesn't care about errors, so why should __connect
> have this odd behaviour?

_disconnect() does not care as we should always succeed tearing off
things. del_* ops are void functions. On the opposite we can fail
setting up the bypass.

Effectively
a979a6aa009f ("irqbypass: do not start cons/prod when failed connect")
needs to be reverted.

I agree the kerneldoc comments in linux/irqbypass.h may be improved to
better explain the role of stop/start cbs and warn about their potential
global impact.

wrt the case above, "in __connect, add_producer finally calls
kvm_vgic_v4_set_forwarding and fails to get the 4th mapping", shouldn't
we succeed in that case?

Thanks

Eric

> 
> Can you please try this? It is completely untested (and I think the
> del_consumer call is odd, which is why I've also dropped it).
> 
> Eric, what do you think?
> 
> Thanks,
> 
> 	M.
> 
> diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c
> index c9bb3957f58a..7e1865e15668 100644
> --- a/virt/lib/irqbypass.c
> +++ b/virt/lib/irqbypass.c
> @@ -40,21 +40,14 @@ static int __connect(struct irq_bypass_producer *prod,
>  	if (prod->add_consumer)
>  		ret = prod->add_consumer(prod, cons);
>  
> -	if (ret)
> -		goto err_add_consumer;
> -
> -	ret = cons->add_producer(cons, prod);
> -	if (ret)
> -		goto err_add_producer;
> +	if (!ret)
> +		ret = cons->add_producer(cons, prod);
>  
>  	if (cons->start)
>  		cons->start(cons);
>  	if (prod->start)
>  		prod->start(prod);
> -err_add_producer:
> -	if (prod->del_consumer)
> -		prod->del_consumer(prod, cons);
> -err_add_consumer:
> +
>  	return ret;
>  }
>  
> 


  parent reply	other threads:[~2021-05-09 17:00 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-07  5:57 Question on guest enable msi fail when using GICv4/4.1 Shaokun Zhang
2021-05-07  9:03 ` Marc Zyngier
2021-05-07  9:58   ` Shaokun Zhang
2021-05-07 11:02     ` Marc Zyngier
     [not found]       ` <874kfepht4.wl-maz@kernel.org>
2021-05-08  1:51         ` Jason Wang
2021-05-08  6:56           ` Zhu, Lingshan
2021-05-08  9:15           ` Marc Zyngier
2021-05-09 17:00       ` Auger Eric [this message]
2021-05-10  7:49         ` Marc Zyngier
2021-05-10  8:29           ` Auger Eric
2021-05-10  9:59             ` Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d6481eee-4318-1a56-a5a4-daf467070d22@redhat.com \
    --to=eric.auger@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=cohuck@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=tangnianyao@huawei.com \
    --cc=zhangshaokun@hisilicon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).