From: Stefano Stabellini
Subject: Re: Re: Still struggling with HVM: tx timeouts on emulated nics
Date: Mon, 3 Oct 2011 18:24:53 +0100
References: <4E7B4768.8060103@canonical.com>
 <4E85883C.7030808@canonical.com>
 <4E85E8E8.2020702@canonical.com>
 <4E860382.7040108@canonical.com>
In-Reply-To: <4E860382.7040108@canonical.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
To: Stefan Bader
Cc: "xen-devel@lists.xensource.com", Stefano Stabellini
List-Id: xen-devel@lists.xenproject.org

On Fri, 30 Sep 2011, Stefan Bader wrote:
> Also I did not completely remove the section that would return the status
> without setting needsEOI. I just changed the if condition to be <0 instead
> of <=0 (I knew from the tests that the mapping was always 0, and maybe the
> <0 check could be useful for something).
>
>     irq_status_query.flags = 0;
>     if ( is_hvm_domain(v->domain) &&
>          domain_pirq_to_irq(v->domain, irq) < 0 )
>     {
>         ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
>         break;
>     }

You need to remove the entire test, because we want to receive notifications
in all cases.

> With that, a quick test shows both the re-sends being done sometimes and
> the domU doing EOIs. And there is no stall apparent. I did the same quick
> test with the e1000 emulated NIC and that still seems ok. Those were not
> very thorough tests, but at least I would have observed a stall pretty
> quickly otherwise.

I am glad it fixes the problem for you.

I am going to send a different patch upstream for Xen 4.2, because I would
also like it to cover the very unlikely scenario in which a PV guest (like
dom0, or a PV guest with PCI passthrough) is losing level interrupts because,
when Xen tries to set the corresponding event channel pending, the bit is
already set. The codebase is different enough that making the same change on
4.1 is non-trivial.

I am appending the new patch to this email; it would be great if you could
test it. You just need a 4.2 hypervisor, not the entire system, so you should
be able to perform the test by updating only xen.gz. If you have trouble with
xen-unstable.hg tip, try changeset 23843.
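Roughly, the bookkeeping the patch introduces works like the sketch below
(illustrative pseudo-code only: pirq_state, notify_guest, assert_level_irq
and guest_eoi are made-up names; the real logic lives in send_guest_pirq(),
assert_gsi() and the PHYSDEVOP_eoi handler in the patch):

struct pirq_state {
    int evtchn_pending;      /* pending bit of the bound event channel */
    unsigned int lost;       /* level assertions that found the bit already set */
};

/* Returns nonzero if no new notification could be generated because the
 * pending bit was already set. */
static int notify_guest(struct pirq_state *p)
{
    if ( p->evtchn_pending )
        return 1;
    p->evtchn_pending = 1;
    return 0;
}

/* The emulated (or passed-through) device asserts its level-triggered line. */
static void assert_level_irq(struct pirq_state *p)
{
    if ( notify_guest(p) )
        p->lost++;           /* remember it: the line is still asserted */
}

/* The guest EOIs the pirq (PHYSDEVOP_eoi). */
static void guest_eoi(struct pirq_state *p)
{
    p->evtchn_pending = 0;   /* the previous notification has been handled */
    if ( p->lost > 0 && !notify_guest(p) )
        p->lost--;           /* re-deliver one of the missed assertions */
}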
---

diff -r bf533533046c xen/arch/x86/hvm/irq.c
--- a/xen/arch/x86/hvm/irq.c	Fri Sep 30 14:12:35 2011 +0000
+++ b/xen/arch/x86/hvm/irq.c	Mon Oct 03 16:54:51 2011 +0000
@@ -36,7 +36,8 @@ static void assert_gsi(struct domain *d,
 
     if ( hvm_domain_use_pirq(d, pirq) )
     {
-        send_guest_pirq(d, pirq);
+        if ( send_guest_pirq(d, pirq) && ioapic_gsi >= NR_ISAIRQS )
+            pirq->lost++;
         return;
     }
     vioapic_irq_positive_edge(d, ioapic_gsi);
@@ -63,6 +64,7 @@ static void __hvm_pci_intx_assert(
 {
     struct hvm_irq *hvm_irq = &d->arch.hvm_domain.irq;
     unsigned int gsi, link, isa_irq;
+    struct pirq *pirq;
 
     ASSERT((device <= 31) && (intx <= 3));
 
@@ -72,6 +74,11 @@ static void __hvm_pci_intx_assert(
     gsi = hvm_pci_intx_gsi(device, intx);
     if ( hvm_irq->gsi_assert_count[gsi]++ == 0 )
         assert_gsi(d, gsi);
+    else {
+        pirq = pirq_info(d, domain_emuirq_to_pirq(d, gsi));
+        if ( hvm_domain_use_pirq(d, pirq) )
+            pirq->lost++;
+    }
 
     link = hvm_pci_intx_link(device, intx);
     isa_irq = hvm_irq->pci_link.route[link];
diff -r bf533533046c xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c	Fri Sep 30 14:12:35 2011 +0000
+++ b/xen/arch/x86/irq.c	Mon Oct 03 16:54:51 2011 +0000
@@ -965,7 +965,11 @@ static void __do_IRQ_guest(int irq)
              !test_and_set_bool(pirq->masked) )
             action->in_flight++;
         if ( !hvm_do_IRQ_dpci(d, pirq) )
-            send_guest_pirq(d, pirq);
+        {
+            if ( send_guest_pirq(d, pirq) &&
+                 action->ack_type == ACKTYPE_EOI )
+                pirq->lost++;
+        }
     }
 
     if ( action->ack_type != ACKTYPE_NONE )
diff -r bf533533046c xen/arch/x86/physdev.c
--- a/xen/arch/x86/physdev.c	Fri Sep 30 14:12:35 2011 +0000
+++ b/xen/arch/x86/physdev.c	Mon Oct 03 16:54:51 2011 +0000
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -270,6 +271,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
         if ( !is_hvm_domain(v->domain) ||
              domain_pirq_to_irq(v->domain, eoi.irq) > 0 )
             pirq_guest_eoi(pirq);
+        if ( pirq->lost > 0) {
+            if ( !send_guest_pirq(v->domain, pirq) )
+                pirq->lost--;
+        }
         spin_unlock(&v->domain->event_lock);
         ret = 0;
         break;
@@ -328,9 +333,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
             break;
         irq_status_query.flags = 0;
         if ( is_hvm_domain(v->domain) &&
-             domain_pirq_to_irq(v->domain, irq) <= 0 )
+             domain_pirq_to_irq(v->domain, irq) <= 0 &&
+             domain_pirq_to_emuirq(v->domain, irq) == IRQ_UNBOUND )
         {
-            ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
+            ret = -EINVAL;
             break;
         }
 
diff -r bf533533046c xen/include/xen/irq.h
--- a/xen/include/xen/irq.h	Fri Sep 30 14:12:35 2011 +0000
+++ b/xen/include/xen/irq.h	Mon Oct 03 16:54:51 2011 +0000
@@ -146,6 +146,7 @@ struct pirq {
     int pirq;
     u16 evtchn;
     bool_t masked;
+    u32 lost;
     struct rcu_head rcu_head;
     struct arch_pirq arch;
 };
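For reference, the guest side is expected to exercise this path along the
following lines (an illustrative fragment, not taken from the Linux pvhvm
code; it only assumes the standard physdev interface structures and the
usual hypercall wrapper):

#include <linux/types.h>
#include <xen/interface/physdev.h>   /* physdev_irq_status_query, physdev_eoi */
#include <asm/xen/hypercall.h>       /* HYPERVISOR_physdev_op() */

static int pirq_needs_eoi_flag;

/* At bind time: with the patch, the status query no longer short-circuits
 * for emulated pirqs, so the guest learns that it must EOI them. */
static void query_pirq(uint32_t pirq)
{
    struct physdev_irq_status_query q = { .irq = pirq };

    if ( HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &q) == 0 )
        pirq_needs_eoi_flag = !!(q.flags & XENIRQSTAT_needs_eoi);
}

/* After servicing the device: the EOI is what gives Xen the chance to
 * re-send an assertion it had to account in pirq->lost. */
static void eoi_pirq(uint32_t pirq)
{
    struct physdev_eoi eoi = { .irq = pirq };

    if ( pirq_needs_eoi_flag )
        HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi);
}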