From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: Re: Losing PS/2 Interrupts Date: Tue, 24 May 2011 10:07:27 +0100 Message-ID: <4DDB916F0200007800043155@vpn.id2.novell.com> References: <3E2050B5-59DC-4E4F-9C8D-8C04A6B465EB@gmail.com> <20110520175044.GA30367@dumpdata.com> <5D477258-8216-48BD-8A93-186E044118B9@gmail.com> <4DDA366E0200007800042C71@vpn.id2.novell.com> <1D3BFCDD-9D53-48BA-9ECD-D009AD535C2B@gmail.com> <04C6DFB0-08C8-4A8B-968F-FFE712BCABA1@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Stefano Stabellini Cc: Simon Graham , xen-devel@lists.xensource.com, Thomas Goetz , Konrad Rzeszutek Rzeszutek Wilk List-Id: xen-devel@lists.xenproject.org >>> On 23.05.11 at 19:28, Thomas Goetz wrote: > On May 23, 2011, at 1:16 PM, Thomas Goetz wrote: >=20 >>=20 >> On May 23, 2011, at 9:45 AM, Stefano Stabellini wrote: >>=20 >>> On Mon, 23 May 2011, Thomas Goetz wrote: >>>> My assumption is that at the point that the i8042 driver reads the = data=20 > register a new interrupt happens. There is gap in >>>> time between when the data register is read and when the event = channel=20 > pending state is cleared. Since the hypervisor >>>> ACKed the previous real interrupt before delivering it to the guest, = there=20 > is nothing to stop the i8042 device from >>>> interrupting immediately after the data register is read. If it = interrupt=20 > before the event channel pending state is >>>> cleared, then it will not be delivered to the guest and the EOI = mechanism=20 > will be set up, but I haven't found anything in >>>> that that will set up a delayed delivery of the second interrupt. >>>>=20 >>>> In this situation the i8042 device has every reason to believe the = second=20 > interrupt will be delivered. The previous >>>> interrupt was received and handled. Nothing is masked. >>>>=20 >>>> Am I missing something? >>>=20 >>>=20 >>> I am assuming you have the latest version of my fixes to >>> drivers/xen/events.c >>=20 >> I'll have a version ported from your 2.6.39 tree to my 2.6.38 tree. = I'll=20 > update my copy of your tree and make sure it's up to date. >>=20 >>>=20 >>> The problem you are describing shouldn't happen because the interrupt >>> handler returned by request_irq to i8042 is handle_edge_irq that calls >>> chip->irq_ack() before handle_irq_event(). >>=20 >> I checked on which method it is using and it's using handle_fasteoi_irq.= In=20 > fatc all of the IRQs under 16 are despite most being edge. Log snippet = below.=20 > I'm looking into why pirq_needs_eoi is returning the wrong answer now. >=20 >=20 > pirq_needs_eoi checks info->u.pirq.flags & PIRQ_NEEDS_EOI. PIRQ_NEEDS_EOI= is=20 > only set by pirq_query_unmask which sets it based on the hypercall=20 > PHYSDEVOP_irq_status_query which in Xen 4.0.1 and Xen unstable always = returns=20 > an EOI is needed. Stefano, I don't see any changes in your 2.6.39 tree = that=20 > would effect this. >=20 > Relevant code snippets included below: >=20 > if (pirq_needs_eoi(irq)) { > printk(KERN_ERR "%s: irq %d handle_fasteoi_irq\n",=20 > __FUNCTION__, irq); > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > handle_fasteoi_irq, name); > } else { > printk(KERN_ERR "%s: irq %d handle_edge_irq\n",=20 > __FUNCTION__, irq); > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > handle_edge_irq, name); > } Now this, imo, is a very good reason to not use handle_edge_irq() at all, and instead use the prior control flow (masking and clearing the event channel up front in do_upcall()) with only fasteoi (leaving aside per-CPU ones). Jan > static bool pirq_needs_eoi(unsigned irq) > { > struct irq_info *info =3D info_for_irq(irq); >=20 > BUG_ON(info->type !=3D IRQT_PIRQ); >=20 > return info->u.pirq.flags & PIRQ_NEEDS_EOI; > } >=20 > static void pirq_query_unmask(int irq) > { > struct physdev_irq_status_query irq_status; > struct irq_info *info =3D info_for_irq(irq); >=20 > BUG_ON(info->type !=3D IRQT_PIRQ); >=20 > irq_status.irq =3D pirq_from_irq(irq); > if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status= )) > irq_status.flags =3D 0; >=20 > printk(KERN_ERR "%s: irq %d needs eoi %d\n", __FUNCTION__, = irq,=20 > (irq_status.flags & XENIRQSTAT_needs_eoi) =3D=3D XENIRQSTAT_needs_eoi); >=20 > info->u.pirq.flags &=3D ~PIRQ_NEEDS_EOI; > if (irq_status.flags & XENIRQSTAT_needs_eoi) > info->u.pirq.flags |=3D PIRQ_NEEDS_EOI; > } >=20 > case PHYSDEVOP_irq_status_query: { > struct physdev_irq_status_query irq_status_query; > ret =3D -EFAULT; > if ( copy_from_guest(&irq_status_query, arg, 1) !=3D 0 ) > break; > irq =3D irq_status_query.irq; > ret =3D -EINVAL; > if ( (irq < 0) || (irq >=3D v->domain->nr_pirqs) ) > break; > irq_status_query.flags =3D 0; > /* > * Even edge-triggered or message-based IRQs can need masking = from > * time to time. If teh guest is not dynamically checking for = this > * via the new pirq_eoi_map mechanism, it must conservatively = always > * execute the EOI hypercall. In practice, this only really = makes a > * difference for maskable MSI sources, and if those are = supported > * then dom0 is probably modern anyway. > */ > irq_status_query.flags |=3D XENIRQSTAT_needs_eoi; > if ( pirq_shared(v->domain, irq) ) > irq_status_query.flags |=3D XENIRQSTAT_shared; > ret =3D copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0; > break; > } >=20 >=20 > --- > Tom Goetz > tcgoetz@gmail.com=20