From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: radeon in dom0/ivtv in domU: irq 16 nobody cared
Date: Tue, 13 Apr 2010 09:18:04 -0400
Message-ID: <20100413131804.GB16475@phenom.dumpdata.com>
References: <20100408001916.GA10840@phenom.dumpdata.com> <20100408173700.GB26343@phenom.dumpdata.com> <4BBE2451.7090600@goop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4BBE2451.7090600@goop.org>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jeremy Fitzhardinge
Cc: xen-devel@lists.xensource.com, Mark Hurenkamp
List-Id: xen-devel@lists.xenproject.org

On Thu, Apr 08, 2010 at 11:45:37AM -0700, Jeremy Fitzhardinge wrote:
> On 04/08/2010 10:37 AM, Konrad Rzeszutek Wilk wrote:
> >> Yes,
> >>
> >> Please e-mail your full serial log output, your cat /proc/interrupts,
> >> and 'lspci -vvv' output. This is to say, for both Dom0 and DomU.
> >>
> > I think I am able to reproduce this with one device (in DomU) that shares the IRQ
> > (17) with another device that is in Dom0. In Dom0 I get:
> >
>
> For the "nobody cared" message to trigger, then there must either have
> been no interrupt handlers at all, or they all returned IRQ_NONE.
>
> So in theory, if irq 17 has an active driver on it, then its irq handler
> should see the interrupt, poke the device, go "huh, nothing for me to
> do, must be a spurious interrupt from something else sharing the irq",
> and I guess return IRQ_NONE.
>
> So what stops this? If the irq isn't being shared with anything in
> dom0, we should be careful not even map the interrupt into dom0 (though
> I suspect we only ever map, never unmap, interrupts).
>
> But if the interrupt is being shared, I think we need a proxy interrupt
> handler installed by pciback (pcistub?)to absorb apparently spurious
> interrupts, which always returns IRQ_HANDLED (and perhaps have some of
> its own screaming interrupt logic in case something has gone awry)?

I've gone ahead and made an attempt at this, but it isn't completely
finished. The code is in the pv/pciback-2.6.32 branch. To make it work, the
'fake_irq_handler' parameter has to be set to 1.

>
> Or if not that, what? How has this problem been avoided before?

In 2.6.18 there was logic to return IRQ_HANDLED if the IRQ line was shared
with another guest. Basically this:

    int irq_ignore_unhandled(unsigned int irq)
    {
        struct physdev_irq_status_query irq_status = { .irq = irq };

        if (!is_running_on_xen())
            return 0;

        if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
            return 0;
        return !!(irq_status.flags & XENIRQSTAT_shared);
    }

This would be called on any spurious interrupt and would short-circuit
the "nobody cared" handling.

I tried something similar by setting up the fake IRQ handler if this
hypercall returned a positive value. But the call sequence in any device
driver is that it first does the PCI configuration writes (enable the
device, etc.) and then calls request_irq, which binds the interrupt to the
event channel, and only then does the above hypercall return the shared
flag. But pciback/pcifront isn't used for request_irq, so I need to figure
out some mechanism to schedule this hypercall later on in Dom0 to find out
whether there is a need to insert the IRQ handler.

Anyhow, my test rig (a Dell Dimension something), which has a couple of IRQ
lines shared across various devices, is doing something wacky with or
without this patch: the interrupt lines on the IOAPIC get masked (and only
if a specific IRQ line, 17, gets shared) and no interrupts get sent to
either Dom0 or DomU.
Manually unmasking the IOAPIC starts the flow of interrupts, though it
becomes a storm. Not sure if it is just faulty hardware or operator error,
so please consider the above code/branch completely untested.
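
To make the proxy-handler idea concrete, here is a small userspace model
of a shared line. It is only a sketch and not the code in the branch:
the names dom0_driver_handler, proxy_handler, struct irq_action and
dispatch_unhandled are made up for illustration, and irqreturn_t is
redefined locally so the file compiles outside the kernel. It shows that
a line is treated as unhandled only when every handler returns IRQ_NONE,
and that one handler which always returns IRQ_HANDLED is enough to avoid
that.

```c
#include <stddef.h>

/* Local stand-ins; the names mirror the kernel's, but this is userspace. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

/* Hypothetical per-line entry: one handler plus its private cookie. */
struct irq_action {
    irq_handler_t handler;
    void *dev_id;
};

/* Dom0 driver whose own device did not raise the interrupt. */
irqreturn_t dom0_driver_handler(int irq, void *dev_id)
{
    (void)irq;
    (void)dev_id;
    return IRQ_NONE;
}

/*
 * Hypothetical proxy for the device passed through to the guest. Dom0
 * cannot tell whether that device fired, so the proxy always claims the
 * interrupt and counts how many it absorbed.
 */
irqreturn_t proxy_handler(int irq, void *dev_id)
{
    int *absorbed = dev_id;

    (void)irq;
    if (absorbed)
        (*absorbed)++;
    return IRQ_HANDLED;
}

/*
 * Call every handler registered on the line. Returns 1 when none of them
 * claimed the interrupt (the "nobody cared" condition), 0 otherwise.
 */
int dispatch_unhandled(int irq, const struct irq_action *actions, size_t n)
{
    int claimed = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (actions[i].handler(irq, actions[i].dev_id) == IRQ_HANDLED)
            claimed = 1;
    }
    return !claimed;
}
```

With only dom0_driver_handler on line 17, dispatch_unhandled() reports
the interrupt as unhandled; adding proxy_handler to the same line makes
it report handled. The absorbed counter is where some screaming-interrupt
check could hook in, as Jeremy suggested.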