From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36158)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1auxsp-0002c1-PW
    for qemu-devel@nongnu.org; Tue, 26 Apr 2016 03:58:13 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from )
    id 1auxsm-0005IM-F2
    for qemu-devel@nongnu.org; Tue, 26 Apr 2016 03:58:11 -0400
Received: from mout.web.de ([212.227.17.12]:63141)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1auxsm-0005HQ-5f
    for qemu-devel@nongnu.org; Tue, 26 Apr 2016 03:58:08 -0400
References: <1461055122-32378-1-git-send-email-peterx@redhat.com>
    <571DA823.1030003@web.de>
    <20160425071806.GF3261@pxdev.xzpeter.org>
    <571DC61C.9020006@web.de>
    <20160426073426.GD28545@pxdev.xzpeter.org>
From: Jan Kiszka
Message-ID: <571F1F7F.5050604@web.de>
Date: Tue, 26 Apr 2016 09:57:51 +0200
MIME-Version: 1.0
In-Reply-To: <20160426073426.GD28545@pxdev.xzpeter.org>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Peter Xu
Cc: qemu-devel@nongnu.org, imammedo@redhat.com, rth@twiddle.net,
    ehabkost@redhat.com, jasowang@redhat.com, marcel@redhat.com,
    mst@redhat.com, pbonzini@redhat.com, rkrcmar@redhat.com,
    alex.williamson@redhat.com, wexu@redhat.com

On 2016-04-26 09:34, Peter Xu wrote:
> On Mon, Apr 25, 2016 at 09:24:12AM +0200, Jan Kiszka wrote:
>> On 2016-04-25 09:18, Peter Xu wrote:
>>> On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
>>>> On 2016-04-19 10:38, Peter Xu wrote:
>>>
>>> [...]
>>>
>>>>> By default, IR is disabled to be better compatible with current
>>>>> QEMU.
>>>>> To enable IR, we can use the following command to boot an
>>>>> IR-supported VM with a virtio-net device with vhost (kvm-ioapic is
>>>>> still not supported, so we need to specify
>>>>> kernel-irqchip={split|off} here):
>>>>>
>>>>> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
>>>>
>>>> "intr" sounds a bit too much like "interrupt", not "interrupt
>>>> remapping". Why not use the kernel's form, "intremap"?
>>>
>>> Sure. It sounds nice to be aligned with the kernel one. Let me take
>>> it in v5.
>>>
>>>>
>>>>>     -enable-kvm -m 1024 \
>>>>>     -netdev tap,id=net0,vhost=on \
>>>>>     -device virtio-net-pci,netdev=net0 \
>>>>>     -monitor telnet::3333,server,nowait \
>>>>>     /var/lib/libvirt/images/vm1.qcow2
>>>>>
>>>>> When the guest boots, we can verify whether IR is enabled by
>>>>> grepping the dmesg like:
>>>>>
>>>>> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base 0xfed90000 IOMMU 0
>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
>>>>>
>>>>> Currently supported devices:
>>>>>
>>>>> - Emulated/Split irqchip
>>>>> - Generic PCI Devices
>>>>> - vhost devices
>>>>> - pass through device support? Not tested, but suppose it should work.
>>>>
>>>> I've tested this series against my Jailhouse setup, and it works pretty
>>>> well! I'm actually considering moving my test setup over to this branch.
>>>
>>> This is really encouraging feedback! Btw, thanks for all kinds of
>>> help on this patchset. :-)
>>>
>>>>
>>>> However, split irqchip still has some issues: When I boot a q35 machine
>>>> with Linux, the e1000 network adapter only gets a single IRQ delivered.
>>>> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
>>>> didn't debug this in detail yet.
>>>
>>> I reproduced this problem. It seems that it fails even with
>>> kernel-irqchip=off. Will try to dig it out.
>>
>> Very good. Hope it can be easily fixed.
>
> Hi, Jan,
>
> The above issue should be caused by missing EOIs for level-triggered
> interrupts. Before that, I was always using edge-triggered
> interrupts for testing, so I didn't encounter this one. Would you
> please help try the patch below? It can be applied directly onto the
> series, and should solve the issue (it works on my test vm, and I'll
> take it in v5 as well if it also works for you):
>

Works here as well. I even got EIM working with a hack, though
Jailhouse spits out strange warnings even though it works fine (x2apic
mode, split irqchip).

> -------------------------
>
> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> index b41ab89..de6a8cf 100644
> --- a/hw/intc/ioapic.c
> +++ b/hw/intc/ioapic.c
> @@ -281,6 +281,36 @@ ioapic_mem_read(void *opaque, hwaddr addr, unsigned int size)
>      return val;
>  }
>
> +/*
> + * This is to satisfy the hack in Linux kernel. One hack of it is to
> + * simulate clearing the Remote IRR bit of IOAPIC entry using the
> + * following:
> + *
> + * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
> + * Otherwise, we simulate the EOI message manually by changing the trigger
> + * mode to edge and then back to level, with RTE being masked during
> + * this."
> + *
> + * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
> + *
> + * This is based on the assumption that, Remote IRR bit will be
> + * cleared by IOAPIC hardware for edge-triggered interrupts (I
> + * believe that's what the IOAPIC version 0x1X hardware does). So
> + * if we are emulating it, we'd better do it the same here, so that
> + * the guest kernel hack will work as well on QEMU.
> + *
> + * Without this, level-triggered interrupts in IR mode might fail to
> + * work correctly.
> + */
> +static inline void
> +ioapic_fix_edge_remote_irr(uint64_t *entry)
> +{
> +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
> +        /* Level triggered interrupts, make sure remote IRR is zero */
> +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> +    }
> +}
> +
>  static void
>  ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                   unsigned int size)
> @@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                  s->ioredtbl[index] &= ~0xffffffffULL;
>                  s->ioredtbl[index] |= val;
>              }
> +            ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);
>              ioapic_service(s);
>          }
>      }
>
> ------------------------
>
> I am still looking into the guest-side code. Although the above patch
> should solve the issue, there are still issues in the guest code when
> IR is enabled:
>
> - mismatched "vector" in IOAPIC entry and IRTE entry (a match is
>   required by VT-d spec 5.1.5.1, and required to correctly deliver
>   EOI broadcast I guess). See intel_irq_remapping_prepare_irte():
>
>     ...
>     /*
>      * IO-APIC RTE will be configured with virtual vector.
>      * irq handler will do the explicit EOI to the io-apic.
>      */
>     entry->vector = info->ioapic_pin;
>     ...
>
> - I encountered that level-triggered entries in the IOAPIC are marked
>   as edge-triggered interrupts in the APIC (which is strange)... This
>   will also affect correct delivery of EOI broadcast. I still need
>   time to figure out why.
>
> If EOI broadcast can work, the e1000 issue would be solved as
> well even without the above patch.
>
> [...]

I don't remember details in this area, but maybe it's worth looking at
how my hacks dealt with these cases (or made Linux not create such weird
configurations).

Jan
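
The "mask, switch to edge, switch back to level" sequence quoted in the
patch comment can be modeled in a few lines of standalone C. This is
only a sketch under assumptions, not the code from the series: the
RTE_* names and all three functions are hypothetical, the bit positions
assume the usual IOAPIC redirection table layout (bit 14 Remote IRR,
bit 15 trigger mode, bit 16 mask), Remote IRR is assumed to be read-only
for software writes, and the device is assumed to drop Remote IRR
whenever an entry is edge-triggered, as the patch comment believes real
hardware does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; assumed redirection table entry bit layout. */
#define RTE_REMOTE_IRR   (1ULL << 14)  /* set while a level IRQ awaits EOI */
#define RTE_TRIGGER_MODE (1ULL << 15)  /* 0 = edge, 1 = level */
#define RTE_MASKED       (1ULL << 16)  /* 1 = interrupt masked */

/* Device side: apply a guest write to the current entry. */
static uint64_t rte_write(uint64_t current, uint64_t written)
{
    /* Assumption: Remote IRR is read-only, so keep the device's copy. */
    uint64_t next = (written & ~RTE_REMOTE_IRR) | (current & RTE_REMOTE_IRR);

    /* Assumption: an edge-triggered entry never keeps Remote IRR set. */
    if (!(next & RTE_TRIGGER_MODE)) {
        next &= ~RTE_REMOTE_IRR;
    }
    return next;
}

/* Guest side: simulate an EOI without an IOAPIC EOI register. */
static uint64_t simulated_eoi(uint64_t entry)
{
    /* Step 1: write a masked, edge-triggered copy of the entry. */
    uint64_t masked_edge = (entry | RTE_MASKED) & ~RTE_TRIGGER_MODE;
    uint64_t after_first = rte_write(entry, masked_edge);

    /* Step 2: write the original level-triggered entry back. */
    return rte_write(after_first, entry);
}

/* Small self-check of the round trip on a level entry with vector 0x31. */
static void rte_selfcheck(void)
{
    uint64_t pending = RTE_TRIGGER_MODE | RTE_REMOTE_IRR | 0x31;

    assert(simulated_eoi(pending) == (RTE_TRIGGER_MODE | 0x31));
}
```

In this model an ordinary rewrite of a level-triggered entry leaves
Remote IRR untouched, and only the detour through edge mode clears it,
which is the behavior the guest-side workaround depends on.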