From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: RFC: ioapic polarity vs. qemu os-x guest
Date: Tue, 11 Feb 2014 21:54:44 +0200
Message-ID: <20140211195444.GB10951@redhat.com>
References: <20140130204423.GK29329@ERROL.INI.CMU.EDU>
 <20140211182330.GC29329@ERROL.INI.CMU.EDU>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, eddie.dong@intel.com, agraf@suse.de
To: "Gabriel L. Somlo"
Received: from mx1.redhat.com ([209.132.183.28]:8628 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1753904AbaBKTtt (ORCPT ); Tue, 11 Feb 2014 14:49:49 -0500
Content-Disposition: inline
In-Reply-To: <20140211182330.GC29329@ERROL.INI.CMU.EDU>
Sender: kvm-owner@vger.kernel.org

On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
> Hi,
>
> I'm trying to get OS X to work as a QEMU guest, and one of the few
> remaining "mysteries" I need to solve is that the OS X guest hangs
> during boot, waiting for its boot disk to be available, unless the
> following KVM patch is applied:
>
>
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index ce9ed99..1539d37 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>          irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq],
>                                           irq_source_id, level);
>          entry = ioapic->redirtbl[irq];
> -        irq_level ^= entry.fields.polarity;
>          if (!irq_level) {
>                  ioapic->irr &= ~mask;
>                  ret = 1;
> --
>
>
> After digging around the KVM source for a bit, and printk-ing things
> from Windows 7, Fedora 20, and OS X (10.9), I figured out the following:
>
>
> 1. Edge-triggered interrupts are invariably unaffected by the xor line
> being removed by the patch. On all three guest types, edge-triggered
> interrupts have polarity set to 0, so the xor is essentially a no-op,
> and we can forget about it altogether.
>
>
> 2. Windows and Linux always configure all level-triggered interrupts
> with polarity 0 (active-high, consistent with QEMU's ACPI/DSDT, in
> particular q35-acpi-dsdt.dsl, which is what I'm using with -M q35).
> As such, on Windows and Linux, the xor line in question is still a
> no-op.
>
>
> 3. OS X (all versions I tried, at least since 10.5/Leopard) always
> configures all level-triggered interrupts with polarity 1 (active-low),
> regardless of what the QEMU DSDT says. As such, the xor line acts as
> a negation of "irq_level", which at first glance sounds reasonable.
>
> However: when KVM negates "irq_level" due to "polarity == 1", the OS X
> guest hangs during boot.
>
> OS X works fine when "polarity == 1" is ignored (with the xor line
> commented out).
>
> This may be another instance (similar to how OS X didn't use to check
> with CPUID regarding monitor/mwait instruction availability) where
> apple devs know that any of their supported hardware advertises
> active-low in the DSDT, so no need to check, just hardcode that
> assumption... :)
>
>
> 4. With s/ActiveHigh/ActiveLow/ in QEMU's q35-acpi-dsdt.dsl, Linux
> actually switches to "polarity == 1" (active-low), and works fine
> *with the xor line removed* !!!. With the xor line left intact (i.e.
> without the above patch), the active-low fedora guest worked extremely
> poorly, and printed out multiple error messages during boot:
>
>         irq XX: nobody cared (try booting with the "irqpoll" option)
>         ...
>         Disabling IRQ #XX
>
> for XX in [16, 18, 19, ...].
>
>
> So, right now, I'm wondering about the following:
>
>
> 1. Regarding KVM and the polarity xor line in the patch above: Does
> anyone have experience with any *other* guests which insist on setting
> level-triggered interrupt polarity to 1/active-low ? Is that xor line
> actually doing anything useful in practice, for any other guest, on
> either QEMU or any other platform ?
>
>
> 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
> has a hardcoded assumption re. "polarity == 0", or active-high, for
> level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
> and a bunch of other files, but couldn't isolate anything that I could
> "flip" to fix things in userspace.
>
>
> Any ideas or suggestions about the appropriate way to move forward would
> be much appreciated !!!
>
>
> Thanks much,
> --Gabriel

I think changing ACPI is the right thing to do really.
But we'll need to fix some things first of course.

I think it's PC Q35 that has this assumption.
hw/i386/pc_q35.c

        gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
                                 GSI_NUM_PINS);

kvm_pc_gsi_handler simply forwards interrupts to kvm.

and
hw/isa/lpc_ich9.c

static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
{
    int i, pic_level;

    /* The pic level is the logical OR of all the PCI irqs mapped to it */
    pic_level = 0;
    for (i = 0; i < ICH9_LPC_NB_PIRQS; i++) {
        int tmp_irq;
        int tmp_dis;
        ich9_lpc_pic_irq(lpc, i, &tmp_irq, &tmp_dis);
        if (!tmp_dis && pic_irq == tmp_irq) {
            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
        }
    }

so somewhere we need to flip it, I am guessing in ich9
along the lines of:

-    pic_level = 0;
-            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
+    pic_level = 1;
+            pic_level &= !pci_bus_get_irq_level(lpc->d.bus, i);

--
MST
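
To make the polarity discussion concrete, here is a minimal standalone sketch in C of the two pieces involved: the xor on the IOAPIC side and the OR versus AND aggregation on the ich9 side. It is only a toy model under stated assumptions, not QEMU or KVM code. The names ioapic_asserted, pin_active_high, pin_active_low, polarity_self_check and the NB_PIRQS value are all made up for illustration; only the xor and the two aggregation rules are taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ICH9_LPC_NB_PIRQS; the value is illustrative. */
#define NB_PIRQS 8

/* Models the xor in kvm_ioapic_set_irq(): with polarity 1 (active-low)
 * the pin level is inverted before it is treated as "asserted". */
static int ioapic_asserted(int pin_level, int polarity)
{
    return (pin_level ^ polarity) & 1;
}

/* Active-high wiring, as in the quoted ich9_lpc_update_pic() loop:
 * the pin is the OR of all sources and idles at 0. */
static int pin_active_high(const int *src, size_t n)
{
    int level = 0;
    for (size_t i = 0; i < n; i++) {
        level |= (src[i] != 0);
    }
    return level;
}

/* Active-low wiring, following the proposed flip: the pin idles at 1
 * and any asserting source pulls it to 0. */
static int pin_active_low(const int *src, size_t n)
{
    int level = 1;
    for (size_t i = 0; i < n; i++) {
        level &= !src[i];
    }
    return level;
}

/* Walks through the three combinations discussed in the thread. */
static void polarity_self_check(void)
{
    const int idle[NB_PIRQS] = {0};        /* no source asserting */
    const int one[NB_PIRQS]  = {0, 0, 1};  /* a single source asserting */

    /* Active-high wiring with polarity 0: asserted only when a source is. */
    assert(ioapic_asserted(pin_active_high(idle, NB_PIRQS), 0) == 0);
    assert(ioapic_asserted(pin_active_high(one, NB_PIRQS), 0) == 1);

    /* Active-low wiring with polarity 1: same logical result. */
    assert(ioapic_asserted(pin_active_low(idle, NB_PIRQS), 1) == 0);
    assert(ioapic_asserted(pin_active_low(one, NB_PIRQS), 1) == 1);

    /* Mismatch: active-high wiring read with polarity 1. The idle line
     * looks asserted and a real assertion looks idle. */
    assert(ioapic_asserted(pin_active_high(idle, NB_PIRQS), 1) == 1);
    assert(ioapic_asserted(pin_active_high(one, NB_PIRQS), 1) == 0);
}
```

Calling polarity_self_check() from a main() exercises all three cases. In this model the mismatched case reports an idle line as asserted, which would be consistent with the "nobody cared" messages reported for the active-low Fedora guest when the xor is left in place. The real cause may involve more than this toy captures.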