* PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Len Brown @ 2009-01-09 23:03 UTC
To: Stefan Assmann, Ingo Molnar, Bjorn Helgaas, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt
Cc: Linux Kernel Mailing List, linux-acpi

Stefan,
I had to exclude your changes to drivers/acpi/pci_irq.c from
e1d3a90846b40ad3160bf4b648d36c6badad39ac
in order to get some other changes to that file upstream in the
2.6.29 merge window.

I left the other parts of the quirk intact - so at the moment
on one of the quirked machines, you'll see

PCI quirk: reroute interrupts for...

but will not see

pci irq %d -> rerouted to legacy

as the quirk is effectively disabled.

I had difficulty trying to port this patch to the new pci_irq.c
because fundamentally I don't understand what it is trying
to do, and why.

The quirk is specific to Intel chipsets, so with all the
Linux guys now working at Intel, I'm hopeful that we can
reach a clear understanding of the issue and a consensus
on the proper fix.

BTW. I'm not excited about how the original patch
drops a chipset specific workaround inside the ACPI code
to go behind the mirrors and lie about what ACPI returns.
I'm hopeful that a better place for the workaround
can be found if this is the approach we need to take..

Can you help us understand what the failure is?

thanks,
-Len
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Stefan Assmann @ 2009-01-12 11:09 UTC
To: Len Brown
Cc: Ingo Molnar, Bjorn Helgaas, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich

Hi Len,

Len Brown wrote:
> Stefan,
> I had to exclude your changes to drivers/acpi/pci_irq.c from
> e1d3a90846b40ad3160bf4b648d36c6badad39ac
> in order to get some other changes to that file upstream in the
> 2.6.29 merge window.
>
> I left the other parts of the quirk intact - so at the moment
> on one of the quirked machines, you'll see
>
> PCI quirk: reroute interrupts for...
>
> but will not see
>
> pci irq %d -> rerouted to legacy
>
> as the quirk is effectively disabled.
>
> I had difficulty trying to port this patch to the new pci_irq.c
> because fundamentally I don't understand what it is trying
> to do, and why.

Let me try to give you a short overview of what's happening there.

If an IRQ arrives at line X of a non-primary IO-APIC and that line is
masked, a new IRQ will be generated on the primary IO-APIC/PIC. This is
called a "Boot Interrupt" by Intel. Its purpose is, as the name
suggests, to ensure that the IRQ is handled at boot time (when the
non-primary IO-APIC is still disabled).

Conditions to be met for "Boot Interrupts":
- line X on the non-primary IO-APIC is masked
- line X is asserted

This behavior is not necessary during normal operation, as the IRQ is
handled by the non-primary IO-APIC itself. Now imagine what happens if
these Boot Interrupts occur during normal operation. You'd see
spurious IRQs on your primary IO-APIC which, in the worst case, will
bring down the interrupt line they occur on! Every device that shares
this interrupt line will fail when this happens.

Why can these IRQ lines be brought down by Boot Interrupts? Because
there's no handler installed on the primary IO-APIC IRQ line that can
take care of them, and after too many unhandled IRQs the line will be
shut down by the kernel.

What this quirk does:
It installs the interrupt handler on the primary IO-APIC's interrupt
line instead of the (original) non-primary IO-APIC's interrupt line,
keeping the original interrupt line masked. This guarantees that for
every IRQ arriving at the non-primary IO-APIC a Boot Interrupt is
generated _and_ handled properly.

Note: You need this quirk if you mask your interrupts during handling.

> The quirk is specific to Intel chipsets, so with all the
> Linux guys now working at Intel, I'm hopeful that we can
> reach a clear understanding of the issue and a consensus
> on the proper fix.
>
> BTW. I'm not excited about how the original patch
> drops a chipset specific workaround inside the ACPI code
> to go behind the mirrors and lie about what ACPI returns.
> I'm hopeful that a better place for the workaround
> can be found if this is the approach we need to take..

Yes, I agree with you and I'm totally open to discussing a better place
for this quirk if you find one. You see, the problem here is to "move"
the interrupt handler from one line to another, and so far we haven't
found a better way to do this. If I'm not mistaken, this is technically
pretty similar to what the new "derive an IRQ for this device from a
parent bridge" code does.

>
> Can you help us understand what the failure is?

I hope this explains a little bit what this is all about. In case
you're looking for more in-depth information about the interrupt
routing, I'd suggest having a look at, for example, the Intel® 6700PXH
64-bit PCI Hub Datasheet, especially chapter 2.15 I/OxAPIC Interrupt
Controller. Feel free to ask any further questions that might arise.

I'm taking a look at the new pci_irq.c code now. Could you tell me
where exactly you're seeing trouble with the quirk? It shouldn't be too
troublesome to put it in again, but let me have a look, and thanks for
your efforts.

> thanks,
> -Len
>

Stefan

--
Stefan Assmann         | SUSE LINUX Products GmbH
Software Engineer      | Maxfeldstr. 5, D-90409 Nuernberg
Mail: sassmann@suse.de | GF: Markus Rex, HRB 16746 (AG Nuernberg)
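[Editorial sketch] The two conditions for a boot interrupt and the
quirk's choice of interrupt line, as described in Stefan's message, can
be written down as a tiny userspace model in C. This is only an
illustration under stated assumptions: struct ioapic_pin, route_irq()
and handler_irq() are hypothetical names invented here, not kernel
interfaces, and real hardware has more states than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one input pin on a non-primary IO-APIC. */
struct ioapic_pin {
    bool masked;    /* mask bit set in the pin's redirection entry */
    bool asserted;  /* the device is currently driving the line */
};

/* Where an interrupt on that pin ends up. */
enum irq_route {
    ROUTE_NONE,           /* line idle, nothing is delivered */
    ROUTE_IOAPIC,         /* delivered by the non-primary IO-APIC itself */
    ROUTE_BOOT_INTERRUPT  /* forwarded to the primary IO-APIC/PIC */
};

/* A boot interrupt needs both conditions: pin masked AND line asserted. */
static enum irq_route route_irq(const struct ioapic_pin *pin)
{
    assert(pin != NULL);
    if (!pin->asserted)
        return ROUTE_NONE;
    return pin->masked ? ROUTE_BOOT_INTERRUPT : ROUTE_IOAPIC;
}

/*
 * The quirk's decision (assumed shape): with the quirk active, the
 * handler is installed on the legacy line where the boot interrupt
 * arrives; otherwise it stays on the non-primary IO-APIC line.
 */
static int handler_irq(bool quirk_active, int ioapic_irq, int legacy_irq)
{
    return quirk_active ? legacy_irq : ioapic_irq;
}
```

In this model a masked, asserted pin always yields
ROUTE_BOOT_INTERRUPT, which is why a handler on the legacy line sees
every interrupt as long as the original entry stays masked.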
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Ingo Molnar @ 2009-01-12 11:37 UTC
To: Stefan Assmann
Cc: Len Brown, Bjorn Helgaas, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich

* Stefan Assmann <sassmann@suse.de> wrote:

> Why can these IRQ lines be brought down by Boot Interrupts? Because
> there's no handler installed on the primary IO-APIC IRQ line that can
> take care of them and after too many unhandled IRQs the line will be
> shut down by the kernel.

The failure mode can be quite nasty: ranging from non-working USB (and
other) devices to hard lockups due to screaming IRQs.

	Ingo
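[Editorial sketch] The "line will be shut down after too many unhandled
IRQs" behavior quoted above can be modeled in a few lines of C. This is
a simplified userspace model, not the kernel's code: struct irq_line
and note_irq() are hypothetical names, and the window of 100000
interrupts with a limit of 99900 unhandled ones is an assumption
recalled from the kernel's spurious-IRQ accounting rather than
something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IRQ_WINDOW      100000u  /* assumed: interrupts per accounting window */
#define UNHANDLED_LIMIT  99900u  /* assumed: unhandled count that kills the line */

/* Hypothetical per-line bookkeeping. */
struct irq_line {
    unsigned int count;      /* interrupts seen in the current window */
    unsigned int unhandled;  /* of those, how many no handler claimed */
    bool disabled;           /* line has been shut down */
};

/* Account for one interrupt; 'handled' says whether any handler claimed it. */
static void note_irq(struct irq_line *line, bool handled)
{
    assert(line != NULL);
    if (line->disabled)
        return;
    if (!handled)
        line->unhandled++;
    if (++line->count < IRQ_WINDOW)
        return;
    /* End of window: shut the line down if almost nothing was handled. */
    if (line->unhandled > UNHANDLED_LIMIT)
        line->disabled = true;
    line->count = 0;
    line->unhandled = 0;
}
```

With no handler on the legacy line, every boot interrupt counts as
unhandled, so the line is disabled at the end of the first window and
every device sharing it stops receiving interrupts.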
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Bjorn Helgaas @ 2009-01-12 18:51 UTC
To: Stefan Assmann
Cc: Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki, Jon Masters

(I added Eric, Maciej, and Jon because they participated in
previous discussion here: http://lkml.org/lkml/2008/6/2/269)

On Monday 12 January 2009 04:09:25 am Stefan Assmann wrote:
> Len Brown wrote:
> > Stefan,
> > I had to exclude your changes to drivers/acpi/pci_irq.c from
> > e1d3a90846b40ad3160bf4b648d36c6badad39ac
> > in order to get some other changes to that file upstream in the
> > 2.6.29 merge window.
...
>
> Let me try to give you a short overview of what's happening there.
>
> If an IRQ arrives at line X of a non-primary IO-APIC and that line is
> masked a new IRQ will be generated on the primary IO-APIC/PIC. This is
> called a "Boot Interrupt" by Intel. Its purpose is, as the name
> suggests, to ensure that the IRQ is handled at boot time (when the
> non-primary IO-APIC is still disabled).
>
> Condition to be met for "Boot Interrupts":
> - line X on non-primary IO-APIC interrupt line is masked
> - line X is asserted

Thanks. Let me replay this to see whether I understand. Please
correct any of my misapprehensions :-)

Since your patch doesn't look at the IOxAPIC RDL register to see
whether the pin is masked, you must be assuming that Linux is
*always* using boot interrupts, and never unmasking the non-primary
IOxAPIC entry.

It looks like for each PCI bus, the 6700PXH contains a 24-input
IOxAPIC with 16 of the inputs available for PCI interrupts. The
boot interrupt feature generates only INTA-INTD messages (total of
four choices). I think that means your patch forces sharing, e.g.
pins 0, 4, 8, and 12 all generate INTA on PCI Express, so they will
share the same IRQ. If the non-primary IOxAPIC entries were unmasked,
the boot interrupt would not be generated, so those pins could all
have separate IRQs.

The 6700PXH boot interrupt behavior doesn't seem to be configurable,
so it should work the same even with "acpi=off", so I think we would
have a similar issue in the pirq_enable_irq() path.

I was about to ask why you have devices generating interrupts while
their APIC entry is masked, but the discussion starting with Eric's
response in this thread: http://lkml.org/lkml/2008/6/2/269 suggests
that the RT kernel masks APIC entries in the course of normal
operation, while the device can still be generating interrupts.

And these boot interrupts would certainly be a surprising side-effect
of masking an APIC entry, even in a non-RT kernel.

Basically, the interrupt routing changes depending on whether the
APIC entry is masked. That seems pretty ugly, and I can't think of
any nice way to describe that behavior via ACPI.

It seems sub-optimal to add a constraint that we can't ever mask or
unmask that APIC entry. What if we installed an extra ISR for the
boot interrupt that just invoked the ISR for each device on each of
the APIC pins that can generate that boot interrupt? I.e., for the
6700PXH case, install an ISR for INTA, and have that ISR call the
ISRs for all the devices on pins 0, 4, 8, and 12.

Then the normal case would be that the APIC entry is unmasked, we
don't share the IRQ, and the ISR is called directly. But when the
APIC entry is masked, we take the boot interrupt and pay the penalty
of calling some extra ISRs.

Bjorn

> This behavior is not necessary during normal operation as the IRQ is
> handled by the non-primary IO-APIC itself. Now imagine what happens if
> these Boot Interrupts would occur during normal operation. You'd see
> spurious IRQs on your primary IO-APIC which, in the worst case, will
> bring down the interrupt line they occur on! Every device that shares this
> interrupt line will fail when this happens.
>
> Why can these IRQ lines be brought down by Boot Interrupts? Because
> there's no handler installed on the primary IO-APIC IRQ line that can
> take care of them and after too many unhandled IRQs the line will be shut
> down by the kernel.
>
> What this quirk does:
> It installs the interrupt handler on the primary IO-APIC's interrupt line
> instead of the (original) non-primary IO-APIC's interrupt line, keeping
> the original interrupt line masked. This guarantees that for every IRQ
> arriving at the non-primary IO-APIC a Boot Interrupt is generated _and_
> handled properly.
>
> Note: You need this quirk if you mask your interrupts during handling.
>
> > The quirk is specific to Intel chipsets, so with all the
> > Linux guys now working at Intel, I'm hopeful that we can
> > reach a clear understanding of the issue and a consensus
> > on the proper fix.
> >
> > BTW. I'm not excited about how the original patch
> > drops a chipset specific workaround inside the ACPI code
> > to go behind the mirrors and lie about what ACPI returns.
> > I'm hopeful that a better place for the workaround
> > can be found if this is the approach we need to take..
>
> Yes, I agree with you and I'm totally open for discussing a better place
> for this quirk if you find any. You see the problem here is to "move" the
> interrupt handler from one line to another and so far we haven't found a
> better way to do this. If I'm not mistaken this technically is pretty
> similar to what this new "derive an IRQ for this device from a parent
> bridge" does.
>
> > Can you help us understand what the failure is?
>
> I hope this explains a little bit what this is all about. In case you're
> looking for more in-depth information about the interrupt routing I'd
> suggest to have a look for example at the Intel® 6700PXH 64-bit PCI Hub
> Datasheet, especially chapter 2.15 I/OxAPIC Interrupt Controller. Feel
> free to ask any further questions that might arise.
>
> Taking a look at the new pci_irq.c code now, could you tell me where
> exactly you're seeing trouble with the quirk. It shouldn't be too
> troublesome to put it in again, but let me have a look and thanks for
> your efforts.
>
> > thanks,
> > -Len
> >
> Stefan
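[Editorial sketch] Bjorn's proposal, an extra ISR on the boot interrupt
that calls the ISRs of every device whose pin can raise that boot
interrupt, can be illustrated with a small userspace model in C. The
pin-to-INTx mapping (pin modulo 4, so pins 0, 4, 8 and 12 share INTA)
follows his reading of the 6700PXH datasheet in the message above and
should be treated as an assumption. All names here (pin_table,
register_pin_isr(), boot_irq_demux(), counting_isr()) are hypothetical
and not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PINS 16  /* IOxAPIC inputs available for PCI interrupts */
#define NR_INTX  4  /* INTA..INTD */

/* A device ISR returns 1 if its device raised the interrupt, else 0. */
typedef int (*isr_fn)(void *dev);

struct pin_handler {
    isr_fn isr;
    void *dev;
};

static struct pin_handler pin_table[NR_PINS];

/* Remember which ISR serves the device wired to a given IOxAPIC pin. */
static void register_pin_isr(unsigned int pin, isr_fn isr, void *dev)
{
    assert(pin < NR_PINS);
    pin_table[pin].isr = isr;
    pin_table[pin].dev = dev;
}

/*
 * Extra ISR for one boot interrupt (0 = INTA .. 3 = INTD): call the ISR
 * of every pin that maps to it, i.e. intx, intx + 4, intx + 8, intx + 12.
 * Returns 1 if any device claimed the interrupt.
 */
static int boot_irq_demux(unsigned int intx)
{
    int handled = 0;

    assert(intx < NR_INTX);
    for (unsigned int pin = intx; pin < NR_PINS; pin += NR_INTX) {
        if (pin_table[pin].isr != NULL)
            handled |= pin_table[pin].isr(pin_table[pin].dev);
    }
    return handled;
}

/* Example device ISR: counts its invocations and always claims the IRQ. */
static int counting_isr(void *dev)
{
    ++*(int *)dev;
    return 1;
}
```

The trade-off he describes is visible in the loop: the unmasked fast
path never enters boot_irq_demux(), while the masked path pays for up
to four ISR calls per boot interrupt.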
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Jon Masters @ 2009-01-12 19:25 UTC
To: Bjorn Helgaas
Cc: Stefan Assmann, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki

On Mon, 2009-01-12 at 11:51 -0700, Bjorn Helgaas wrote:
> (I added Eric, Maciej, and Jon because they participated in
> previous discussion here: http://lkml.org/lkml/2008/6/2/269)

Thanks. You know what I'd really like even more than being on the CC?
I'd *love* someone to post a link to documentation on how this actually
is supposed to work. We had to guess last time because none of the
public documentation actually explains this. The guys at SuSE likely
received some docs, but I'm not sure where from or the title thereof.

If we all knew how this was supposed to work then we might have a much
better likelihood of fixing this behavior. It's only going to get worse
over time - we want to get threaded IRQs upstream (I'm about to be
poking at that again over here) and that'll mean mainline has to learn
to deal with these boot interrupts just as much as RT does today.

Jon.
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Bjorn Helgaas @ 2009-01-12 19:45 UTC
To: Jon Masters
Cc: Stefan Assmann, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki

On Monday 12 January 2009 12:25:49 pm Jon Masters wrote:
> On Mon, 2009-01-12 at 11:51 -0700, Bjorn Helgaas wrote:
> > (I added Eric, Maciej, and Jon because they participated in
> > previous discussion here: http://lkml.org/lkml/2008/6/2/269)
>
> Thanks. You know what I'd really like even more than being on the CC?
> I'd *love* someone to post a link to documentation on how this actually
> is supposed to work.

I don't work for Intel, so I don't have any internal knowledge. All
I know is what I read in section 2.15 of the 6700PXH datasheet here:
http://www.intel.com/Assets/PDF/datasheet/302628.pdf
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Stefan Assmann @ 2009-01-13 13:32 UTC
To: Jon Masters
Cc: Bjorn Helgaas, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki

Hi Jon,

Jon Masters wrote:
> On Mon, 2009-01-12 at 11:51 -0700, Bjorn Helgaas wrote:
>> (I added Eric, Maciej, and Jon because they participated in
>> previous discussion here: http://lkml.org/lkml/2008/6/2/269)
>
> Thanks. You know what I'd really like even more than being on the CC?
> I'd *love* someone to post a link to documentation on how this actually
> is supposed to work. We had to guess last time because none of the
> public documentation actually explains this. The guys at SuSE likely
> received some docs, but I'm not sure where from or the title thereof.

Actually, most of the Boot Interrupt patches resulted from reading the
Intel specs and observing the behavior of failing machines. We're
trying to wrap up all the information gathered in a paper, which is
pretty time consuming and still a few steps away from being ready to
publish.

> If we all knew how this was supposed to work then we might have a much
> better likelihood of fixing this behavior. It's only going to get worse
> over time - we want to get threaded IRQs upstream (I'm about to be
> poking at that again over here) and that'll mean mainline has to learn
> to deal with these boot interrupts just as much as RT does today.

At the moment I'm uploading the slides we showed at the 10th
Real-Time Linux Workshop, which are a small, stripped-down excerpt of
our upcoming paper.

It should be available soon at:
ftp://ftp.suse.com/pub/people/sassmann/publication/boot_irq_quirks_rtlws10.pdf

Meanwhile I'd suggest having a look at United States Patent 6466998.
That is Intel's patent for this interrupt routing mechanism.

Hope this is helpful, we'll try to make more information available asap.

>
> Jon.
>

Stefan

--
Stefan Assmann         | SUSE LINUX Products GmbH
Software Engineer      | Maxfeldstr. 5, D-90409 Nuernberg
Mail: sassmann@suse.de | GF: Markus Rex, HRB 16746 (AG Nuernberg)
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Olaf Dabrunz @ 2009-01-13 18:22 UTC
To: Stefan Assmann
Cc: Jon Masters, Bjorn Helgaas, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki

On 13-Jan-09, Stefan Assmann wrote:
> Hi Jon,
>
> Jon Masters wrote:
> > On Mon, 2009-01-12 at 11:51 -0700, Bjorn Helgaas wrote:
> >> (I added Eric, Maciej, and Jon because they participated in
> >> previous discussion here: http://lkml.org/lkml/2008/6/2/269)
> >
> > Thanks. You know what I'd really like even more than being on the CC?
> > I'd *love* someone to post a link to documentation on how this actually
> > is supposed to work. We had to guess last time because none of the
> > public documentation actually explains this. The guys at SuSE likely
> > received some docs, but I'm not sure where from or the title thereof.
>
> Actually, most of the Boot Interrupt patches resulted from reading the
> Intel specs and observing the behavior of failing machines. We're trying
> to wrap up all the information gathered in a paper, which is pretty time
> consuming and a few steps away from being ready to publish.

We read the specs of the chips, analyzed code, put up hypotheses and
tested them. Then we read the specs again, formed new hypotheses and
tested again. Reading what you guys sent out to lkml and reading
through the public discussions for *BSD, Darwin, XEN etc. also helped.
And some colleagues discussed hypotheses and solution attempts with us.

We summarized many of our end results in a presentation that we gave
internally, and I am working on making this available online (in
addition to the paper/slides that Stefan mentions below).

The "real" paper still needs much work.

> > If we all knew how this was supposed to work then we might have a much
> > better likelihood of fixing this behavior. It's only going to get worse
> > over time - we want to get threaded IRQs upstream (I'm about to be
> > poking at that again over here) and that'll mean mainline has to learn
> > to deal with these boot interrupts just as much as RT does today.
>
> At the moment I'm uploading the slides we showed at the 10th
> Real-Time Linux Workshop, which are a small, stripped down excerpt of
> our upcoming paper.
>
> It should be available soon at:
> ftp://ftp.suse.com/pub/people/sassmann/publication/boot_irq_quirks_rtlws10.pdf
>
> Meanwhile I'd suggest to have a look at United States Patent 6466998.
> That is Intel's patent for this interrupt routing mechanism.
>
> Hope this is helpful, we'll try to make more information available asap.
>
> >
> > Jon.
> >
>
> Stefan
>
> --
> Stefan Assmann         | SUSE LINUX Products GmbH
> Software Engineer      | Maxfeldstr. 5, D-90409 Nuernberg
> Mail: sassmann@suse.de | GF: Markus Rex, HRB 16746 (AG Nuernberg)

--
Olaf Dabrunz (od/odabrunz), SUSE Linux Products GmbH, Nürnberg
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Olaf Dabrunz @ 2009-01-15 15:34 UTC
To: Stefan Assmann, Jon Masters, Bjorn Helgaas, Len Brown, Ingo Molnar, Jesse Barnes, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki

On 13-Jan-09, Olaf Dabrunz wrote:
> On 13-Jan-09, Stefan Assmann wrote:
> > Hi Jon,
> >
> > Jon Masters wrote:
> > > On Mon, 2009-01-12 at 11:51 -0700, Bjorn Helgaas wrote:
> > >> (I added Eric, Maciej, and Jon because they participated in
> > >> previous discussion here: http://lkml.org/lkml/2008/6/2/269)
> > >
> > > Thanks. You know what I'd really like even more than being on the CC?
> > > I'd *love* someone to post a link to documentation on how this actually
> > > is supposed to work. We had to guess last time because none of the
> > > public documentation actually explains this. The guys at SuSE likely
> > > received some docs, but I'm not sure where from or the title thereof.
> >
> > Actually, most of the Boot Interrupt patches resulted from reading the
> > Intel specs and observing the behavior of failing machines. We're trying
> > to wrap up all the information gathered in a paper, which is pretty time
> > consuming and a few steps away from being ready to publish.
>
> We read the specs of the chips, analyzed code, put up hypotheses and
> tested them. Then read the specs again, finding new hypotheses and
> tested again. Also reading what you guys sent out to lkml and reading
> through the public discussions for *BSD, Darwin, XEN etc. helped.
> And some colleagues discussed hypotheses and solution attempts with us.
>
> We summarized many of our end results in a presentation that we gave
> internally, and I am working on making this available online (in
> addition to the paper/slides that Stefan mentions below).

The presentation can be found at

http://www.suse.de/~odabrunz/Boot_Interrupts_and_IRQ_Threads.pdf

Some parts of it have been corrected and updated.

Please do not miss the "Details" slides after the License and the
Disclaimer. The Details should answer many questions.

Thanks,

--
Olaf Dabrunz (od/odabrunz), SUSE Linux Products GmbH, Nürnberg
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Eric W. Biederman @ 2009-01-12 23:36 UTC
To: Bjorn Helgaas
Cc: Stefan Assmann, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki, Jon Masters

I don't get it. Why are we trying to do such a stupid thing?

This hardware behavior is not specific to boot interrupts or Intel.
This is the classic x86 ioapic behaviour of redirecting an ioapic irq
into a legacy irq when the ioapic entry is disabled.

If you really want not to have problems, ensure all irqs 0-15 are
disabled, and not needed. Otherwise you are taking the chance on
something like this happening.

Disabling irqs generically appears to be a crap shoot, and not on a
path hardware vendors look at or care about heavily.

Disabling an irq in hardware on every interrupt, increasing the cost
of the interrupt and walking down these neglected hardware paths seems
stupid. Especially when the interrupt line might be shared and we can
be disabling several devices at once.

Is this case really so interesting and compelling that we want to
fight through and figure out what we need to do to make this work
reliably on every x86 chipset?

Eric
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Jon Masters @ 2009-01-13 0:29 UTC
To: Eric W. Biederman
Cc: Bjorn Helgaas, Stefan Assmann, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki

On Mon, 2009-01-12 at 15:36 -0800, Eric W. Biederman wrote:

> This hardware behavior is not specific to boot interrupts or Intel.

It's not specific to Intel, but it is a specific compatibility behavior.

> Is this case really so interesting and compelling that we want to fight
> through and figure what we need to do to make this work reliably on every
> x86 chipset?

How else do you propose implementing IRQ handling in e.g. the RT kernel?
We get a hardware interrupt, we can't FastEOI, we can't process
synchronously, we can't do all of those things you might expect.
Implementing RT requires that we delay handling of the IRQ until
arbitrarily later in the future when we get around to it.

Jon.
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Ingo Molnar @ 2009-01-13 1:47 UTC
To: Jon Masters
Cc: Eric W. Biederman, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki

* Jon Masters <jcm@redhat.com> wrote:

> On Mon, 2009-01-12 at 15:36 -0800, Eric W. Biederman wrote:
>
> > This hardware behavior is not specific to boot interrupts or Intel.
>
> It's not specific to Intel, but it is a specific compatibility behavior.
>
> > Is this case really so interesting and compelling that we want to
> > fight through and figure what we need to do to make this work reliably
> > on every x86 chipset?
>
> How else do you propose implementing IRQ handling in e.g. the RT kernel?
> We get a hardware interrupt, we can't FastEOI, we can't process
> synchronously, we can't do all of those things you might expect.
> Implementing RT requires that we delay handling of the IRQ until
> arbitrarily later in the future when we get around to it.

a number of mainline drivers also mask/unmask irqs from within the IRQ
handler. It's not particularly smart in a native driver, but can happen -
and if we get an active line after that point (and this can happen because
the driver is active), we are in trouble.

	Ingo
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Eric W. Biederman @ 2009-01-13 3:47 UTC
To: Ingo Molnar
Cc: Jon Masters, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki

Ingo Molnar <mingo@elte.hu> writes:

> * Jon Masters <jcm@redhat.com> wrote:
>
>> On Mon, 2009-01-12 at 15:36 -0800, Eric W. Biederman wrote:
>>
>> > This hardware behavior is not specific to boot interrupts or Intel.
>>
>> It's not specific to Intel, but it is a specific compatibility behavior.
>>
>> > Is this case really so interesting and compelling that we want to
>> > fight through and figure what we need to do to make this work reliably
>> > on every x86 chipset?
>>
>> How else do you propose implementing IRQ handling in e.g. the RT kernel?
>> We get a hardware interrupt, we can't FastEOI, we can't process
>> synchronously, we can't do all of those things you might expect.
>> Implementing RT requires that we delay handling of the IRQ until
>> arbitrarily later in the future when we get around to it.
>
> a number of mainline drivers also mask/unmask irqs from within the IRQ
> handler. It's not particularly smart in a native driver, but can happen -
> and if we get an active line after that point (and this can happen because
> the driver is active), we are in trouble.

Yep. Right now it might be simpler to fix the mainline drivers.

If we can sit down and write some nice clean obviously correct patches
I am all for fixing this bug, possibly even a chipset at a time.

However this is a really weird case and people seem to be really
struggling to understand what is going on and to write those patches.

We are outside the descriptions provided by ACPI so it requires
chipset specific knowledge, and a general understanding of how
chipsets work to actually even comprehend the problem.

Eric
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent
From: Jon Masters @ 2009-01-13 4:26 UTC
To: Eric W. Biederman
Cc: Ingo Molnar, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki

On Mon, 2009-01-12 at 19:47 -0800, Eric W. Biederman wrote:
> Ingo Molnar <mingo@elte.hu> writes:

> > a number of mainline drivers also mask/unmask irqs from within the IRQ
> > handler. It's not particularly smart in a native driver, but can happen -
> > and if we get an active line after that point (and this can happen because
> > the driver is active), we are in trouble.
>
> Yep. Right now it might be simpler to fix the mainline drivers.

Taking the easy option now doesn't make the pain go away later :) Just
because ACPI doesn't provide a handy description doesn't mean we
shouldn't handle "boot interrupts" - the kernel is riddled with quirks
already to deal with broken, buggy, or just quirky hardware scenarios.

> We are outside the descriptions provided by ACPI so it requires
> chipset specific knowledge, and a general understanding of how
> chipsets work to actually even comprehend the problem.

But how does that differ from most other chipset code? I'm not being
belligerent but I'm not seeing how your argument is uniquely special to
this particular situation. Personally, I'm a little biased because I'd
eventually like to see RT merged upstream and I /know/ that's going to
re-open this whole can of worms once again, even if it's "fixed" now.

Jon.
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-13 4:26 ` Jon Masters @ 2009-01-14 11:40 ` Ingo Molnar 2009-01-14 19:18 ` Jon Masters 0 siblings, 1 reply; 30+ messages in thread From: Ingo Molnar @ 2009-01-14 11:40 UTC (permalink / raw) To: Jon Masters Cc: Eric W. Biederman, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki * Jon Masters <jcm@redhat.com> wrote: > On Mon, 2009-01-12 at 19:47 -0800, Eric W. Biederman wrote: > > Ingo Molnar <mingo@elte.hu> writes: > > > > a number of mainline drivers also mask/unmask irqs from within the IRQ > > > handler. It's not particularly smart in a native driver, but can happen - > > > and if we get an active line after that point (and this can happen because > > > the driver is active), we are in trouble. > > > > Yep. Right now it might be simpler to fix the mainline drivers. > > Taking the easy option now doesn't make the pain go away later :) Just > because ACPI doesn't provide a handy description doesn't mean we > shouldn't handle "boot interrupts" - the kernel is riddled with quirks > already to deal with broken, buggy, or just quirky hardware scenarios. > > > We are outside the descriptions provided by ACPI so it requires > > chipset specific knowledge, and a general understanding of how > > chipsets work to actually even comprehend the problem. > > But how does that differ from most other chipset code? I'm not being > belligerent but I'm not seeing how your argument is uniquely special to > this particular situation. Personally, I'm a little biased because I'd > eventually like to see RT merged upstream and I /know/ that's going to > re-open this whole can of worms once again, even if it's "fixed" now. 
It's not just -rt: it is also needed for threaded IRQ handlers, which were discussed at the Kernel Summit as desirable for mainline. Ingo ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 11:40 ` Ingo Molnar @ 2009-01-14 19:18 ` Jon Masters 2009-01-14 22:42 ` Eric W. Biederman 0 siblings, 1 reply; 30+ messages in thread From: Jon Masters @ 2009-01-14 19:18 UTC (permalink / raw) To: Ingo Molnar Cc: Eric W. Biederman, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki On Wed, 2009-01-14 at 12:40 +0100, Ingo Molnar wrote: > it's not just -rt, but it is also needed for the concept of threaded IRQ > handlers - which was discussed at the Kernel Summit to be desired for > mainline. Right. I'm poking at Thomas' patches and hope to post something soon on that front - I'm acutely aware that this will be impacted as well, but because it's vaguely RT related I had put it under that banner. Jon. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 19:18 ` Jon Masters @ 2009-01-14 22:42 ` Eric W. Biederman 2009-01-14 22:53 ` Steven Rostedt ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Eric W. Biederman @ 2009-01-14 22:42 UTC (permalink / raw) To: Jon Masters Cc: Ingo Molnar, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki Jon Masters <jcm@redhat.com> writes: > On Wed, 2009-01-14 at 12:40 +0100, Ingo Molnar wrote: > >> it's not just -rt, but it is also needed for the concept of threaded IRQ >> handlers - which was discussed at the Kernel Summit to be desired for >> mainline. > > Right. I'm poking at Thomas' patches and hope to post something soon on > that front - I'm acutely aware that this will be impacted aswell but > because it's vaguely RT related had banded it under that banner. Stepping back a moment. The only way I can see this working reliably is if we disable the boot interrupt. Anything that leaves the boot interrupt enabled means that when we disable the primary interrupt the boot interrupt will scream, and thus we must disable it as well. Which leads to my problem with the entire development process of this feature. People want the feature. People don't want to pay attention to the limits of the hardware. Which leads to countless broken patches proposed. Which leads me to conclude. - IRQ handling in the RT kernel is hopelessly broken. - IRQ threads are a bad idea. Because it is all leading to stupid patches and stupid development. None of this works reliably on level triggered ioapic irqs. Eric ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 22:42 ` Eric W. Biederman @ 2009-01-14 22:53 ` Steven Rostedt 2009-01-14 22:56 ` Jon Masters 2009-01-15 10:16 ` Stefan Assmann 2 siblings, 0 replies; 30+ messages in thread From: Steven Rostedt @ 2009-01-14 22:53 UTC (permalink / raw) To: Eric W. Biederman Cc: Jon Masters, Ingo Molnar, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki On Wed, 14 Jan 2009, Eric W. Biederman wrote: > Jon Masters <jcm@redhat.com> writes: > > Which leads me to conclude. > - IRQ handling in the RT kernel is hopelessly broken. > - IRQ threads are a bad idea. I would rephrase that to -- boot interrupts are hopelessly broken > > Because it is all leading to stupid patches and stupid development. Hmm, WTF should we have maskable interrupts for anyway? Sounds like broken hardware design to me. > > None of this works reliably on level triggered ioapic irqs. > I guess this means that ioapic's are not for RTOS of any kind. -- Steve ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 22:42 ` Eric W. Biederman 2009-01-14 22:53 ` Steven Rostedt @ 2009-01-14 22:56 ` Jon Masters 2009-01-15 12:36 ` Olaf Dabrunz 2009-01-15 10:16 ` Stefan Assmann 2 siblings, 1 reply; 30+ messages in thread From: Jon Masters @ 2009-01-14 22:56 UTC (permalink / raw) To: Eric W. Biederman Cc: Ingo Molnar, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki On Wed, 2009-01-14 at 14:42 -0800, Eric W. Biederman wrote: > Jon Masters <jcm@redhat.com> writes: > > > On Wed, 2009-01-14 at 12:40 +0100, Ingo Molnar wrote: > > > >> it's not just -rt, but it is also needed for the concept of threaded IRQ > >> handlers - which was discussed at the Kernel Summit to be desired for > >> mainline. > > > > Right. I'm poking at Thomas' patches and hope to post something soon on > > that front - I'm acutely aware that this will be impacted aswell but > > because it's vaguely RT related had banded it under that banner. > > Stepping back a moment. The only way I can see this working reliably > is if we disable the boot interrupt. Anything that leaves the boot interrupt > enabled means that when we disable the primary interrupt the boot interrupt > will scream, and thus we must disable it as well. > > Which leads to my problem with the entire development process of this feature. > > People want the feature. > People don't want to pay attention to the limits of the hardware. > Which leads to countless broken patches proposed. Is a patch broken because hardware has limitations? If that were always true then many of the patches we see in the kernel wouldn't be there. > Which leads me to conclude. > - IRQ handling in the RT kernel is hopelessly broken. Nope. 
It's done in a very similar way to other real time kernels already out there - really there are only so many ways to do this. > - IRQ threads are a bad idea. Why? IRQ threads actually make life so much easier - you have a task context, you can do everything inside that rather than scheduling all kinds of deferred work (that in RT will be done in another task later), and so forth. > None of this works reliably on level triggered ioapic irqs. Level triggered IOAPIC IRQs have quirks, film at 11! Jon. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 22:56 ` Jon Masters @ 2009-01-15 12:36 ` Olaf Dabrunz 0 siblings, 0 replies; 30+ messages in thread From: Olaf Dabrunz @ 2009-01-15 12:36 UTC (permalink / raw) To: Jon Masters Cc: Eric W. Biederman, Ingo Molnar, Bjorn Helgaas, Stefan Assmann, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki On 14-Jan-09, Jon Masters wrote: > On Wed, 2009-01-14 at 14:42 -0800, Eric W. Biederman wrote: > > Jon Masters <jcm@redhat.com> writes: > > > > > On Wed, 2009-01-14 at 12:40 +0100, Ingo Molnar wrote: > > > > > >> it's not just -rt, but it is also needed for the concept of threaded IRQ > > >> handlers - which was discussed at the Kernel Summit to be desired for > > >> mainline. > > > > > > Right. I'm poking at Thomas' patches and hope to post something soon on > > > that front - I'm acutely aware that this will be impacted aswell but > > > because it's vaguely RT related had banded it under that banner. > > > > Stepping back a moment. The only way I can see this working reliably > > is if we disable the boot interrupt. Anything that leaves the boot interrupt > > enabled means that when we disable the primary interrupt the boot interrupt > > will scream, and thus we must disable it as well. > > > > Which leads to my problem with the entire development process of this feature. > > > > People want the feature. > > People don't want to pay attention to the limits of the hardware. > > Which leads to countless broken patches proposed. > > Is a patch broken because hardware has limitations? If that were always > true then many of the patches we see in the kernel wouldn't be there. > > > Which leads me to conclude. > > - IRQ handling in the RT kernel is hopelessly broken. > > Nope. 
It's done in a very similar way to other real time kernels already
> out there - really there are only so many ways to do this.
>
> > - IRQ threads are a bad idea.
>
> Why? IRQ threads actually make life so much easier - you have a task
> context, you can do everything inside that rather than scheduling all
> kinds of deferred work (that in RT will be done in another task later),
> and so forth.
>
> > None of this works reliably on level triggered ioapic irqs.

Actually it works very well. The patches are also not _that_ complicated. We have two kinds of patches:

1) if possible on the chipset, disable boot irqs (if in APIC mode)
   - this works as designed and has no problems
   - we cover up for broken BIOSes here that forget to disable boot irqs

2) if we cannot disable boot irqs, disable the original interrupt line and only use the boot irq line
   -> no duplicated interrupts, but increased interrupt sharing
   - this is a hack for older, but widely used chipsets
   - we cover up for broken hardware (wrt threaded IRQ handling)
   - when the broken hardware falls out of use, this is not needed anymore

The only limitations we have so far:

- For newer chipsets, we have to make sure that we find the disable bit. Vendors should put this bit in a common place.
- It is a lot of work to describe all of the findings and experimental results we had, in a digestible way.

> Level triggered IOAPIC IRQs have quirks, film at 11!

Love this. *grin* :)

-- Olaf Dabrunz (od/odabrunz), SUSE Linux Products GmbH, Nürnberg ^ permalink raw reply [flat|nested] 30+ messages in thread
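The choice between the two kinds of patches described in the message above can be written down as a small decision function. This is only a sketch under stated assumptions, not the code from the patches: `struct chipset_caps`, `enum boot_irq_fix` and `choose_fix()` are hypothetical names, and treating non-APIC mode as "no fix needed" is an assumption drawn from the "(if in APIC mode)" remark.

```c
#include <stdbool.h>

/* Hypothetical description of what a bridge/chipset can do.
 * These fields are illustrative and do not exist in the kernel. */
struct chipset_caps {
    bool generates_boot_irqs; /* chipset forwards masked IRQs as boot interrupts */
    bool has_disable_bit;     /* boot interrupt generation can be switched off */
};

enum boot_irq_fix {
    FIX_NONE,                 /* nothing to do */
    FIX_DISABLE_BOOT_IRQ,     /* patch kind 1: disable boot interrupts */
    FIX_REROUTE_TO_BOOT_IRQ   /* patch kind 2: use only the boot interrupt line */
};

/* Pick a workaround: prefer disabling boot interrupts, and fall back to
 * rerouting only when the chipset offers no way to disable them. */
static enum boot_irq_fix choose_fix(const struct chipset_caps *c, bool apic_mode)
{
    /* Assumption: in PIC mode the boot interrupt is the normal delivery path. */
    if (!apic_mode || !c->generates_boot_irqs)
        return FIX_NONE;
    if (c->has_disable_bit)
        return FIX_DISABLE_BOOT_IRQ;
    return FIX_REROUTE_TO_BOOT_IRQ;
}
```

The ordering reflects the preference stated in the thread: rerouting costs extra interrupt sharing, so it is only chosen when disabling is impossible.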
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 22:42 ` Eric W. Biederman 2009-01-14 22:53 ` Steven Rostedt 2009-01-14 22:56 ` Jon Masters @ 2009-01-15 10:16 ` Stefan Assmann 2 siblings, 0 replies; 30+ messages in thread From: Stefan Assmann @ 2009-01-15 10:16 UTC (permalink / raw) To: Eric W. Biederman Cc: Jon Masters, Ingo Molnar, Bjorn Helgaas, Len Brown, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Maciej W. Rozycki Eric W. Biederman wrote: > Jon Masters <jcm@redhat.com> writes: > >> On Wed, 2009-01-14 at 12:40 +0100, Ingo Molnar wrote: >> >>> it's not just -rt, but it is also needed for the concept of threaded IRQ >>> handlers - which was discussed at the Kernel Summit to be desired for >>> mainline. >> Right. I'm poking at Thomas' patches and hope to post something soon on >> that front - I'm acutely aware that this will be impacted aswell but >> because it's vaguely RT related had banded it under that banner. > > Stepping back a moment. The only way I can see this working reliably > is if we disable the boot interrupt. Anything that leaves the boot interrupt > enabled means that when we disable the primary interrupt the boot interrupt > will scream, and thus we must disable it as well. Disabling Boot Interrupts is our goal, if they don't appear everything is fine. Now if you take a closer look at United States Patent 6466998 you can read that there has to be a way to disable Boot Interrupts on APIC aware OSes. Let me back this up with a quote from that patent: "Therefore, 8259 PIC may be incorporated into a system board along with the APIC system to ensure proper operation of an operating system (OS) regardless whether such an operating system (OS) may or may not support an APIC system. However, external logic devices are required to route particular interrupts from a non-legacy peripheral bus to the 8259 PIC. 
General purpose I/O pins are then needed to enable/disable this functionality once an operating system (OS) which understands the APIC system is loaded."

We already managed to successfully disable Boot Interrupt generation on several chipsets. See the following posts:

http://lkml.org/lkml/2008/7/8/213
http://lkml.org/lkml/2008/7/8/215
http://lkml.org/lkml/2008/6/2/270

What makes things complicated is that we have to deal with buggy hardware which can't disable the generation of Boot Interrupts; that's why we introduced the reroute to legacy interrupt patch.

> Which leads to my problem with the entire development process of this feature.
>
> People want the feature.
> People don't want to pay attention to the limits of the hardware.
> Which leads to countless broken patches proposed.

Sorry if you feel that way; we are really trying to pay attention to hardware limitations here. I bet you have a lot more experience in this field than I have, so please let me know what technical reasons you see for this being a dead end.

> Which leads me to conclude.
> - IRQ handling in the RT kernel is hopelessly broken.
> - IRQ threads are a bad idea.
>
> Because it is all leading to stupid patches and stupid development.
>
> None of this works reliably on level triggered ioapic irqs.

We're trying really hard to make it work reliably, but this is a complicated matter and it sure needs a lot of research.

>
> Eric

Stefan

--
Stefan Assmann | SUSE LINUX Products GmbH
Software Engineer | Maxfeldstr. 5, D-90409 Nuernberg
Mail: sassmann@suse.de | GF: Markus Rex, HRB 16746 (AG Nuernberg)
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-12 18:51 ` Bjorn Helgaas 2009-01-12 19:25 ` Jon Masters 2009-01-12 23:36 ` Eric W. Biederman @ 2009-01-13 11:18 ` Stefan Assmann 2009-01-13 15:57 ` Olaf Dabrunz 2 siblings, 1 reply; 30+ messages in thread From: Stefan Assmann @ 2009-01-13 11:18 UTC (permalink / raw) To: Bjorn Helgaas Cc: Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki, Jon Masters Bjorn Helgaas wrote: > (I added Eric, Maciej, and Jon because they participated in > previous discussion here: http://lkml.org/lkml/2008/6/2/269) > > On Monday 12 January 2009 04:09:25 am Stefan Assmann wrote: >> Len Brown wrote: >>> Stefan, >>> I had to exclude your changes to drivers/acpi/pci_irq.c from >>> e1d3a90846b40ad3160bf4b648d36c6badad39ac >>> in order to get some other changes to that file upstream in the >>> 2.6.29 merge window. ... >> Let me try to give you a short overview of what's happening there. >> >> If an IRQ arrives at line X of a non-primary IO-APIC and that line is >> masked a new IRQ will be generated on the primary IO-APIC/PIC. This is >> called a "Boot Interrupt" by Intel. It's purpose is, as the name >> suggests, to ensure that the IRQ is handled at boot time (when the >> non-primary) IO-APIC is still disabled. >> >> Condition to be met for "Boot Interrupts": >> - line X on non-primary IO-APIC interrupt line is masked >> - line X is asserted > > Thanks. Let me replay this to see whether I understand. Please > correct any of my misapprehensions :-) > > Since your patch doesn't look at the IOxAPIC RDL register to see > whether the pin is masked, you must be assuming that Linux is > *always* using boot interrupts, and never unmasking the non-primary > IOxAPIC entry. Yes, for this special case that is correct. 
> It looks like for each PCI bus, the 6700PXH contains a 24-input IOxAPIC
> with 16 of the inputs available for PCI interrupts. The boot interrupt
> feature generates only INTA-INTD messages (total of four choices).
> I think that means your patch forces sharing, e.g. pins 0, 4, 8, and
> 12 all generate INTA on PCI Express, so they will share the same IRQ.
> If the non-primary IOxAPIC entries were unmasked, the boot interrupt
> would not be generated, so those pins could all have separate IRQs.

Indeed, this method introduces a kind of artificial interrupt sharing. The trade-off here is having a reliably working system with the possibility of slightly increased interrupt sharing rather than a system that shows all kinds of nasty behaviors, as Ingo pointed out.

How much of a performance hit does interrupt sharing cause anyway? Does anybody have some numbers?

>
> The 6700PXH boot interrupt behavior doesn't seem to be configurable,
> so it should work the same even with "acpi=off", so I think we would
> have a similar issue in the pirq_enable_irq() path.
>
> I was about to ask why you have devices generating interrupts while
> their APIC entry is masked, but the discussion starting with Eric's
> response in this thread:
> http://lkml.org/lkml/2008/6/2/269
> suggests that the RT kernel masks APIC entries in the course of
> normal operation, while the device can still be generating interrupts.
> And these boot interrupts would certainly be a surprising side-effect
> of masking an APIC entry, even in a non-RT kernel.
>
> Basically, the interrupt routing changes depending on whether the
> APIC entry is masked. That seems pretty ugly, and I can't think of
> any nice way to describe that behavior via ACPI. It seems sub-optimal
> to add a constraint that we can't ever mask or unmask that APIC entry.

Well, it shouldn't be that bad. Mind you this only applies to non-primary IO-APICs.
These IO-APICs are usually used to handle a separate bus, which usually only has a few devices hanging around. > What if we installed an extra ISR for the boot interrupt that just > invoked the ISR for each device on each of the APIC pins that can > generate that boot interrupt? You might end up invoking the same ISR twice. Once by the original IRQ followed by a Boot Interrupt that is generated during masked interrupt handling. > I.e., for the 6700PXH case, install an > ISR for INTA, and have that ISR call the ISRs for all the devices on > pins 0, 4, 8, and 12. Then the normal case would be that the APIC entry > is unmasked, we don't share the IRQ, and the ISR is called directly. > But when the APIC entry is masked, we take the boot interrupt and pay > the penalty of calling some extra ISRs. Don't you think that would clutter /proc/interrupts? The other question I have, is that really worth the effort? Another thing that comes to mind is, what if there is no other device on the interrupt line where INTA, INTB, etc. end up. Would you want to permanently unmask all these interrupt lines in case a Boot Interrupt shows up even if there are no devices on the interrupt lines? The whole case is pretty tricky. > > Bjorn > >> This behavior is not necessary during normal operation as the IRQ is >> handled by the non-primary IO-APIC itself. Now imagine what happens if >> these Boot Interrupts would occur during normal operation. You'd see >> spurious IRQs on your primary IO-APIC which, in the worst case, will >> bring down the interrupt line they occur on! Every device that shares this >> interrupt line will fail when this happens. >> >> Why can these IRQ lines be brought down by Boot Interrupts? Because >> there's no handler installed on the primary IO-APIC IRQ line that can >> take care of them and after too many unhandled IRQs the line will be shut >> down by the kernel. 
>> >> What this quirk does: >> It installs the interrupt handler on the primary IO-APICs interrupt line >> instead of the (original) non-primary IO-APICs interrupt line, keeping >> the original interrupt line masked. This guarantees that for every IRQ >> arriving at the non-primary IO-APIC a Boot Interrupt is generated _and_ >> handled properly. >> >> Note: You need this quirk if you mask your interrupts during handling. >> >>> The quirk is specific to Intel chipsets, so with all the >>> Linux guys now working at Intel, I'm hopeful that we can >>> reach a clear understanding of the issue and a consensus >>> on the proper fix. >>> >>> BTW. I'm not excited about how the original patch >>> drops a chipset specific workaround inside the ACPI code >>> to go behind the mirrors and lie about what ACPI returns. >>> I'm hopeful that a better place for the workaround >>> can be found if this is the approach we need to take.. >> Yes, I agree with you and I'm totally open for discussing a better place >> for this quirk if you find any. You see the problem here is to "move" the >> interrupt handler from one line to another and so far we haven't found a >> better way to do this. If I'm not mistaken this technically is pretty >> similar to what this new "derive an IRQ for this device from a parent >> bridge" does. >> >>> Can you help us understand what the failure is? >> I hope this explains a little bit what this is all about. In case your >> looking for more in-depth information about the interrupt routing I'd >> suggest to have a look for example at the Intel® 6700PXH 64-bit PCI Hub >> Datasheet, especially chapter 2.15 I/OxAPIC Interrupt Controller. Feel >> free to ask any further questions that might arise. >> >> Taking a look at the new pci_irq.c code now, could you tell me where >> exactly you're seeing trouble with the quirk. It shouldn't be too >> troublesome to put it in again, but let me have a look and thanks for >> your efforts. 
>> >>> thanks, >>> -Len >> Stefan > Stefan -- Stefan Assmann | SUSE LINUX Products GmbH Software Engineer | Maxfeldstr. 5, D-90409 Nuernberg Mail: sassmann@suse.de | GF: Markus Rex, HRB 16746 (AG Nuernberg) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-13 11:18 ` Stefan Assmann @ 2009-01-13 15:57 ` Olaf Dabrunz 2009-01-15 0:10 ` Bjorn Helgaas 0 siblings, 1 reply; 30+ messages in thread From: Olaf Dabrunz @ 2009-01-13 15:57 UTC (permalink / raw) To: Stefan Assmann Cc: Bjorn Helgaas, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki, Jon Masters On 13-Jan-09, Stefan Assmann wrote: > Bjorn Helgaas wrote: > > (I added Eric, Maciej, and Jon because they participated in > > previous discussion here: http://lkml.org/lkml/2008/6/2/269) > > > > On Monday 12 January 2009 04:09:25 am Stefan Assmann wrote: > >> Len Brown wrote: > >>> Stefan, > >>> I had to exclude your changes to drivers/acpi/pci_irq.c from > >>> e1d3a90846b40ad3160bf4b648d36c6badad39ac > >>> in order to get some other changes to that file upstream in the > >>> 2.6.29 merge window. ... > >> Let me try to give you a short overview of what's happening there. > >> > >> If an IRQ arrives at line X of a non-primary IO-APIC and that line is > >> masked a new IRQ will be generated on the primary IO-APIC/PIC. This is > >> called a "Boot Interrupt" by Intel. It's purpose is, as the name > >> suggests, to ensure that the IRQ is handled at boot time (when the > >> non-primary) IO-APIC is still disabled. > >> > >> Condition to be met for "Boot Interrupts": > >> - line X on non-primary IO-APIC interrupt line is masked > >> - line X is asserted > > > > Thanks. Let me replay this to see whether I understand. Please > > correct any of my misapprehensions :-) > > > > Since your patch doesn't look at the IOxAPIC RDL register to see > > whether the pin is masked, you must be assuming that Linux is > > *always* using boot interrupts, and never unmasking the non-primary > > IOxAPIC entry. > > Yes, for this special case that is correct. Right. 
The current reroute code assumes that all interrupts use masking. If you think about migrating the reroute code to new per-line (or per-device) masking/threading: this means we need to hook into request_irq() or some such, to find out if an interrupt handler requests a masked/threaded IRQ. The state of the mask bit in the IO-APIC RDL is not sufficient. It is switched on in the initial IRQ handler and switched off when the IRQ has been handled by the IRQ thread. A cleared mask bit does not mean that the IRQ-line will not be masked. > > It looks like for each PCI bus, the 6700PXH contains a 24-input IOxAPIC > > with 16 of the inputs available for PCI interrupts. The boot interrupt > > feature generates only INTA-INTD messages (total of four choices). > > I think that means your patch forces sharing, e.g. pins 0, 4, 8, and > > 12 all generate INTA on PCI Express, so they will share the same IRQ. > > If the non-primary IOxAPIC entries were unmasked, the boot interrupt > > would not be generated, so those pins could all have separate IRQs. > > Indeed, this method introduces a kind of artificial interrupt sharing. > The trade-off here is, having a reliably working system with the > possibility of slightly increased interrupt sharing rather than a system > that shows all kinds of nasty behaviors as Ingo pointed out. > > How much of a performance hit gives interrupt sharing anyway? Does > anybody have some numbers? IRQ sharing mainly has an effect on latency. I do not expect the latency changes to be large, so this is not an issue for non-RT computing. But for RT computing increased latency is an issue. So why would we increase the latency for RT computing with our reroute code? Because it is the only approach that reliably targets and fixes the problems caused by broken hardware for RT. The only alternative to that is to use different hardware. 
Making the widely available broken hardware work at the cost of a bit of added latency seems to be a good tradeoff (and the best tradeoff available, as far as we know). We have not seen any newer bridges that are broken, so we expect that eventually the need for rerouting will go away. But this is probably years in the future, when the broken hardware is not being used anymore. We were preparing a paper which describes the problem and all the approaches we considered and/or tried. We also have a presentation about this. Unfortunately, due to the huge number of other interesting presentations, we could not give our presentation at the LPC and the paper was not finished. But we have a simplified version of the paper and are working on making it and the presentation slides available online. > > The 6700PXH boot interrupt behavior doesn't seem to be configurable, > > so it should work the same even with "acpi=off", so I think we would > > have a similar issue in the pirq_enable_irq() path. > > > > I was about to ask why you have devices generating interrupts while > > their APIC entry is masked, but the discussion starting with Eric's > > response in this thread: > > http://lkml.org/lkml/2008/6/2/269 > > suggests that the RT kernel masks APIC entries in the course of > > normal operation, while the device can still be generating interrupts. Threaded IRQ handling masks the IRQ while the original IRQ is still pending. This is done so that IRQs in general can be activated quickly again (by leaving the initial IRQ handler) in order to minimize the times where other, high-priority threads cannot run. The masking then prevents the pending IRQ from re-activating the IRQ handler. Later the IRQ thread will handle the IRQ and unmask the line. > > And these boot interrupts would certainly be a surprising side-effect > > of masking an APIC entry, even in a non-RT kernel. Yes. To prevent this, the patent by Intel describes a bit that disables boot interrupts. 
But this bit has not been implemented in several Intel chips. > > Basically, the interrupt routing changes depending on whether the > > APIC entry is masked. That seems pretty ugly, and I can't think of > > any nice way to describe that behavior via ACPI. It seems sub-optimal Yes, I agree: if I was a BIOS manufacturer and had to write the ACPI tables for boot interrupts, I would not see a way to describe this behaviour in the ACPI tables either. > > to add a constraint that we can't ever mask or unmask that APIC entry. Yes, preventing boot interrupts by completely disallowing masking in software is not a solution either: masking is needed for threaded interrupt handling and for disabling (!) interrupt lines (although this also does not work as expected on broken chips, as the IRQ will then be delivered via the boot interrupt mechanism -- in the case of screaming IRQs this leads to a situation where both lines are disabled). > Well, it shouldn't be that bad. Mind you this only applies to > non-primary IO-APICs. These IO-APICs are usually used to handle a > separate bus, which usually only has a few devices hanging around. > > > What if we installed an extra ISR for the boot interrupt that just > > invoked the ISR for each device on each of the APIC pins that can > > generate that boot interrupt? > > You might end up invoking the same ISR twice. Once by the original IRQ > followed by a Boot Interrupt that is generated during masked interrupt > handling. And this will happen every time. And as you (Stefan) reminded me, the second of the two invocations of the ISR may not see the IRQ asserted on the device anymore (e.g. if the boot interrupt line was masked, and if it is unmasked after the threaded IRQ handler de-asserted the IRQ on the device). So we may end up having screaming IRQs on one IRQ line, leading to the nasty effects again that we needed to avoid. This failure mode is also very difficult to reproduce and analyze. 
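To make the failure mode discussed above concrete, here is a minimal user-space model of one threaded-handling cycle on a chipset that still generates boot interrupts. It is a sketch under stated assumptions, not kernel code: all struct and function names are hypothetical, and `UNHANDLED_LIMIT` is an arbitrary illustrative threshold, not the kernel's actual spurious-IRQ limit. The only rule taken from the thread is the boot interrupt condition (line masked and line asserted).

```c
#include <stdbool.h>

/* Illustrative threshold only; the real kernel uses its own accounting. */
#define UNHANDLED_LIMIT 1000

/* One input line of a non-primary IO-APIC (hypothetical model). */
struct line_model {
    bool asserted;         /* device is driving the level-triggered line */
    bool masked;           /* mask bit set in the IO-APIC entry */
    bool boot_irq_enabled; /* chipset still forwards boot interrupts */
};

/* The line on the primary IO-APIC/PIC where boot interrupts arrive. */
struct primary_line {
    bool has_handler;       /* true once the handler was rerouted here */
    bool disabled;          /* shut down after too many unhandled IRQs */
    unsigned int unhandled; /* count of IRQs nobody claimed */
};

/* Boot interrupt condition from the thread: masked AND asserted. */
static bool boot_irq_fires(const struct line_model *l)
{
    return l->boot_irq_enabled && l->masked && l->asserted;
}

/* Deliver one boot interrupt to the primary line. */
static void deliver_to_primary(struct primary_line *p)
{
    if (p->disabled || p->has_handler)
        return; /* either already dead, or handled properly */
    if (++p->unhandled >= UNHANDLED_LIMIT)
        p->disabled = true; /* every device sharing this line now fails */
}

/* One threaded-IRQ cycle. Returns true if a boot interrupt was generated. */
static bool threaded_irq_cycle(struct line_model *l, struct primary_line *p)
{
    bool seen = false;

    l->asserted = true; /* device raises its IRQ */
    l->masked = true;   /* initial handler masks the line, wakes the thread */
    if (boot_irq_fires(l)) {
        seen = true;
        deliver_to_primary(p);
    }
    l->asserted = false; /* IRQ thread services the device ... */
    l->masked = false;   /* ... and unmasks the line again */
    return seen;
}
```

In this model every cycle produces a boot interrupt while `boot_irq_enabled` is set. With no handler on the primary line the unhandled count grows until the line is shut down; with `has_handler` set (the reroute case) or `boot_irq_enabled` cleared (the disable case) it never is.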
> > I.e., for the 6700PXH case, install an > > ISR for INTA, and have that ISR call the ISRs for all the devices on > > pins 0, 4, 8, and 12. Then the normal case would be that the APIC entry > > is unmasked, we don't share the IRQ, and the ISR is called directly. > > But when the APIC entry is masked, we take the boot interrupt and pay > > the penalty of calling some extra ISRs. > > Don't you think that would clutter /proc/interrupts? The other question > I have, is that really worth the effort? Another thing that comes to > mind is, what if there is no other device on the interrupt line where > INTA, INTB, etc. end up. Would you want to permanently unmask all these > interrupt lines in case a Boot Interrupt shows up even if there are no > devices on the interrupt lines? The whole case is pretty tricky. > > > > > Bjorn > > > >> This behavior is not necessary during normal operation as the IRQ is > >> handled by the non-primary IO-APIC itself. Now imagine what happens if > >> these Boot Interrupts would occur during normal operation. You'd see > >> spurious IRQs on your primary IO-APIC which, in the worst case, will > >> bring down the interrupt line they occur on! Every device that shares this > >> interrupt line will fail when this happens. > >> > >> Why can these IRQ lines be brought down by Boot Interrupts? Because > >> there's no handler installed on the primary IO-APIC IRQ line that can > >> take care of them and after too many unhandled IRQs the line will be shut > >> down by the kernel. > >> > >> What this quirk does: > >> It installs the interrupt handler on the primary IO-APICs interrupt line > >> instead of the (original) non-primary IO-APICs interrupt line, keeping > >> the original interrupt line masked. This guarantees that for every IRQ > >> arriving at the non-primary IO-APIC a Boot Interrupt is generated _and_ > >> handled properly. > >> > >> Note: You need this quirk if you mask your interrupts during handling. 
> >>
> >>> The quirk is specific to Intel chipsets, so with all the
> >>> Linux guys now working at Intel, I'm hopeful that we can
> >>> reach a clear understanding of the issue and a consensus
> >>> on the proper fix.
> >>>
> >>> BTW. I'm not excited about how the original patch
> >>> drops a chipset specific workaround inside the ACPI code
> >>> to go behind the mirrors and lie about what ACPI returns.
> >>> I'm hopeful that a better place for the workaround
> >>> can be found if this is the approach we need to take..
> >> Yes, I agree with you and I'm totally open for discussing a better place
> >> for this quirk if you find any. You see the problem here is to "move" the
> >> interrupt handler from one line to another and so far we haven't found a
> >> better way to do this. If I'm not mistaken this technically is pretty
> >> similar to what this new "derive an IRQ for this device from a parent
> >> bridge" does.

Yes, we should better integrate this information and keep the information in the ACPI tables available. My suggestion is to add a "used_irq" field to acpi_prt_entry. If that field is valid (i.e. not -1), we use that IRQ number instead of the one reported in the ACPI table.

> >>> Can you help us understand what the failure is?
> >> I hope this explains a little bit what this is all about. In case your
> >> looking for more in-depth information about the interrupt routing I'd
> >> suggest to have a look for example at the Intel® 6700PXH 64-bit PCI Hub
> >> Datasheet, especially chapter 2.15 I/OxAPIC Interrupt Controller. Feel
> >> free to ask any further questions that might arise.
> >>
> >> Taking a look at the new pci_irq.c code now, could you tell me where
> >> exactly you're seeing trouble with the quirk. It shouldn't be too
> >> troublesome to put it in again, but let me have a look and thanks for
> >> your efforts.
> >> > >>> thanks, > >>> -Len > >> Stefan > > > > Stefan Thanks, -- Olaf Dabrunz (od/odabrunz), SUSE Linux Products GmbH, Nürnberg ^ permalink raw reply [flat|nested] 30+ messages in thread
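[Editor's sketch] Olaf's "used_irq" suggestion above can be pictured roughly as follows. This is only a minimal illustration under stated assumptions: `struct prt_entry_sketch`, `USED_IRQ_UNSET` and `prt_effective_irq()` are hypothetical names invented for this note, and the struct is a simplified stand-in, not the kernel's real acpi_prt_entry layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for acpi_prt_entry; only the two
 * fields relevant to the suggestion are modelled here. */
struct prt_entry_sketch {
    int acpi_irq;  /* IRQ number as reported by the ACPI _PRT */
    int used_irq;  /* override set by a quirk, or -1 if not set */
};

#define USED_IRQ_UNSET (-1)

/* Return the IRQ number a driver's handler should be installed on:
 * the quirk's override if one is set, otherwise the ACPI-reported IRQ. */
int prt_effective_irq(const struct prt_entry_sketch *entry)
{
    assert(entry != NULL);
    if (entry->used_irq != USED_IRQ_UNSET)
        return entry->used_irq;
    return entry->acpi_irq;
}
```

The point of keeping both fields is the one Olaf makes: the value from the ACPI table stays available unchanged, and the rerouting decision is recorded next to it instead of overwriting it.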
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-13 15:57 ` Olaf Dabrunz @ 2009-01-15 0:10 ` Bjorn Helgaas 2009-01-15 14:08 ` Stefan Assmann 0 siblings, 1 reply; 30+ messages in thread From: Bjorn Helgaas @ 2009-01-15 0:10 UTC (permalink / raw) To: Olaf Dabrunz Cc: Stefan Assmann, Len Brown, Ingo Molnar, Jesse Barnes, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki, Jon Masters On Tuesday 13 January 2009 08:57:17 am Olaf Dabrunz wrote: > On 13-Jan-09, Stefan Assmann wrote: > > Bjorn Helgaas wrote: > > > (I added Eric, Maciej, and Jon because they participated in > > > previous discussion here: http://lkml.org/lkml/2008/6/2/269) > > > > > > On Monday 12 January 2009 04:09:25 am Stefan Assmann wrote: > > >> Len Brown wrote: > > >>> Stefan, > > >>> I had to exclude your changes to drivers/acpi/pci_irq.c from > > >>> e1d3a90846b40ad3160bf4b648d36c6badad39ac > > >>> in order to get some other changes to that file upstream in the > > >>> 2.6.29 merge window. ... > > >> Let me try to give you a short overview of what's happening there. > > >> > > >> If an IRQ arrives at line X of a non-primary IO-APIC and that line is > > >> masked a new IRQ will be generated on the primary IO-APIC/PIC. This is > > >> called a "Boot Interrupt" by Intel. It's purpose is, as the name > > >> suggests, to ensure that the IRQ is handled at boot time (when the > > >> non-primary) IO-APIC is still disabled. > > >> > > >> Condition to be met for "Boot Interrupts": > > >> - line X on non-primary IO-APIC interrupt line is masked > > >> - line X is asserted > > > > > > Thanks. Let me replay this to see whether I understand. 
Please > > > correct any of my misapprehensions :-) > > > > > > Since your patch doesn't look at the IOxAPIC RDL register to see > > > whether the pin is masked, you must be assuming that Linux is > > > *always* using boot interrupts, and never unmasking the non-primary > > > IOxAPIC entry. > > > > Yes, for this special case that is correct. > > Right. The current reroute code assumes that all interrupts use masking. I'm not an expert in the "classic x86 ioapic behaviour" or in RT interrupt handling. My main objections are that the patch as proposed is a gross hack in the ACPI _PRT lookup path, and that it didn't touch the non-ACPI lookup path, which should have the same problem. Now that Eric pointed out this is just classic behavior, I also wonder why the quirk doesn't list more devices. The effect of the patch sounds sort of similar to using "noapic" -- you're basically ignoring the IOAPIC and using only the "boot interrupt" or legacy PIC interrupt or whatever it is. Can you contrast it with that? Bjorn ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-15 0:10 ` Bjorn Helgaas @ 2009-01-15 14:08 ` Stefan Assmann 0 siblings, 0 replies; 30+ messages in thread From: Stefan Assmann @ 2009-01-15 14:08 UTC (permalink / raw) To: Bjorn Helgaas Cc: Olaf Dabrunz, Len Brown, Ingo Molnar, Jesse Barnes, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich, Eric W. Biederman, Maciej W. Rozycki, Jon Masters Bjorn Helgaas wrote: > On Tuesday 13 January 2009 08:57:17 am Olaf Dabrunz wrote: >> On 13-Jan-09, Stefan Assmann wrote: >>> Bjorn Helgaas wrote: >>>> (I added Eric, Maciej, and Jon because they participated in >>>> previous discussion here: http://lkml.org/lkml/2008/6/2/269) >>>> >>>> On Monday 12 January 2009 04:09:25 am Stefan Assmann wrote: >>>>> Len Brown wrote: >>>>>> Stefan, >>>>>> I had to exclude your changes to drivers/acpi/pci_irq.c from >>>>>> e1d3a90846b40ad3160bf4b648d36c6badad39ac >>>>>> in order to get some other changes to that file upstream in the >>>>>> 2.6.29 merge window. ... >>>>> Let me try to give you a short overview of what's happening there. >>>>> >>>>> If an IRQ arrives at line X of a non-primary IO-APIC and that line is >>>>> masked a new IRQ will be generated on the primary IO-APIC/PIC. This is >>>>> called a "Boot Interrupt" by Intel. It's purpose is, as the name >>>>> suggests, to ensure that the IRQ is handled at boot time (when the >>>>> non-primary) IO-APIC is still disabled. >>>>> >>>>> Condition to be met for "Boot Interrupts": >>>>> - line X on non-primary IO-APIC interrupt line is masked >>>>> - line X is asserted >>>> Thanks. Let me replay this to see whether I understand. Please >>>> correct any of my misapprehensions :-) >>>> >>>> Since your patch doesn't look at the IOxAPIC RDL register to see >>>> whether the pin is masked, you must be assuming that Linux is >>>> *always* using boot interrupts, and never unmasking the non-primary >>>> IOxAPIC entry. 
>>> Yes, for this special case that is correct.
>> Right. The current reroute code assumes that all interrupts use masking.
>
> I'm not an expert in the "classic x86 ioapic behaviour" or in RT
> interrupt handling. My main objections are that the patch as proposed
> is a gross hack in the ACPI _PRT lookup path, and that it didn't touch
> the non-ACPI lookup path, which should have the same problem. Now that

I guess by non-ACPI path you refer to the MP_TABLE. So in case you boot with "acpi=off", the system falls back to the MP_TABLE information, and this is not covered yet.

> Eric pointed out this is just classic behavior, I also wonder why the
> quirk doesn't list more devices.

The reason why the list of quirks isn't larger is that there's no public documentation available for some of the chipsets, for example Nvidia and VIA chipsets. Note that there is only a problem if more than one IO-APIC exists in the system. Additional IO-APICs usually appear on PCI bridges. We have not encountered any systems yet that are Nvidia or VIA based with more than one IO-APIC. But your mileage may vary.

> The effect of the patch sounds sort of similar to using "noapic" --
> you're basically ignoring the IOAPIC and using only the "boot interrupt"
> or legacy PIC interrupt or whatever it is. Can you contrast it with
> that?

The main difference here is that "noapic" disables any IO-APIC in the system, whereas the patch leaves them intact. We won't be falling back to the PIC; instead, we fall back to the first IO-APIC. What difference does this make? We won't lose all the enhancements of the IO-APIC over the PIC. What comes to mind first is that, when using the PIC, all non-MSI interrupts have to be dealt with by CPU0. You have no IRQ balancing, pinning and such things.

>
> Bjorn
>
>

Stefan

--
Stefan Assmann | SUSE LINUX Products GmbH
Software Engineer | Maxfeldstr. 5, D-90409 Nuernberg
Mail: sassmann@suse.de | GF: Markus Rex, HRB 16746 (AG Nuernberg)

^ permalink raw reply [flat|nested] 30+ messages in thread
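[Editor's sketch] The boot interrupt condition quoted in the message above (line X asserted while line X is masked on the non-primary IO-APIC) can be written down as a tiny routing model. This is an illustration only; the enum and the function `route_pin()` are hypothetical names for this note and do not correspond to any kernel or chipset interface.

```c
#include <stdbool.h>

/* Where a request on one pin of a non-primary IO-APIC ends up,
 * according to the description in the thread. Illustrative names. */
enum irq_delivery {
    DELIVER_NOTHING,        /* pin is not asserted */
    DELIVER_SECONDARY_APIC, /* normal operation: asserted and unmasked */
    DELIVER_BOOT_INTERRUPT  /* asserted while masked: forwarded to the
                               primary IO-APIC/PIC as a boot interrupt */
};

enum irq_delivery route_pin(bool asserted, bool masked)
{
    if (!asserted)
        return DELIVER_NOTHING;
    return masked ? DELIVER_BOOT_INTERRUPT : DELIVER_SECONDARY_APIC;
}
```

Read against this model, the quirk keeps `masked` permanently true for the affected pin, so every request takes the boot interrupt path to the primary IO-APIC, where the handler is installed. With "noapic", by contrast, no IO-APIC is used at all.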
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-12 11:09 ` Stefan Assmann 2009-01-12 11:37 ` Ingo Molnar 2009-01-12 18:51 ` Bjorn Helgaas @ 2009-01-13 8:25 ` Shaohua Li 2009-01-14 9:57 ` Stefan Assmann 2 siblings, 1 reply; 30+ messages in thread From: Shaohua Li @ 2009-01-13 8:25 UTC (permalink / raw) To: Stefan Assmann Cc: Len Brown, Ingo Molnar, Bjorn Helgaas, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich On Mon, Jan 12, 2009 at 07:09:25PM +0800, Stefan Assmann wrote: > Hi Len, > > Len Brown wrote: > > Stefan, > > I had to exclude your changes to drivers/acpi/pci_irq.c from > > e1d3a90846b40ad3160bf4b648d36c6badad39ac > > in order to get some other changes to that file upstream in the > > 2.6.29 merge window. > > > > I left the other parts of the quirk intact - so at the moment > > on one of the quirked machines, you'll see > > > > PCI quirk: reroute interrupts for... > > > > but will not see > > > > pci irq %d -> rerouted to legacy > > > > as the quirk is effectively disabled. > > > > I had difficulty trying to port this patch to the new pci_irq.c > > because fundamentally I don't understand what it is trying > > to do, and why. > > Let me try to give you a short overview of what's happening there. > > If an IRQ arrives at line X of a non-primary IO-APIC and that line is > masked a new IRQ will be generated on the primary IO-APIC/PIC. This is > called a "Boot Interrupt" by Intel. It's purpose is, as the name > suggests, to ensure that the IRQ is handled at boot time (when the > non-primary) IO-APIC is still disabled. > > Condition to be met for "Boot Interrupts": > - line X on non-primary IO-APIC interrupt line is masked > - line X is asserted > > This behavior is not necessary during normal operation as the IRQ is > handled by the non-primary IO-APIC itself. Now imagine what happens if > these Boot Interrupts would occur during normal operation. 
You'd see > spurious IRQs on your primary IO-APIC which, in the worst case, will > bring down the interrupt line they occur on! Every device that shares this > interrupt line will fail when this happens. > > Why can these IRQ lines be brought down by Boot Interrupts? Because > there's no handler installed on the primary IO-APIC IRQ line that can > take care of them and after too many unhandled IRQs the line will be shut > down by the kernel. > > What this quirk does: > It installs the interrupt handler on the primary IO-APICs interrupt line > instead of the (original) non-primary IO-APICs interrupt line, keeping > the original interrupt line masked. This guarantees that for every IRQ > arriving at the non-primary IO-APIC a Boot Interrupt is generated _and_ > handled properly. > > Note: You need this quirk if you mask your interrupts during handling. So a device can generate interrupt from two irqs. And we can get the irq number for the routing table. Can we extend the irq mechanism and automatically register the interrupt handler for the two irqs? Thanks, Shaohua ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-13 8:25 ` Shaohua Li @ 2009-01-14 9:57 ` Stefan Assmann 2009-01-14 15:48 ` Bjorn Helgaas 0 siblings, 1 reply; 30+ messages in thread From: Stefan Assmann @ 2009-01-14 9:57 UTC (permalink / raw) To: Shaohua Li Cc: Len Brown, Ingo Molnar, Bjorn Helgaas, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich Shaohua Li wrote: > On Mon, Jan 12, 2009 at 07:09:25PM +0800, Stefan Assmann wrote: >> Hi Len, >> >> Len Brown wrote: >>> Stefan, >>> I had to exclude your changes to drivers/acpi/pci_irq.c from >>> e1d3a90846b40ad3160bf4b648d36c6badad39ac >>> in order to get some other changes to that file upstream in the >>> 2.6.29 merge window. >>> >>> I left the other parts of the quirk intact - so at the moment >>> on one of the quirked machines, you'll see >>> >>> PCI quirk: reroute interrupts for... >>> >>> but will not see >>> >>> pci irq %d -> rerouted to legacy >>> >>> as the quirk is effectively disabled. >>> >>> I had difficulty trying to port this patch to the new pci_irq.c >>> because fundamentally I don't understand what it is trying >>> to do, and why. >> Let me try to give you a short overview of what's happening there. >> >> If an IRQ arrives at line X of a non-primary IO-APIC and that line is >> masked a new IRQ will be generated on the primary IO-APIC/PIC. This is >> called a "Boot Interrupt" by Intel. It's purpose is, as the name >> suggests, to ensure that the IRQ is handled at boot time (when the >> non-primary) IO-APIC is still disabled. >> >> Condition to be met for "Boot Interrupts": >> - line X on non-primary IO-APIC interrupt line is masked >> - line X is asserted >> >> This behavior is not necessary during normal operation as the IRQ is >> handled by the non-primary IO-APIC itself. Now imagine what happens if >> these Boot Interrupts would occur during normal operation. 
You'd see >> spurious IRQs on your primary IO-APIC which, in the worst case, will >> bring down the interrupt line they occur on! Every device that shares this >> interrupt line will fail when this happens. >> >> Why can these IRQ lines be brought down by Boot Interrupts? Because >> there's no handler installed on the primary IO-APIC IRQ line that can >> take care of them and after too many unhandled IRQs the line will be shut >> down by the kernel. >> >> What this quirk does: >> It installs the interrupt handler on the primary IO-APICs interrupt line >> instead of the (original) non-primary IO-APICs interrupt line, keeping >> the original interrupt line masked. This guarantees that for every IRQ >> arriving at the non-primary IO-APIC a Boot Interrupt is generated _and_ >> handled properly. >> >> Note: You need this quirk if you mask your interrupts during handling. > So a device can generate interrupt from two irqs. And we can get the irq > number for the routing table. Can we extend the irq mechanism and > automatically register the interrupt handler for the two irqs? This would not solve the problem of asserting 2 different interrupt lines, in the masked interrupt handling case, for 1 interrupt request. The result would be that the ISR is called twice and at the second call you can't be sure that the device hasn't already been serviced. > > Thanks, > Shaohua Stefan -- Stefan Assmann | SUSE LINUX Products GmbH Software Engineer | Maxfeldstr. 5, D-90409 Nuernberg Mail: sassmann@suse.de | GF: Markus Rex, HRB 16746 (AG Nuernberg) ^ permalink raw reply [flat|nested] 30+ messages in thread
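[Editor's sketch] Stefan's objection above, that registering the handler on both IRQs makes the ISR run twice for one request, can be illustrated with a toy shared-IRQ handler. Everything here is hypothetical (`struct fake_device`, `fake_isr()`, the `ISR_*` values, which merely mirror the idea of the kernel's IRQ_NONE and IRQ_HANDLED); it is a sketch of the argument, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the idea of IRQ_NONE / IRQ_HANDLED. */
enum isr_result { ISR_NONE, ISR_HANDLED };

/* A device with a single "interrupt pending" status bit. */
struct fake_device {
    bool irq_pending;
};

/* A well-behaved shared-IRQ handler: check the device first, service
 * it if it is requesting attention, otherwise report "not mine". */
enum isr_result fake_isr(struct fake_device *dev)
{
    assert(dev != NULL);
    if (!dev->irq_pending)
        return ISR_NONE;       /* nothing to do, or already serviced */
    dev->irq_pending = false;  /* servicing de-asserts the request */
    return ISR_HANDLED;
}
```

If one request is delivered on both the original line and the boot interrupt line, the first call services the device and the second call returns `ISR_NONE`. The handler itself copes with that, but the second line then records an interrupt that nobody claimed.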
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 9:57 ` Stefan Assmann @ 2009-01-14 15:48 ` Bjorn Helgaas 2009-01-14 15:55 ` Olaf Dabrunz 0 siblings, 1 reply; 30+ messages in thread From: Bjorn Helgaas @ 2009-01-14 15:48 UTC (permalink / raw) To: Stefan Assmann Cc: Shaohua Li, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich On Wednesday 14 January 2009 02:57:22 am Stefan Assmann wrote: > Shaohua Li wrote: > > So a device can generate interrupt from two irqs. And we can get the irq > > number for the routing table. Can we extend the irq mechanism and > > automatically register the interrupt handler for the two irqs? > > This would not solve the problem of asserting 2 different interrupt > lines, in the masked interrupt handling case, for 1 interrupt request. > The result would be that the ISR is called twice and at the second call > you can't be sure that the device hasn't already been serviced. Calling the ISR twice isn't a problem, is it? We're talking about PCI interrupts, which are shareable, so ISRs have to handle being called extra times. There's still the problem that the core will disable an IRQ if we take it too many times without any ISR that cares about it. But that's a core issue, not an ISR issue. Bjorn ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 15:48 ` Bjorn Helgaas @ 2009-01-14 15:55 ` Olaf Dabrunz 2009-01-14 16:52 ` Bjorn Helgaas 0 siblings, 1 reply; 30+ messages in thread From: Olaf Dabrunz @ 2009-01-14 15:55 UTC (permalink / raw) To: Bjorn Helgaas Cc: Stefan Assmann, Shaohua Li, Len Brown, Ingo Molnar, Jesse Barnes, Olaf Dabrunz, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich On 14-Jan-09, Bjorn Helgaas wrote: > On Wednesday 14 January 2009 02:57:22 am Stefan Assmann wrote: > > Shaohua Li wrote: > > > So a device can generate interrupt from two irqs. And we can get the irq > > > number for the routing table. Can we extend the irq mechanism and > > > automatically register the interrupt handler for the two irqs? > > > > This would not solve the problem of asserting 2 different interrupt > > lines, in the masked interrupt handling case, for 1 interrupt request. > > The result would be that the ISR is called twice and at the second call > > you can't be sure that the device hasn't already been serviced. > > Calling the ISR twice isn't a problem, is it? We're talking about > PCI interrupts, which are shareable, so ISRs have to handle being > called extra times. > > There's still the problem that the core will disable an IRQ if we > take it too many times without any ISR that cares about it. But that's > a core issue, not an ISR issue. It is not solvable in the core. How do you find out that the "nobody cared" spurious IRQ is benign? Regards, -- Olaf Dabrunz (od/odabrunz), SUSE Linux Products GmbH, Nürnberg ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: PCI, ACPI, IRQ, IOAPIC: reroute PCI interrupt to legacy boot interrupt equivalent 2009-01-14 15:55 ` Olaf Dabrunz @ 2009-01-14 16:52 ` Bjorn Helgaas 0 siblings, 0 replies; 30+ messages in thread From: Bjorn Helgaas @ 2009-01-14 16:52 UTC (permalink / raw) To: Olaf Dabrunz Cc: Stefan Assmann, Shaohua Li, Len Brown, Ingo Molnar, Jesse Barnes, Thomas Gleixner, Steven Rostedt, Linux Kernel Mailing List, linux-acpi, Sven Dietrich On Wednesday 14 January 2009 08:55:29 am Olaf Dabrunz wrote: > On 14-Jan-09, Bjorn Helgaas wrote: > > On Wednesday 14 January 2009 02:57:22 am Stefan Assmann wrote: > > > Shaohua Li wrote: > > > > So a device can generate interrupt from two irqs. And we can get the irq > > > > number for the routing table. Can we extend the irq mechanism and > > > > automatically register the interrupt handler for the two irqs? > > > > > > This would not solve the problem of asserting 2 different interrupt > > > lines, in the masked interrupt handling case, for 1 interrupt request. > > > The result would be that the ISR is called twice and at the second call > > > you can't be sure that the device hasn't already been serviced. > > > > Calling the ISR twice isn't a problem, is it? We're talking about > > PCI interrupts, which are shareable, so ISRs have to handle being > > called extra times. > > > > There's still the problem that the core will disable an IRQ if we > > take it too many times without any ISR that cares about it. But that's > > a core issue, not an ISR issue. > > It is not solvable in the core. How do you find out that the "nobody > cared" spurious IRQ is benign? Sorry, I'm not suggesting that you can. I was just trying to clarify that the problem is not with calling an ISR twice, but I think I only managed to muddy the discussion to no benefit. Bjorn ^ permalink raw reply [flat|nested] 30+ messages in thread
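[Editor's sketch] The "core will disable an IRQ if we take it too many times without any ISR that cares about it" behaviour discussed above can be modelled with a simple counter. This is a simplified illustration, not the kernel's implementation: the struct, the function names, and in particular the window and threshold constants are assumptions of this sketch and are not taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration only. */
#define SKETCH_WINDOW        100000u /* interrupts per evaluation window */
#define SKETCH_MAX_UNHANDLED  99900u /* unhandled count that trips the cutoff */

struct irq_line_sketch {
    unsigned int count;      /* interrupts seen in the current window */
    unsigned int unhandled;  /* of those, how many no handler claimed */
    bool disabled;           /* line shut down ("nobody cared") */
};

/* Account for one interrupt; disable the line if almost every
 * interrupt in a full window went unhandled. */
void note_irq_sketch(struct irq_line_sketch *line, bool handled)
{
    assert(line != NULL);
    if (line->disabled)
        return;
    if (!handled)
        line->unhandled++;
    if (++line->count < SKETCH_WINDOW)
        return;
    if (line->unhandled > SKETCH_MAX_UNHANDLED)
        line->disabled = true;
    line->count = 0;
    line->unhandled = 0;
}

/* Convenience helper: feed n identical interrupts into the model. */
void feed_irqs_sketch(struct irq_line_sketch *line, unsigned int n, bool handled)
{
    for (unsigned int i = 0; i < n; i++)
        note_irq_sketch(line, handled);
}
```

The model only ever sees "handled" or "unhandled" per interrupt. That is Olaf's point: nothing in this accounting can tell a benign unclaimed boot interrupt from a genuinely stuck line, and once the line is disabled every device sharing it stops working.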