* 2.6.32 help needed with reverting APIC patches.
@ 2010-03-22  0:23 Zbigniew Luszpinski
From: Zbigniew Luszpinski @ 2010-03-22  0:23 UTC
  To: linux-kernel

Hello,

I'm investigating why my Nvidia MCP78S chipset needs noapic or acpi=noirq to keep the USB OHCI controller from hanging at random times.
(Both kernel parameters turn off the APIC and fall back to the PIC.) The mainboard is an Asrock K10N78FullHD-hSLI R3.0.
The OHCI USB controller integrated in the chipset is the only device which hangs in APIC mode.
The bug does not exist on Windows XP SP3 or OpenSolaris 2009.6, so I'm focusing on the APIC handling differences between those systems and Linux.

First, it looks like those systems use classical handling with interrupt masking. Linux abandoned this between 2.6.20 and 2.6.21:
http://www.mail-archive.com/linux-net@vger.kernel.org/msg01535.html
(The bug described at this link is identical to mine, except that it is about network cards, not USB OHCI.)
That is why I would like to revert those patches and test.
I tried to revert these two patches in the vanilla 2.6.32.8 kernel I use, but failed because the IRQ_DELAYED_DISABLE flag and its supporting code have since been removed.
Reinserting the flag is problematic because its bit value has been reused:
2.6.20:
#define IRQ_DELAYED_DISABLE    0x00100000      /* IRQ disable (masking) happens delayed. */ 
2.6.32:
#define IRQ_WAKEUP             0x00100000      /* IRQ triggers system wakeup */
I tried to revert these two patches and work around the missing IRQ_DELAYED_DISABLE, but this resulted in a black screen and a kernel unable to boot.

So I tried to boot a distro with a 2.6.20 kernel - boot failed because the MCP78S AHCI is too new for 2.6.20. Backporting the PCI IDs did nothing.
The kernel still cannot mount the root filesystem or access the SATA HDD or CD - it cannot talk to the AHCI controller correctly.
IDE compatibility mode is even worse. So 2.6.20 or older is not an option for me.

Second, they use level-triggered interrupts, whereas Linux uses fasteoi.
AFAIR kernel 2.6.18 uses level-triggered IRQs. It seems RHEL5 still uses kernel 2.6.18, but it is custom-patched and will not be reliable for checking.
If someone is an APIC guru and would like to make a patch which brings level-triggered IRQs back to 2.6.32.8, I would be happy to test it.

Third, it may be a bug in ACPI - Linux uses ACPI to enable the APIC.
I browsed the ACPI table code but did not find any traps set for Linux, so it looks clean. But to be sure, it would be good to use the APIC without ACPI.
If you know how to patch the kernel to force APIC use with acpi=off, let me know. My mainboard has no MPS tables - everything is done via ACPI.
The ACPI tables in the BIOS are not syntactically or hardware clean - I fixed all the ACPI bugs, but this did not change anything.

I believe solution #1 will be the one I am looking for, without the need to test solutions #2 and #3.
If someone can help me revert:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=76d2160147f43f982dfe881404cfde9fd0a9da21
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7e25f3394ba05a6d64cb2be42c2765fe72ea6b2
in kernel 2.6.32 or 2.6.33, I could start testing and ruling out possible bug sources.

Here is my bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=13405

have a nice day,
Zbigniew Luszpinski


* Re: 2.6.32 help needed with reverting APIC patches.
@ 2010-03-22  7:06 ` Justin P. mattock
From: Justin P. mattock @ 2010-03-22  7:06 UTC
  To: Zbigniew Luszpinski; +Cc: linux-kernel

Best would be to do a bisect, but if you're sure these
are the commits, then you probably won't need to.

If you're doing git revert xxxx
and the commit reverts for you, then you're good,
but if there's too much happening (big merge), then
a git rebase is probably the best bet (but I could be wrong).

Hope this helps.

Justin P. Mattock


* Re: 2.6.32 help needed with reverting APIC patches.
@ 2010-03-22 23:18   ` Zbigniew Luszpinski
From: Zbigniew Luszpinski @ 2010-03-22 23:18 UTC
  To: Justin P. mattock; +Cc: linux-kernel

Yeah, I can imagine reverting patches from kernel 2.6.32.10 back to 2.6.20 and seeing all those conflicts in git.
I have just done this a different way: I downloaded the official 2.6.32.10 and manually reinserted the removed code.
I resolved the IRQ_DELAYED_DISABLE conflict in irq.h by moving the flag to a higher bit position.
The patched kernel booted OK, but these reverts did not resolve the OHCI hangs in APIC mode.
The final verdict: idea #1 was wrong.
Now I will focus on ideas #2 and #3.
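
To illustrate the irq.h conflict, here is a minimal userspace C sketch, not the actual patch. The two constants are the values quoted earlier in this thread; the helper names flags_collide and first_free_flag and the `used` mask are hypothetical, and the real set of bits taken in 2.6.32's irq.h has to be read from the source.

```c
#include <stdint.h>

/* Values quoted in this thread. */
#define IRQ_DELAYED_DISABLE_2_6_20 0x00100000u /* 2.6.20: IRQ disable (masking) happens delayed */
#define IRQ_WAKEUP_2_6_32          0x00100000u /* 2.6.32: IRQ triggers system wakeup */

/* Hypothetical helper: nonzero when two status flags share at least one bit. */
static int flags_collide(uint32_t a, uint32_t b)
{
    return (a & b) != 0;
}

/*
 * Hypothetical helper: returns the lowest single-bit flag at or above
 * `start` that is not set in `used`, or 0 if every such bit is taken.
 * `start` must have exactly one bit set.
 */
static uint32_t first_free_flag(uint32_t used, uint32_t start)
{
    for (uint32_t bit = start; bit != 0; bit <<= 1) {
        if ((used & bit) == 0)
            return bit;
    }
    return 0;
}
```

With only IRQ_WAKEUP marked as used, first_free_flag(IRQ_WAKEUP_2_6_32, IRQ_DELAYED_DISABLE_2_6_20) yields 0x00200000. In a real tree that bit may already be taken, so `used` should be the OR of every flag defined in irq.h.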

Call for action:
1. If you have any idea why OHCI hangs in APIC mode, let me know.
2. If you have an Nvidia MCP78S mainboard with an AMD CPU, please send me an ACPI dump:
acpidump > dump.bin
bzip2 -9v dump.bin
This will be a good APIC code comparison for checking idea #3.
Do not clutter the ML with attachments - send me the file(s) by private mail.

On my mainboard, almost all onboard devices sit on INT A according to the ACPI code.
Only the USB 2.0 EHCI controller sits alone on INT B.

have a nice day,
Zbigniew Luszpinski


* Re: 2.6.32 help needed with reverting APIC patches.
@ 2010-03-22 23:25     ` Justin P. Mattock
From: Justin P. Mattock @ 2010-03-22 23:25 UTC
  To: Zbigniew Luszpinski; +Cc: linux-kernel

For number one, the only thing off the top of my
head would possibly be load timing, e.g. if ehci_hcd loads
after ohci is loaded there could be a conflict, or vice versa
with something else.

For number two, I don't have an AMD processor here, only an Intel
(iMac).

Justin P. Mattock
