linux-kernel.vger.kernel.org archive mirror
* Additional info. for PCI VIA IDE crazyness.  Please read.
@ 2001-01-02  2:34 Evan Thompson
  2001-01-02  4:33 ` Linus Torvalds
       [not found] ` <200101020433.UAA23808@penguin.transmeta.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Evan Thompson @ 2001-01-02  2:34 UTC (permalink / raw)
  To: linux-kernel

IN ADVANCE:  I'm sorry for this being so long, but I'm just trying to
make sure people understand what my problem is.  If you need more info,
I'll be happy to give as much as I can, just give me a reply.

Okay.  Many of you already know the problems I'm having with my PCI VIA
IDE controller.  I've done a bit of additional testing and I think I
have found out what the problem is.  The only problem is that I'm not
too versed in kernel programming (I'm getting there, but still don't
understand some more complex C ideas), and therefore cannot fix this.

-- THE PROBLEM --

I know that with any kernel version in the 2.2, 2.3, 2.3.99pre series
and 2.4.0-test kernels <= 2.4.0-test11, I need to append
'ide1=0x170,0x376,15' to get my (so-called) PCI VIA IDE controller to
put the secondary channel on IRQ 15 (otherwise, it'd put it on IRQ 14,
causing hdc/hdd: lost interrupt errors, and would take 5 or so minutes
to boot).

--

WHAT I HAVE FOUND NOW is that something has changed from 2.4.0-test11
to 2.4.0-test12, in either the IDE implementation or the IRQ handling
(although there was only one change in irq.c -- something going from
an int to a long), that has caused my system to complain about hdb:
lost interrupt and refuse to boot.

I used the EXACT SAME configuration for both -test11 and -test12, and
11 worked properly, and 12 causes problems (see above).  I was clever
enough to add printer console support to my kernel, and was able to
print out the kernel messages for -test12 (I didn't need to print out
-test11's messages, but the support was still in the kernel).  After
comparing the output, the only relevant change I found was the addition
of one line to the kernel messages:

2.4.0-test11:

Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 6
VP_IDE: 100% native mode on irq 14

TO 2.4.0-test12:

 Uniform Multi-Platform E-IDE driver Revision: 6.31
 ide: Assuming 33MHz system bus speed for PIO modes; override with
     idebus=xx
 VP_IDE: IDE controller on PCI bus 00 dev 39
+PCI: Assigned IRQ 14 for device 00:07.1 
 VP_IDE: chipset revision 6
 VP_IDE: 100% native mode on irq 14

(notice the new line indicated with the '+').

--

If I haven't given enough information, don't hesitate to ask for more.
I'd like some reply to this situation because it just seems odd for
this to happen.  Like I said earlier, if I knew what to do, I'd be
happy to submit a patch, but I'm still learning C, so I'm not capable
of doing that yet.  Thanks in advance,
-- 
| Evan Thompson                    | ICQ:    2233067   |
| Freelance Computer Nerd          | AIM:    Evaner517 |
| evaner@bigfoot.com               | Yahoo!: evanat    |
| http://evaner.penguinpowered.com | MSN:    evaner517 |
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: Additional info. for PCI VIA IDE crazyness.  Please read.
  2001-01-02  2:34 Additional info. for PCI VIA IDE crazyness. Please read Evan Thompson
@ 2001-01-02  4:33 ` Linus Torvalds
       [not found] ` <200101020433.UAA23808@penguin.transmeta.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2001-01-02  4:33 UTC (permalink / raw)
  To: linux-kernel

In article <20010101203409.A335@evaner.penguinpowered.com>,
Evan Thompson  <evaner@bigfoot.com> wrote:
>
>-- THE PROBLEM --
>
>I know that with any kernel version in the 2.2, 2.3, 2.3.99pre series
>and 2.4.0-test kernels <= 2.4.0-test11, I need to append
>'ide1=0x170,0x376,15' to get my (so-called) PCI VIA IDE controller to
>put the secondary channel on IRQ 15 (otherwise, it'd put it on IRQ 14,
>causing hdc/hdd: lost interrupt errors, and would take 5 or so minutes
>to boot).

Hmm..

The PCI irq code will parse your BIOS pirq tables, and enforce the fact
that those tables do seem to say that it's irq 14.

It obviously appears that the tables are wrong, which is kind of sad. 
Especially as a lot of machines _have_ to trust the tables in order to
get a working setup. 

What happens if you don't try to override the ide logic with the command
line? It looks like Linux has always wanted to put it on irq14, which
implies that the BIOS really set it up that way, and your irq15 thing
was always something that the driver disagreed with.

In particular, notice how even before, the IDE driver ignored the fact
that you had specified irq15. The driver was very aware of the fact that
it really had irq14 allocated:

>2.4.0-test11:
>
>Uniform Multi-Platform E-IDE driver Revision: 6.31
>ide: Assuming 33MHz system bus speed for PIO modes; override with
>idebus=xx
>VP_IDE: IDE controller on PCI bus 00 dev 39
>VP_IDE: chipset revision 6
>VP_IDE: 100% native mode on irq 14

and it may be that what the irq=15 thing did was just hide some other
bug. 

For example, what the PCI irq routing code will do is to not just enable
irq14 (which looks like it was enabled even in test11), but it will also
mark it as being level-triggered.  It may be that the IDE driver itself
has problems with this: it used to ignore the (real) irq14 before
because you had specified irq15 by hand, and if that was an
edge-triggered irq it didn't hurt.  Now, when the PCI irq is properly
set up as a level-triggered one, ignoring the real interrupt will result
in an infinite flood of interrupts - they will _not_ go away until they
are handled (which they never will be, because you lied to it and said
it had irq 15). 

Please try a few things:

 - enable DEBUG in arch/i386/kernel/pci-i386.h to see what the PCI irq
   routing tables say.

 - don't pass the command line with the bogus irq

 - alternatively, pass the command line, but use "ide1=0x170,0x376,14"
   instead (which will force it to use irq14 - the only difference from
   no command line at all should be that it doesn't even try to probe it)

 - see what happens if VIA low-level driver support is disabled, so that
   you end up using the non-chipset-specific code. It may be that the
   chipset-specific code has some magic "change the irq setup" code that
   clashes with the fact that the PCI layer has enabled the irq routing.

(In particular, some of the low-level drivers have tried to do some
things by hand, to work around the fact that the PCI layer hasn't done
the kind of complete setup that it _does_ try to do these days.
Sometimes that code is broken.)

Thanks,

		Linus

* Re: Additional info. for PCI VIA IDE crazyness.  Please read.
       [not found] ` <200101020433.UAA23808@penguin.transmeta.com>
@ 2001-01-02 16:29   ` Evan Thompson
  2001-01-02 18:56     ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Evan Thompson @ 2001-01-02 16:29 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel

WOW...a mail from the guy himself, Linus Torvalds!

Beforehand, I'd like to point out that I forgot to add one little
point.  All the kernels I've used >2.2 put ide1 on IRQ 15 when I make
it do that (even the ones >-test11) because it gives me:

ide1 at 0x170-0x177,0x376 on irq 15

but it seems as though >-test11 kernels want to give me a whole bunch
of hdb: lost interrupt errors, not allowing me to boot into Linux
(unless I use some kind of floppy root...hey...there's an idea)

On Mon, Jan 01, 2001 at 08:33:53PM -0800, Linus Torvalds wrote:
> Please try a few things:
> 
>  - enable DEBUG in arch/i386/kernel/pci-i386.h to see what the PCI irq
>    routing tables say.
> 
>  - don't pass the command line with the bogus irq
> 
>  - alternatively, pass the command line, but use "ide1=0x170,0x376,14"
>    instead (which will force it to use irq14 - the only difference from
>    no command line at all should be that it doesn't even try to probe it)

Okay.  With -test11, both of these alternatives give me the same
result.  The system boots, but I cannot access my two CD-ROM drives
(hdc and hdd) because of 'lost interrupts'.  With -test12, my system
refuses to bring up init because hdb (my Linux drive) keeps giving me
'lost interrupt', except now -test12 gives me some more strange errors
(before I enable DEBUG):

hdc: cdrom_pc_intr: The drive appears confused (ireason = 0x 1)

This repeats for at least 20 lines, then says the same thing for hdd.

When I enable DEBUG, I get a whole bunch of new stuff, most of which
doesn't make any sense to me, but I'll try to give the relevant info:

PCI: BIOS32 Service Directory structure at 0xc00fdb40
PCI: BIOS32 Service Directory entry at 0xfdb50
PCI: BIOS probe returned s=00 hw=01 ver=02.10 l=01
PCI: PCI BIOS revision 2.10 entry at 0xfdb71, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: IDE base address fixup for 00:07.1
PCI: Scanning for ghost devices on bus 0
PCI: Scanning for ghost devices on bus 1
PCI: IRQ init
PCI: Interrupt Routing Table found at 0xc00f85f0
00:07 slot=00 0:fe/4000 1:ff/8000 2:00/0000 3:04/deb8
00:08 slot=01 0:01/deb8 1:02/deb8 2:03/deb8 3:04/deb8
00:09 slot=02 0:02/deb8 1:03/deb8 2:04/deb8 3:01/deb8
00:09 slot=03 0:03/deb8 1:04/deb8 2:01/deb8 3:02/deb8
c3:00 slot=72 0:60/0e1e 1:1f/e852 2:93/8b00 3:fa/1f5a
0a:18 slot=05 0:74/3c27 1:f0/0c73 2:e8/feb9 e:0a/74c0 
PCI: Scanning for ghost devices on bus 10
PCI: Discovered primary peer bus 0a [IRQ]
PCI: Scanning for ghost devices on bus 195
PCI: Discovered primary peer bus c3 [IRQ]
PCI: Using IRQ router VIA [1106/0586] at 00:07.0
PCI: IRQ fixup
PCI: Allocating resources

(then it allocates resources...a bunch of I/O ports from what I
understand...doesn't seem too important)

(then it goes on...)

Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
IRQ for 00:07.1:0 -> PIRQ fe, mask 4000, excl 0000 -> newirq=14 ->
assigning IRQ
PCI: Assigned IRQ 14 for device 00:07.1

(then it goes on with the confused device errors and stops)

If I've left out something important, ask.  I've got the printout right
here, so I can type it in.
> 
>  - see what happens if VIA low-level driver support is disabled, so that
>    you end up using the non-chipset-specific code. It may be that the
>    chipset-specific code has some magic "change the irq setup" code that
>    clashes with the fact that the PCI layer has enabled the irq routing.
> 

I didn't enable low-level driver support in the first place.  Just
"Generic PCI IDE chipset support" (CONFIG_BLK_DEV_IDEPCI)

> (In particular, some of the low-level drivers have tried to do some
> things by hand, to work around the fact that the PCI layer hasn't done
> the kind of complete setup that it _does_ try to do these days.
> Sometimes that code is broken.)

Well, with the quality of this low-quality POS m/b, I'm not surprised
if that is the case (if I had more money, I would throw this thing out
of the window, go outside in this minus 40 weather and jump on it until
it turns into a nice fine powder, then go and buy a new computer from
VA or something.  Then I'd have someone to complain to if it didn't
work.  (I built my last one.))

Thanks,
-- 
| Evan Thompson                    | ICQ:    2233067   |
| Freelance Computer Nerd          | AIM:    Evaner517 |
| evaner@bigfoot.com               | Yahoo!: evanat    |
| http://evaner.penguinpowered.com | MSN:    evaner517 |

* Re: Additional info. for PCI VIA IDE crazyness.  Please read.
  2001-01-02 16:29   ` Evan Thompson
@ 2001-01-02 18:56     ` Linus Torvalds
  2001-01-03  1:57       ` Evan Thompson
  2001-01-03  2:08       ` Evan Thompson
  0 siblings, 2 replies; 7+ messages in thread
From: Linus Torvalds @ 2001-01-02 18:56 UTC (permalink / raw)
  To: Evan Thompson; +Cc: linux-kernel



On Tue, 2 Jan 2001, Evan Thompson wrote:
> PCI: Interrupt Routing Table found at 0xc00f85f0
> 00:07 slot=00 0:fe/4000 1:ff/8000 2:00/0000 3:04/deb8
> 00:08 slot=01 0:01/deb8 1:02/deb8 2:03/deb8 3:04/deb8
> 00:09 slot=02 0:02/deb8 1:03/deb8 2:04/deb8 3:01/deb8
> 00:09 slot=03 0:03/deb8 1:04/deb8 2:01/deb8 3:02/deb8
> c3:00 slot=72 0:60/0e1e 1:1f/e852 2:93/8b00 3:fa/1f5a
> 0a:18 slot=05 0:74/3c27 1:f0/0c73 2:e8/feb9 e:0a/74c0 

Ok, this is interesting. In particular, the "fe" and "ff" entries in the
routing table are something I've seen before. They are magic values for
the ALI interrupt router, and they seem to be magic values for VIA too.

As far as I can tell, "fe" means "hardcoded to 14" and "ff" means
"hardcoded to 15".

I wonder whether your "fa" means "hardcoded to 10". What is your PCI
device c3:00.3? That looks _really_ strange (it might just be a BIOS bug,
and a harmless one - you probably don't have such a device at all, is my
guess). I assume you don't have a "slot 4" at all.

Anyway, I suspect that the "fe"/"ff" values are specified by MS (no way to
know, as the docs are obviously NDA'd), which means that it would be
interesting to hear whether the problem is fixed by something like this:

In the file arch/i386/kernel/pci-irq.c, around line 240, there's a
function called pirq_via_get(). Right now it just does a
"read_config_nybble()", and I'd ask you to add these two magic lines to
the beginning of it:

	if ((pirq & 0xf0) == 0xf0)
		return pirq & 0xf;

and please tell me if that changes/fixes the problem for you.

Oh, and could you pass me the output of /proc/pci while you're at it, so
that I can match it up with your pirq table. That corrupted slot 4 entry
still makes me go "Hmm..".

	Thanks,
		Linus


* Re: Additional info. for PCI VIA IDE crazyness.  Please read.
  2001-01-02 18:56     ` Linus Torvalds
@ 2001-01-03  1:57       ` Evan Thompson
  2001-01-03  2:08       ` Evan Thompson
  1 sibling, 0 replies; 7+ messages in thread
From: Evan Thompson @ 2001-01-03  1:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1953 bytes --]

On Tue, Jan 02, 2001 at 10:56:27AM -0800, Linus Torvalds wrote:
> Anyway, I suspect that the "fe"/"ff" values are specified by MS (no way to
> know, as the docs are obviously NDA'd), which means that it would be
> interesting to hear whether the problem is fixed by something like this:
> 
> In the file arch/i386/kernel/pci-irq.c, around line 240, there's a
> function called pirq_via_get(). Right now it just does a
> "read_config_nybble()", and I'd ask you to add these two magic lines to
> the beginning of it:
> 
> 	if ((pirq & 0xf0) == 0xf0)
> 		return pirq & 0xf;
> 
> and please tell me if that changes/fixes the problem for you.

It seems to change the problem, but fixes it in most ways.  I can
give you a dmesg output if you want one.  That fixed the "confused
drive" errors and lets me boot, but now hdc and hdd are timing out
for the same reasons that they timed out in 2.2.

ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 14

is what I get now.  I haven't tried this yet, but I'm assuming a simple
re-addition of 'ide1=0x170,0x376,15' to the kernel options should put
ide1 back on irq 15 where it belongs (I haven't rebooted yet because I
was just so excited that this works.  Geez, you're smart, Linus!)

> Oh, and could you pass me the output of /proc/pci while you're at it, so
> that I can match it up with your pirq table. That corrupted slot 4 entry
> still makes me go "Hmm..".

I've attached the output I got from doing cat /proc/pci on
2.4.0-prerelease.  Because of a subtle configuration difference between
the two, there are a few I/O differences and output differences (because
of the jump in kernel versions) from 2.2.18pre21 (I really should
upgrade to 2.2.18).

Thanks,
-- 
| Evan Thompson                    | ICQ:    2233067   |
| Freelance Computer Nerd          | AIM:    Evaner517 |
| evaner@bigfoot.com               | Yahoo!: evanat    |
| http://evaner.penguinpowered.com | MSN:    evaner517 |

[-- Attachment #2: forlinus2 --]
[-- Type: text/plain, Size: 1565 bytes --]

PCI devices found:
  Bus  0, device   0, function  0:
    Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev 1).
      Master Capable.  Latency=16.  
      Prefetchable 32 bit memory at 0xe8000000 [0xebffffff].
  Bus  0, device   1, function  0:
    PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP] (rev 0).
      Master Capable.  No bursts.  Min Gnt=8.
  Bus  0, device   7, function  0:
    ISA bridge: VIA Technologies, Inc. VT82C586/A/B PCI-to-ISA [Apollo VP] (rev 65).
  Bus  0, device   7, function  1:
    IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 6).
      IRQ 14.
      Master Capable.  Latency=32.  
      I/O at 0x1f0 [0x1f7].
      I/O at 0x3f6 [0x3f6].
      I/O at 0x170 [0x177].
      I/O at 0x376 [0x376].
      I/O at 0xffa0 [0xffaf].
  Bus  0, device   7, function  2:
    USB Controller: VIA Technologies, Inc. UHCI USB (rev 2).
      IRQ 10.
      Master Capable.  Latency=64.  
      I/O at 0xdf00 [0xdf1f].
  Bus  0, device   7, function  3:
    Bridge: VIA Technologies, Inc. VT82C586B ACPI (rev 16).
  Bus  0, device   9, function  0:
    Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS) (rev 0).
      IRQ 10.
      I/O at 0xde80 [0xde9f].
  Bus  1, device   0, function  0:
    VGA compatible controller: Silicon Integrated Systems [SiS] 86C326 (rev 11).
      Master Capable.  Latency=64.  Min Gnt=2.
      Prefetchable 32 bit memory at 0xe7000000 [0xe77fffff].
      Non-prefetchable 32 bit memory at 0xefef0000 [0xefefffff].
      I/O at 0xcc80 [0xccff].


* Re: Additional info. for PCI VIA IDE crazyness.  Please read.
  2001-01-02 18:56     ` Linus Torvalds
  2001-01-03  1:57       ` Evan Thompson
@ 2001-01-03  2:08       ` Evan Thompson
  1 sibling, 0 replies; 7+ messages in thread
From: Evan Thompson @ 2001-01-03  2:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

Oh, and adding that option DID fix the lost interrupt problems
(ide1=0x170,0x376,15), so yeah, that fix fixed it.

Thanks for your help.  Maybe it should become a config option to enable
that (CONFIG_BLK_DEV_MESSED_UP_VIA_CHIPSET_FIX).
-- 
| Evan Thompson                    | ICQ:    2233067   |
| Freelance Computer Nerd          | AIM:    Evaner517 |
| evaner@bigfoot.com               | Yahoo!: evanat    |
| http://evaner.penguinpowered.com | MSN:    evaner517 |

* Re: Additional info. for PCI VIA IDE crazyness. Please read.
@ 2001-01-02  5:52 Ray Strode
  0 siblings, 0 replies; 7+ messages in thread
From: Ray Strode @ 2001-01-02  5:52 UTC (permalink / raw)
  To: linux-kernel

Is this problem possibly related to my issues on alpha?
(When compiling for PC164 optimizations instead of
generic alpha, I get a lost interrupt message as well.)

--Ray

