linux-kernel.vger.kernel.org archive mirror
* isp1020 memory trample in 2.5.66
From: Zwane Mwaikambo @ 2003-04-02  6:08 UTC
  To: Linux Kernel; +Cc: linux-scsi

2.5.65 was OK, but 2.5.66 can't boot. There is nothing obvious in the patch
that could have led to this. I'd try a binary search, but I'm afraid I
won't have that much free time for a while.

The box is a 32-way PIII-450; the devices attached to the only really active
isp1020 HBA are:

qlogicisp : new isp1020 revision ID (5)
qlogicisp : interrupt 233 already in use
scsi0 : QLogic ISP1020 SCSI on PCI bus 01 device 70 irq 41 MEM base 0xf8c1a000
  Vendor: IBM       Model: DRHS36V           Rev: 0270
  Type:   Direct-Access                      ANSI SCSI revision: 03
  Vendor: IBM       Model: DRHS36V           Rev: 0270
  Type:   Direct-Access                      ANSI SCSI revision: 03
  Vendor: PLEXTOR   Model: CD-ROM PX-32CS    Rev: 1.02
  Type:   CD-ROM                             ANSI SCSI revision: 02
scsi1 : QLogic ISP1020 SCSI on PCI bus 04 device 70 irq 89 MEM base 0xf8c1c000
SCSI device sda: 72170879 512-byte hdwr sectors (36951 MB)
SCSI device sda: drive cache: write through
 sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sdb: 72170879 512-byte hdwr sectors (36951 MB)
SCSI device sdb: drive cache: write through
 sdb: unknown partition table

Unable to handle kernel paging request at virtual address 6b6b6b6b
 printing eip:
6b6b6b6b
*pde = 00000000
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<6b6b6b6b>]    Not tainted
EFLAGS: 00010086
EIP is at 0x6b6b6b6b
eax: f8c1c000   ebx: e4f9c000   ecx: 00000001   edx: c3f56400
esi: e4f9c000   edi: 00000002   ebp: e4fa09cc   esp: c0375f10
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c0374000 task=c03064a0)
Stack: c0228d4b e4fa09cc 00000001 00000001 c3f564b0 c3f56400 0000000b c3f56400 
       00000002 c0375f98 c0228a39 0000000b c3f56400 c0375f98 e59f1914 24000001 
       0000000b c010cd4a 0000000b c3f56400 c0375f98 c0356960 c0356970 0000000b 
Call Trace:
 [<c0228d4b>] isp1020_intr_handler+0x2db/0x300
 [<c0228a39>] do_isp1020_intr_handler+0x49/0x80
 [<c010cd4a>] handle_IRQ_event+0x3a/0x60
 [<c010d052>] do_IRQ+0x112/0x1f0
 [<c01089b0>] default_idle+0x0/0x40
 [<c01089b0>] default_idle+0x0/0x40
 [<c010b700>] common_interrupt+0x18/0x20
 [<c01089b0>] default_idle+0x0/0x40
 [<c01089b0>] default_idle+0x0/0x40
 [<c01089dd>] default_idle+0x2d/0x40
 [<c0108a82>] cpu_idle+0x52/0x70
 [<c0105000>] _stext+0x0/0x70

Code:  Bad EIP value.
 <0>Kernel panic: Aiee, killing interrupt handler!

0xc0228d4b is in isp1020_intr_handler (drivers/scsi/qlogicisp.c:1072).
1071                    (*Cmnd->scsi_done)(Cmnd); <===
1072            }

-- 
function.linuxpower.ca


* Re: isp1020 memory trample in 2.5.66
From: Patrick Mansfield @ 2003-04-02 16:02 UTC
  To: Zwane Mwaikambo; +Cc: Linux Kernel, linux-scsi

On Wed, Apr 02, 2003 at 01:08:54AM -0500, Zwane Mwaikambo wrote:
> 2.5.65 was OK, but 2.5.66 can't boot. There is nothing obvious in the patch
> that could have led to this. I'd try a binary search, but I'm afraid I
> won't have that much free time for a while.

> The box is a 32-way PIII-450; the devices attached to the only really active
> isp1020 HBA are:

I've been booting OK with 2.5.66 with isp1020 and qlogicisp driver with
multiple disks, though the boot sometimes hangs.

I've also booted OK with the feral driver.

> qlogicisp : new isp1020 revision ID (5)
> qlogicisp : interrupt 233 already in use
> scsi0 : QLogic ISP1020 SCSI on PCI bus 01 device 70 irq 41 MEM base 0xf8c1a000
>   Vendor: IBM       Model: DRHS36V           Rev: 0270
>   Type:   Direct-Access                      ANSI SCSI revision: 03
>   Vendor: IBM       Model: DRHS36V           Rev: 0270
>   Type:   Direct-Access                      ANSI SCSI revision: 03
>   Vendor: PLEXTOR   Model: CD-ROM PX-32CS    Rev: 1.02
>   Type:   CD-ROM                             ANSI SCSI revision: 02
> scsi1 : QLogic ISP1020 SCSI on PCI bus 04 device 70 irq 89 MEM base 0xf8c1c000
> SCSI device sda: 72170879 512-byte hdwr sectors (36951 MB)
> SCSI device sda: drive cache: write through
>  sda: sda1 sda2 sda3
> Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
> SCSI device sdb: 72170879 512-byte hdwr sectors (36951 MB)
> SCSI device sdb: drive cache: write through
>  sdb: unknown partition table
> 
> Unable to handle kernel paging request at virtual address 6b6b6b6b
>  printing eip:
> 6b6b6b6b
> *pde = 00000000
> Oops: 0000 [#1]
> CPU:    0
> EIP:    0060:[<6b6b6b6b>]    Not tainted
> EFLAGS: 00010086
> EIP is at 0x6b6b6b6b
> eax: f8c1c000   ebx: e4f9c000   ecx: 00000001   edx: c3f56400
> esi: e4f9c000   edi: 00000002   ebp: e4fa09cc   esp: c0375f10
> ds: 007b   es: 007b   ss: 0068
> Process swapper (pid: 0, threadinfo=c0374000 task=c03064a0)
> Stack: c0228d4b e4fa09cc 00000001 00000001 c3f564b0 c3f56400 0000000b c3f56400 
>        00000002 c0375f98 c0228a39 0000000b c3f56400 c0375f98 e59f1914 24000001 
>        0000000b c010cd4a 0000000b c3f56400 c0375f98 c0356960 c0356970 0000000b 
> Call Trace:
>  [<c0228d4b>] isp1020_intr_handler+0x2db/0x300
>  [<c0228a39>] do_isp1020_intr_handler+0x49/0x80
>  [<c010cd4a>] handle_IRQ_event+0x3a/0x60
>  [<c010d052>] do_IRQ+0x112/0x1f0
>  [<c01089b0>] default_idle+0x0/0x40
>  [<c01089b0>] default_idle+0x0/0x40
>  [<c010b700>] common_interrupt+0x18/0x20
>  [<c01089b0>] default_idle+0x0/0x40
>  [<c01089b0>] default_idle+0x0/0x40
>  [<c01089dd>] default_idle+0x2d/0x40
>  [<c0108a82>] cpu_idle+0x52/0x70
>  [<c0105000>] _stext+0x0/0x70
> 
> Code:  Bad EIP value.
>  <0>Kernel panic: Aiee, killing interrupt handler!
> 
> 0xc0228d4b is in isp1020_intr_handler (drivers/scsi/qlogicisp.c:1072).
> 1071                    (*Cmnd->scsi_done)(Cmnd); <===
> 1072            }

This looks a lot like the oops when trying to send IO to more than one
disk at a time with the isp1020 + qlogicisp.

Is there something different causing IO to multiple disks at that point?

I hit this once when I enabled parallel fsck (it didn't oops until after I
got a late oops, and rebooted).

Martin hit it when the queue depth was not properly checked.

wli has hit it with parallel mkfs (or something).

The following thread, starting with Martin's analysis, was pretty useful;
in it Doug L mentions that the qlogicisp driver does bad things:

http://marc.theaimsgroup.com/?l=linux-kernel&m=104457083601573&w=2

-- Patrick Mansfield


* Re: isp1020 memory trample in 2.5.66
From: Zwane Mwaikambo @ 2003-04-03  8:27 UTC
  To: Patrick Mansfield; +Cc: Linux Kernel, linux-scsi

On Wed, 2 Apr 2003, Patrick Mansfield wrote:

> I've been booting OK with 2.5.66 with isp1020 and qlogicisp driver with
> multiple disks, though the boot sometimes hangs.
> 
> I've also booted OK with the feral driver.

I'll be prepping a kernel with that.

> This looks a lot like the oops when trying to send IO to more than one
> disk at a time with the isp1020 + qlogicisp.
> 
> Is there something different causing IO to multiple disks at that point?

Yes, it is possible, as I have another disk on the same HBA with /usr on it.

> I hit this once when I enabled parallel fsck (it didn't oops until after I
> got a late oops, and rebooted).
> 
> Martin hit it when the queue depth was not properly checked.
> 
> wli has hit it with parallel mkfs (or something).

Ok this thing sounds _very_ fragile ;)

> The following thread, starting with Martin's analysis, was pretty useful;
> in it Doug L mentions that the qlogicisp driver does bad things:
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=104457083601573&w=2

Thanks! That looks very familiar.

	Zwane
-- 
function.linuxpower.ca


* Re: isp1020 memory trample in 2.5.66
From: William Lee Irwin III @ 2003-04-03  8:55 UTC
  To: Zwane Mwaikambo; +Cc: Patrick Mansfield, Linux Kernel, linux-scsi

On Wed, 2 Apr 2003, Patrick Mansfield wrote:
>> Martin hit it when the queue depth was not properly checked.
>> wli has hit it with parallel mkfs (or something).

On Thu, Apr 03, 2003 at 03:27:38AM -0500, Zwane Mwaikambo wrote:
> Ok this thing sounds _very_ fragile ;)

Debugging code this obfuscated and crappy is as hopeless as trying to
debug the nvidia binary-only oops-o-rama.

What are the odds of just throwing away the isp1020 and replacing it
with anything else? It won't fix it, but it can't be fixed due to
utter lack of information about the things and/or lack of maintainers
with information hidden by NDA's anyway.


-- wli


* Re: isp1020 memory trample in 2.5.66
From: Zwane Mwaikambo @ 2003-04-03  9:19 UTC
  To: William Lee Irwin III; +Cc: Patrick Mansfield, Linux Kernel, linux-scsi

On Thu, 3 Apr 2003, William Lee Irwin III wrote:

> Debugging code this obfuscated and crappy is as hopeless as trying to
> debug the nvidia binary-only oops-o-rama.
> 
> What are the odds of just throwing away the isp1020 and replacing it
> with anything else? It won't fix it, but it can't be fixed due to
> utter lack of information about the things and/or lack of maintainers
> with information hidden by NDA's anyway.

I'd love to, but the hardware isn't mine, so I'll have to make do until
something better is available. I'll go by your and Patrick's
recommendations and try feral.

Cheers,
	Zwane
-- 
function.linuxpower.ca
